Netwide Assembler tutorial

Embed Size (px)

DESCRIPTION

Netwide Assembler tutorial

Citation preview

  • Programming Intel i386Assembly with NASM

    Yorick Hardy

    International School for Scientific Computing

    1

  • NASM

    NASM can be used in combinations of the following

    With C/C++ to define functions that can be used byC/C++

    To construct a program in assembly language only

    To construct a program in assembly language which callsC functions

    The mechanisms to do this depend on the compiler.

    We will consider the GNU Compiler Collection (GCC).

    2

  • NASM: C calling conventions

    Parameters are passed on the stack. Functions are called us-ing the call instruction which pushes the return address onthe stack before jumping to the function pointer.

    Functions return to the stored address by popping the addressoff the stack and jumping to the return address using the retinstruction.

    The return value is usually assumed to be in eax.

    ...call zero_eax ; push the return address on the stack...

    zero_eax:xor eax, eaxret ; the return address is in [esp]

    3

  • NASM: C calling conventions

    When using GCC we consider cdecl, stdcall and other callingconventions.

    cdecl: Parameters are pushed onto the stack in right toleft order before the function call, and removed in leftto right order after the function call. It is the callersresponsibility to remove the parameters from the stackafter a function call.

    stdcall: Parameters are pushed onto the stack in right toleft order before the function call, and removed in left toright order before returning from function call (unless thefunction takes a variable number of arguments). It is thecallees responsibility to remove the parameters from thestack after a function call.

    other: A combination of stack, registers and memory ad-dresses may be used to define the calling convention. Usu-ally, these functions cannot be called from C/C++.

    4

  • NASM: cdecl calling convention

    int add(int a, int b) { return a+b; }

    ...push bpush acall addadd esp, 8;or pop ebx ; remove parameter; pop ebx ; remove parameter...

    add:; [esp] is the return address,; [esp+4] the first parameter, etc.mov eax, [esp+4]add eax, [esp+8]ret

    5

  • NASM: stdcall calling convention

    int add(int a, int b) { return a+b; }

    ...push bpush acall add...

    add:; [esp] is the return address,; [esp+4] the first parameter, etc.mov eax, [esp+4]add eax, [esp+8]push ebx ; save ebxmov ebx, [esp+4] ; return address (after ebx)sub esp, 16 ; ebx, ret addr, 1st param, 2nd parampush ebx ; restore return addressmov ebx, [esp-12] ; 16-4 for return addressret

    6

  • NASM: cdecl calling convention

    We will consider the cdecl calling convention.

    To avoid the pointer arithmetic we usually follow the conven-tion

    1. On entry to the function the return address is in [esp],the first parameter in [esp+4], etc.

    2. Save ebp:push ebp

    3. Save the current stack pointer in ebp:mov ebp, esp

    4. Now the saved ebp is in [ebp], the return address is in[ebp+4], the first parameter in [esp+8], etc.

    5. Before ret, remember to restore ebp:pop ebp

    7

  • NASM: calling assembler from C

    /* fact.c */

    #include

    extern int factorial(int);

    int main(void){printf("%d\n", factorial(5));return 0;

    }

    8

  • NASM: calling assembler from C++

    // fact.cpp#include

    using namespace std;

    extern "C" int factorial(int);

    int main(void){cout

  • NASM: calling assembler from C/C++

    ; fact.asmSECTION .text ; program

    global factorial ; linuxglobal _factorial ; windows

    factorial:_factorial:

    push ebp ; save base pointermov ebp, esp ; store stack pointerpush ecx ; save ecx;;; start functionmov ecx, [ebp+8] ; ecx = first argumentmov eax, 1 ; eax = 1

    mainloop:cmp ecx, 0 ; if(ecx == 0)jz done ; goto donemul ecx ; else eax = eax * ecxdec ecx ; ecx = ecx - 1jmp mainloop

    done:pop ecx ; restore ecxpop ebp ; restore ebpret ; return from function

    10

  • NASM: calling assembler from C/C++

    Compiling fact.asm on LINUX:

    nasm -f elf -o factasm.o fact.asmgcc -o fact fact.c factasm.og++ -o fact fact.c factasm.o

    Compiling fact.asm on WINDOWS:

    nasm -f win32 -o factasm.o fact.asmgcc -o fact.exe fact.c factasm.og++ -o fact.exe fact.c factasm.o

    11

  • NASM: calling C from assembler

    /* useprintf.c */#include

    extern void use_printf();

    int main(void){use_printf();return 0;

    }

    12

  • NASM: calling C from assembler

    ; printf.asmSECTION .text

    extern printf ; linuxextern _printf ; windowsglobal use_printf ; linuxglobal _use_printf ; windows

    message:db "output some text", 10, 0 ; newline, null terminator

    use_printf:_use_printf:

    push messagecall printfpop eax ; first parameterret

    13

  • NASM: calling C from assembler

    Compiling printf.asm on LINUX:

    nasm -f elf -o printf.o printf.asmgcc -o useprintf useprintf.c printf.o

    Compiling printf.asm on WINDOWS:

    nasm -f win32 -o printf.o printf.asmgcc -o useprintf.exe useprintf.c printf.o

    14

  • NASM: main program in assembler

    #include

    int main(int argc, char *argv[]){int integer1, integer2;printf("Enter the first number: ");scanf("%d", &integer1);printf("Enter the second number: ");scanf("%d", &integer2);printf("%d\n", integer1+integer2);return 0;

    }

    15

  • NASM: main program in assembler

    ; add1.asm

    SECTION .data

    message1: db "Enter the first number: ", 0message2: db "Enter the second number: ", 0formatin: db "%d", 0formatout: db "%d", 10, 0 ; newline, nul terminator

    integer1: times 4 db 0 ; 32-bits integer = 4 bytesinteger2: times 4 db 0 ;

    SECTION .text

    global main ; linuxglobal _main ; windows

    extern scanf ; linuxextern printf ; linuxextern _scanf ; windowsextern _printf ; windows

    16

  • NASM: main program in assembler (cont.)

    ; add1.asm_main:main:

    push ebx ; save registerspush ecx

    push message1call printfadd esp, 4 ; remove parameters

    push integer1 ; address of integer1 (second parameter)push formatin ; arguments are right to left (first parameter)call scanfadd esp, 8 ; remove parameters

    push message2call printfadd esp, 4 ; remove parameters

    push integer2 ; address of integer2push formatin ; arguments are right to leftcall scanfadd esp, 8 ; remove parameters

    17

  • NASM: main program in assembler (cont.)

    ; add1.asm (main)

    mov ebx, dword [integer1]mov ecx, dword [integer2]add ebx, ecx ; add the values

    push ebxpush formatoutcall printf ; display the sumadd esp, 8 ; remove parameters

    pop ecxpop ebx ; restore registers in reverse ordermov eax, 0 ; no errorret

    18

  • NASM: main program in assembler (cont.)

    Compiling add1.asm on LINUX:

    nasm -f elf -o add1.o add1.asmgcc -o add1 add1.o

    Compiling add1.asm on WINDOWS:

    nasm -f win32 -o add1.o add1.asmgcc -o add1.exe add1.o

    19

  • NASM: command line arguments

    #include #include

    int main(int argc, char *argv[]){int integer1, integer2;integer1 = atoi(argv[1]);integer2 = atoi(argv[2]);printf("%d\n", integer1+integer2);return 0;

    }

    Below we provide our own atoi function with a custom callingconvention.

    20

  • NASM: command line arguments

    ; add2.asm

    SECTION .text

    global main ; linuxglobal _main ; windows

    extern printf ; linuxextern _printf ; windows

    21

  • NASM: command line arguments (cont.)

    ; add2.asm

    ; we assume the argument is in eaxatoi:

    push ebxpush ecxmov ebx, eaxmov eax, 0

    atoi_repeat:cmp byte [ebx], 0je atoi_donemov ecx, 10mul ecx ; eax *= 10, make place for the next digitmov cl, byte [ebx]sub cl, 0add eax, ecxinc ebxjmp atoi_repeat

    atoi_done:pop ecxpop ebxret

    22

  • NASM: command line arguments (cont.)

    ; add2.asm

    _main:main:

    push ebpmov ebp, esppush ebx ; save registerspush ecx

    cmp dword [ebp+8], 3 ; argcjne main_end

    mov ebx, [ebp+12] ; argv

    mov eax, [ebx+4] ; argv[1]call atoimov ecx, eax ; store the result

    mov eax, [ebx+8] ; argv[2]call atoi

    add eax, ecx ; add the values

    23

  • NASM: command line arguments (cont.)

    ; add2.asm (main)

    push eaxpush formatoutcall printf ; display the sumadd esp, 8 ; remove parameters

    main_end:pop ecxpop ebx ; restore registers in reverse orderpop ebpmov eax, 0 ; no errorret

    SECTION .data

    formatout: db "%d", 10, 0 ; newline, nul terminator

    24

  • NASM: command line arguments (cont.)

    Compiling add2.asm on LINUX:

    nasm -f elf -o add2.o add2.asmgcc -o add2 add2.o

    Compiling add2.asm on WINDOWS:

    nasm -f win32 -o add2.o add2.asmgcc -o add2.exe add2.o

    25

  • NASM: DOS interrupts

    Operating systems provide different mechanisms for using theoperating system facilities. The DOS operating system usesinterrupts, specifically interrupt 21h. Interrupts are similar tofunctions using CALL and RET, however parameters are usu-ally passed via registers and we use INT and IRET. Hardwareevents also trigger interrupts which is used when writing devicedrivers.

    A list of interrupt calls can be found at:

    http://www.cs.cmu.edu/~ralf/files.htmlhttp://www.ctyme.com/rbrown.htm

    26

  • NASM: DOS interrupts

    DOS provides functionality via interrupt 21h. The requestedfunction must be specified in AH. In the example program weuse the following functions.

    AH Function02h Write the ASCII character in DL to stdout09h Write the $ terminated string at DS:DX to stdout0Bh Check stdin (AL = 0 no character available)2Ch Get time (hour: CH, minute: CL, second: DH)4Ch Terminate and return to DOS with return code AL

    27

  • NASM: DOS clock

    The following program displays a clock. Whenever the timechanges, the displayed time must be updated.

    The ORG directive is used to tell NASM that the programmstarts at byte 100h, which is the format for a DOS .COM file.In this format, all segment registers point to the same segment,so it is not neccessary to change DS when printing a string.

    ORG 100h ; DOS COM file

    main:MOV AH, 0Bh ; check stdinINT 21h ; call DOSCMP AL, 0 ; no characterJNE return ; user pressed a key - exit

    28

  • NASM: DOS clock

    MOV AH, 2Ch ; get timeINT 21h ; call DOS

    CMP DH, BHJE main ; seconds did not changeMOV BH, DH

    PUSH DXMOV AH, 09h ; write stringMOV DX, clear_lineINT 21hPOP DX

    29

  • NASM: DOS clock

    XOR AX, AXMOV AL, CH ; hourCALL show_integer

    MOV AH, 02hMOV DL, :INT 21h

    XOR AX, AXMOV AL, CL ; minuteCALL show_integer

    MOV AH, 02hMOV DL, :INT 21h

    XOR AX, AXMOV AL, DH ; secondCALL show_integer

    JMP main

    30

  • NASM: DOS clock

    return:MOV AH, 09h ; write stringMOV DX, new_lineINT 21h

    MOV AH, 4Ch ; terminate programMOV AL, 0 ; no errorINT 21h ; call dos

    clear_line: db " ", 13, $new_line: db 10, 13, $buffer: times 100 db 0

    db $

    31

  • NASM: DOS clock

    ; show ascii representation of AXshow_integer:PUSH BXPUSH CXPUSH DXMOV BX, bufferADD BX, 100MOV CX, 10

    show_integer_loop:CMP AX, 0JE show_integer_doneMOV DX, 0DIV CXADD DL, 0DEC BX ; write number backwardsMOV BYTE [BX], DLJMP show_integer_loop

    32

  • NASM: DOS clock

    show_integer_done:CMP BYTE [BX], $ ; empty stringJNE not_zeroDEC BXMOV BYTE [BX], 0

    not_zero:MOV AH, 09h ; write stringMOV DX, BXINT 21hPOP DXPOP CXPOP BXRET

    33