55
Chapter 2: Assemblers Mrs. Sunita M Dol (Aher), Assistant Professor, Computer Science and Engineering Department, Walchand Institute of Technology, Solapur, Maharashtra

Chapter 2a Assembler-Introduction

Embed Size (px)

Citation preview

Chapter 2: Assemblers

Mrs. Sunita M Dol (Aher),Assistant Professor,

Computer Science and Engineering Department,Walchand Institute of Technology, Solapur, Maharashtra

05/02/2023 Mrs. Sunita M Dol, CSE Dept 2

• Elements of Assembly Language Programming• A simple Assembly Scheme• Pass Structure of Assemblers• Design of a two pass Assembler• A Single Pass Assembler for IBM PC

2. Assemblers

05/02/2023 Mrs. Sunita M Dol, CSE Dept 3

• Elements of Assembly Language Programming• A simple Assembly Scheme• Pass Structure of Assemblers• Design of a two pass Assembler• A Single Pass Assembler for IBM PC

2. Assemblers

05/02/2023 Mrs. Sunita M Dol, CSE Dept 4

Assembly Language Programming

• An assembly language is a machine dependent, low level programming language which is specific to a certain computer system.

• Compared to machine language of computer, it provides three basic features:– Mnemonic operation codes

• Eliminates the need to memorize numeric operation code .

– Symbolic operands• Symbolic names can be used.

– Data declaration• Data can be declared in any form e.g. -5, 10.5 etc.

05/02/2023 Mrs. Sunita M Dol, CSE Dept 5

AssemblerAssembly Language Program

Machine Language

Figure: Assembler

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 6

• Statement Format

[Label] <Opcode> <operand spec> [<operand spec>….]1. Label:- Is optional.2. Opcode:- Symbolic opcode3. Operand Spec:- <operand spec> has the following syntax

<symbolic name> [+<displacement>][(<index register>)] e.g. AREA

AREA + 5 AREA (4) AREA + 5(4)

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 7

• A Simple Assembly Language– Each statement has two operands

• The first operand is always a register (AREG, BREG, CREG and DREG)

• The second operand refers to a memory word.

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 8

Instruction Opcode

Assembly Mnemonic

Remarks

00 STOP Stop Execution

01 ADD Op1 Op1+ Op2

02 SUB Op1 Op1 – Op2

03 MULT Op1 Op1* Op2

04 MOVER CPU Reg Memory operand05 MOVEM Memory CPU Reg06 COMP Sets Condition Code

07 BC Branch on Condition

08 DIV Op1 Op1/ Op209 READ Operand 2 input Value10 PRINT Output Operand2

Figure: Mnemonic Operation Codes

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 9

• Instruction Format

Assembly Language Programming

sign opcode register operand

memory operand

Figure: Instruction Format

05/02/2023 Mrs. Sunita M Dol, CSE Dept 10

• Assembly Language to Machine Language– Find address of variables and labels.– Replace Symbolic address by numeric address.– Replace Symbolic opcode by machine opcode.– Reserve storage for data.

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 11

START 101READ NMOVER BREG, ONEMOVEM BREG, TERM

AGAIN MULT BREG, TERMMOVER CREG, TERMADD CREG, ONEMOVEM CREG, TERMCOMP CREG, NBC LE, AGAINMOVEM BREG, RESULTPRINT RESULTSTOP

N DS 1RESULT DS 1ONE DC ‘1’TERM DS 1

END

Assembly Language Programming

Figure: Sample program to find n!

05/02/2023 Mrs. Sunita M Dol, CSE Dept 12

START 101READ N 101) + 09 0 113MOVER BREG, ONE 102) + 04 2 115MOVEM BREG, TERM 103) + 05 2 116

AGAIN MULT BREG, TERM 104) + 03 2 116MOVER CREG, TERM 105) + 04 3 116ADD CREG, ONE 106) + 01 3 115MOVEM CREG, TERM 107) + 05 3 116COMP CREG, N 108) + 06 3 116BC LE, AGAIN 109) + 07 2 104MOVEM BREG, RESULT 110) + 05 2 114PRINT RESULT 111) + 10 0 114STOP 112) + 00 0 000

N DS 1 113)RESULT DS 1 114)ONE DC ‘1’ 115) + 00 0 001TERM DS 1 116)

END

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 13

Variable Address

AGAIN 104

N 113

RESULT 114

ONE 115

TERM 116

Figure: After LC Processing

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 14

LC Opcode Register Address101 09 0 113

102 04 2 115

103 05 2 116

104 03 2 116

105 04 3 116

106 01 3 115

107 05 3 116

108 06 3 113

109 07 2 104

110 05 2 114

111 10 0 114

112 00 0 000

113

114

115 00 0 001

116

Assembly Language Programming• Machine Code

05/02/2023 15

START 101READ XREAD YMOVER AREG, XADD AREG, YMOVEM AREG, RESULTPRINT RESULTSTOP

X DS 1Y DS 1

RESULT DS 1END

Figure: Sample program to find X+Y

Assembly Language Programming

Mrs. Sunita M Dol, CSE Dept

05/02/2023 Mrs. Sunita M Dol, CSE Dept 16

START 101READ XREAD YMOVER AREG, XADD AREG, YMOVEM AREG, RESULTPRINT RESULTSTOP

X DS 1Y DS 1RESULT DS 1

END

LC101102103104105106107108109110

+ 09 0 108+ 09 0 109+ 04 1 108+ 01 1 109+ 05 0 110+ 10 0 110+ 00 0 000

Opcode Register Memory operand

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 17

Variable Address

X 108

Y 109

RESULT 110

Figure: After LC Processing

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 18

LC Opcode Register Address101 09 0 108

102 09 0 109

103 04 1 108

104 01 1 109

105 05 0 110

106 10 0 110

107 00 0 000

108

109

110

111

Assembly Language Programming• Machine Code

05/02/2023 Mrs. Sunita M Dol, CSE Dept 19

• Assembly Language Statement– Imperative Statement.

• Indicates an action to be taken during execution of a program.• Eg: MOV, ADD, MULT, etc.

– Declaration Statement.• To reserve memory for variable. [Label] DS <constant> eg: X DS 5 [Label] DC ‘<value>’ eg: X DC ‘3’

– Assembler Directives• Instructs the assembler to perform certain action during

assembly of a program.

START <constant>

END

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 20

• Literals and Constants– Literal cannot be changed during program execution– Literal is more safe and protected than a constant.– Literals appear as a part of the instruction.– e.g.

ADD AREG, =‘5’ => ADD AREG, FIVE--------------------

FIVE DC ‘5’

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 21

• Assembler Directives– START <constant> : The first word of the target program

by the assembler should be placed in the memory with the address <constant>e.g. START 100

– END [<operand spec>] : It indicates the end of source program. The optional <operand spec> indicates the address of instruction where the execution of the program should begin.

Assembly Language Programming

05/02/2023 Mrs. Sunita M Dol, CSE Dept 22

• Advantages of Assembly Language

Assembly Language Programming

START 101READ XREAD YMOVER AREG, XADD AREG, YDIV AREG, TWOMOVEM AREG, RESULTPRINT RESULTSTOP

X DS 1Y DS 1RESULT DS 1TWO DC ‘2’

END

LC101102103104105106107108109110111112

+ 09 0 109+ 09 0 110+ 04 1 109+ 01 1 110+ 08 2 112+ 05 0 111+ 10 0 111+ 00 0 000

+ 00 0 002

Opcode Register Memory operand

05/02/2023 Mrs. Sunita M Dol, CSE Dept 23

• Elements of Assembly Language Programming• A Simple Assembly Scheme• Pass Structure of Assemblers• Design of a two pass Assembler• A Single Pass Assembler for IBM PC

2. Assemblers

05/02/2023 Mrs. Sunita M Dol, CSE Dept 24

• Design Specification for an Assembler– Four step approach

• Identify the information necessary to perform a task.• Design a suitable data structure to record the information.• Determine the processing necessary to obtain and maintain the

information.• Determine the processing necessary to perform the task.

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 25

• Synthesis Phase– Consider instruction MOVER AREG, X

A Simple Assembly Scheme

START 101READ XREAD YMOVER AREG, XADD AREG, YMOVEM AREG, RESULTPRINT RESULTSTOP

X DS 1Y DS 1RESULT DS 1

END

LC101102103104105106107108109110

+ 09 0 108+ 09 0 109+ 04 1 108+ 01 1 109+ 05 0 110+ 10 0 110+ 00 0 000

05/02/2023 Mrs. Sunita M Dol, CSE Dept 26

• Synthesis Phase– we must have the following information to synthesize the

machine instruction corresponding to this statement:• Address of the memory word with which name X is

associated.• Machine operation code corresponding to the mnemonic

MOVER.– The first item of information depends on the source

program. Hence it must be available by the analysis phase. The second item of information does not depend on the source program

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 27

• Synthesis Phase– we consider the use of two data structures during the

synthesis phase:• Symbol table• Mnemonics table

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 28

• Analysis Phase– The primary function performed by the analysis phase is

the building of the symbol table. – For this purpose it must determine the address with

which the symbolic names in a program are associated. – To determine the address of any instruction, we must fix

the address of all program elements preceding it. This function is called memory allocation.

– To implement memory allocation a data structure called location counter is introduced.

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 29

• Analysis Phase– It is initialized to the constant specified in the START

statement.– The location counter is always made to contain the

address of the next memory word in the target program.

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 30

A Simple Assembly Scheme

Analysis Phase

Synthesis Phase

Mnemonic Opcode lengthADD 01 1

SUB 02 1

Source Program

Target Program

Figure: Overview of Two Pass Assembler

Symbol AddressAGAIN 104

N 113

Symbol Table

Mnemonic Table

Data access

Control access

05/02/2023 Mrs. Sunita M Dol, CSE Dept 31

• Analysis Phase– Isolate the label, mnemonic opcode and operand fields

of a statement.– If a label is present, enter the pair(symbol, <LC counter>)

in a new entry of the symbol table– Check the validity of the mnemonic opcode through a

look-up in the mnemonic table.– Perform LC processing i.e. update the value contained in

LC by considering the opcode and operands of the statement.

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 32

• Synthesis phase– Obtain the machine opcode corresponding to the

mnemonic from mnemonics table.– Obtain address of a memory operand from the symbol

table.– Synthesize a machine instruction or the machine form of

a constant, as the case may be.

A Simple Assembly Scheme

05/02/2023 Mrs. Sunita M Dol, CSE Dept 33

• Elements of Assembly Language Programming• A Simple Assembly Scheme• Pass Structure of Assemblers• Design of a two pass Assembler• A Single Pass Assembler for IBM PC

2. Assemblers

05/02/2023 Mrs. Sunita M Dol, CSE Dept 34

• Two pass translation– Two pass translation of an assembly language program

can handle forward references easily. – LC processing is performed in the first pass and symbols

defined in the program are entered into the symbol table. – The second pass synthesizes the target form using the

address information found in the symbol table.

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 35

• Two pass translation– The first pass constructs an intermediate representation

of the source program for use by the second pass .– This representation consists of two main components—

• data structures, e.g. the symbol table, and • a processed form of the source program

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 36

Pass I Pass II

Intermediate Code

Source Program

Target Program

Figure: Overview of Two Pass Assembler

Data Structures

Data access

Control access

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 37

• Single pass translation– LC processing and construction of the symbol table

proceed as in two pass translation. – The problem of forward references is tackled using a

process called backpatching. – The operand field of an instruction containing a forward

reference is left blank initially. – The address of the forward referenced symbol is put into

this field when its definition is encountered.

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 38

• Single pass translation– The need for inserting the operand’s address at a later

stage can be indicated by adding an entry to the table of incomplete instructions (TII).

– In TII, each entry is a pair (<instruction address>,<symbol>)

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 39

• Single pass translation– By the time the END statement is processed, the symbol

table would contain the addresses of all symbols defined in the source program and TII would contain information describing all forward references.

– The assembler can now process each entry in TII to complete the concerned instruction.

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 40

Pass Structure of Assembler

• Single Pass Translation– The problem of forward reference can be handled

using a technique called as back patching.– The need for inserting the second operand’s address

at a later stage can be indicated by adding an entry to the Table of Incomplete Instruction (TII)

– The entry in TII is a pair(<instruction address>, <symbol>)

05/02/2023 Mrs. Sunita M Dol, CSE Dept 41

• Single Pass Translation Example

START 100MOVER AREG, XADD BREG, ONEADD CREG, TENSTOP

X DC ‘5’ONE DC ‘1’TEN DC ‘10’

END

Pass Structure of Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 42

START 100MOVER AREG, XADD BREG, ONEADD CREG, TENSTOP

X DC ‘5’ONE DC ‘1’TEN DC ‘10’

END

100 04 1 _ _ _101 01 2 _ _ _102 06 3 _ _ _103 00 0 000104 105 106

Instruction Address

Symbol Making a forward reference

100 X

101 ONE

102 TEN

Figure : TII (Table of Incomplete Instruction)

Pass Structure of Assembler

Symbol AddressX 104

ONE 105

TEN 105

Figure : Symbol Table

• Single Pass Translation Example

05/02/2023 Mrs. Sunita M Dol, CSE Dept 43

Pass Structure of Assembler

START 100MOVER AREG, XADD BREG, ONEADD CREG, TENSTOP

X DC ‘5’ONE DC ‘1’TEN DC ‘10’

END

100 04 1 104101 01 2 105102 06 3 106103 00 0 000104 105 106

• Single Pass Translation Example

05/02/2023 Mrs. Sunita M Dol, CSE Dept 44

• Elements of Assembly Language Programming• A Simple Assembly Scheme• Pass Structure of Assemblers• Design of a Two Pass Assembler• A Single Pass Assembler for IBM PC

2. Assemblers

05/02/2023 Mrs. Sunita M Dol, CSE Dept 45

Design of a Two Pass Assembler

• Pass I:-1. Separate the symbol, mnemonic, opcode and

operand.2. Build Symbol Table.3. Perform LC Processing.4. Construct Intermediate Representation.

• Pass II:-1. Process IR to synthesize the target program.

05/02/2023 Mrs. Sunita M Dol, CSE Dept 46

• Advanced Assembler Directives– ORIGIN:

• The syntax isORIGIN <address specification>where address specification is an <operand spec> or <constant>

• This directives indicates that LC should be set to the address given by <address spec>.

• The ORIGIN is useful when the target program does not consist of consecutive memory word.

Design of a Two Pass Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 47

Design of a Two Pass Assembler1 START 2002 MOVER AREG, =’5’ 200) + 04 1 2113 MOVEM AREG, A 201) + 05 1 2174 LOOP MOVER AREG, A 202) + 04 1 2175 MOVER CREG, B 203) + 05 3 2186 ADD CREG, =’1’ 204) + 01 2 2127 …..12 BC ANY, NEXT 210) + 07 6 21413 LTORG

=’5’ 211) + 00 0 005=’1’ 212) + 00 0 001

14 …..15 NEXT SUB AREG, =’1’ 214) + 02 1 21916 BC LT, BACK 215) + 07 1 20217 LAST STOP 216) + 00 0 00018 ORIGIN LOOP+219 MULT CREG, B 204) + 03 3 21820 ORIGIN LAST+121 A DS 1 217)22 BACK EQU LOOP23 B DS 1 218)24 END25 =’1’ 219)

Figure : An assembly language illustrating ORIGIN

Statement number 18 sets LC to the value 204 since the symbol LOOP is associated with the address 202

05/02/2023 Mrs. Sunita M Dol, CSE Dept 48

• Advanced Assembler Directives– EQU:

• The syntax is<symbol> ORIGIN <address specification> where address specification is an <operand spec> or <constant>• The EQU statement defines the symbol to represent <address

spec>.• No LC processing is implied.

Design of a Two Pass Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 49

Design of a Two Pass Assembler1 START 2002 MOVER AREG, =’5’ 200) + 04 1 2113 MOVEM AREG, A 201) + 05 1 2174 LOOP MOVER AREG, A 202) + 04 1 2175 MOVER CREG, B 203) + 05 3 2186 ADD CREG, =’1’ 204) + 01 2 2127 …..12 BC ANY, NEXT 210) + 07 6 21413 LTORG

=’5’ 211) + 00 0 005=’1’ 212) + 00 0 001

14 …..15 NEXT SUB AREG, =’1’ 214) + 02 1 21916 BC LT, BACK 215) + 07 1 20217 LAST STOP 216) + 00 0 00018 ORIGIN LOOP+219 MULT CREG, B 204) + 03 3 21820 ORIGIN LAST+121 A DS 1 217)22 BACK EQU LOOP23 B DS 1 218)24 END25 =’1’ 219)

Figure : An assembly language illustrating ORIGIN

Statement 22 introduces the symbol BACK to represent the operand LOOP.

05/02/2023 Mrs. Sunita M Dol, CSE Dept 50

• Advanced Assembler Directives– LTORG:

• The LTORG statement permits a programmer to specify where literals should be placed .

• By default, assembler places the literals after the END statement.

• At every LTORG statement, as also at the END statement, the assembler allocates memory to the literals of a literal pool.

• The pool contains all literals used in the program since start of the program or since the last LTORG statement.

Design of a Two Pass Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 51

Design of a Two Pass Assembler1 START 2002 MOVER AREG, =’5’ 200) + 04 1 2113 MOVEM AREG, A 201) + 05 1 2174 LOOP MOVER AREG, A 202) + 04 1 2175 MOVER CREG, B 203) + 05 3 2186 ADD CREG, =’1’ 204) + 01 2 2127 …..12 BC ANY, NEXT 210) + 07 6 21413 LTORG

=’5’ 211) + 00 0 005=’1’ 212) + 00 0 001

14 …..15 NEXT SUB AREG, =’1’ 214) + 02 1 21916 BC LT, BACK 215) + 07 1 20217 LAST STOP 216) + 00 0 00018 ORIGIN LOOP+219 MULT CREG, B 204) + 03 3 21820 ORIGIN LAST+121 A DS 1 217)22 BACK EQU LOOP23 B DS 1 218)24 END25 =’1’ 219)

Figure : An assembly language illustrating ORIGIN

The literals =’5’ and =’1’ are added to the literal pool in the statement 2 and 6 respectively. The first LTORG statement allocates the addresses 211 and 212 to the values ‘5’ and ‘1’. A new literal pool is now started. The value ‘1’ is put into this pool in statement 15.

05/02/2023 Mrs. Sunita M Dol, CSE Dept 52

START 200MOVER AREG, =‘5’MOVEM AREG, X

L1 MOVER BREG, =‘2’ORIGIN L1+3LTORG

NEXT ADD AREG,=‘1’SUB BREG,=‘2’BC LT, BACKLTORG

BACK EQU L1ORIGIN NEXT+5MULT CREG,=‘4’STOP

X DS 1END

LC

200 +04 1 205201 +05 1 214202 +04 1 206

205 +00 0 005206 +00 0 002207 +01 1 210208 +02 2 211209 +07 1 202

210 +00 0 001211 +00 0 002

212 +03 3 215213 +00 0 000214215 +00 0 004

Design of a Two Pass Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 53

• Pass I uses the following data structures1. Machine Opcode table (OPTAB)2. Symbol Table (SYMTAB)3. Literal Table (LITTAB)4. Pool Table (POOLTAB)

Design of a Two Pass Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 54

• Pass-I Data Structures– OPTAB

• OPTAB contains the fields – Mnemonic opcode – Class : the class indicate whether the opcode corresponding to

an imperative statement (IS), a declaration statement (DL) or an assembler directives (AD).

– Mnemonic info: If an imperative statement, the mnemonic info field contains the pair (machine opcode, instruction length) else it contains the id of a routine to handle the declaration or directive statement.

• A SYMTAB entry contains the fields address and length.• A LITTAB entry contains the fields literal and address.

Design of a Two Pass Assembler

05/02/2023 Mrs. Sunita M Dol, CSE Dept 55

START 200MOVER AREG, =‘5’MOVEM AREG, X

L1 MOVER BREG, =‘2’ORIGIN L1+3LTORG

NEXT ADD AREG,=‘1’SUB BREG,=‘2’BC LT, BACKLTORG

BACK EQU L1ORIGIN NEXT+5MULT CREG,=‘4’STOP

X DS 1END

LC200 +04 1 205201 +05 1 214202 +04 1 206

205 +00 0 005206 +00 0 002207 +01 1 210208 +02 2 211209 +07 1 202

210 +00 0 001211 +00 0 002

212 +03 3 215213 +00 0 000214215 +00 0 004