CPU CISC example:

Intel x86

Calcolatori Elettronici e Sistemi Operativi

x86: history

CPU         | Year | Data Bus | Max. Mem. | Transistors  | Clock MHz | Av. MIPS | Level-1 Caches
8086        | 1978 | 16       | 1 MB      | 29K          | 5-10      | 0.8      |
80286       | 1982 | 16       | 16 MB     | 134K         | 8-12      | 2.7      |
80386       | 1985 | 32       | 4 GB      | 275K         | 16-33     | 6        |
80486       | 1989 | 32       | 4 GB      | 1.2M         | 25-100    | 20       | 8 KB
Pentium     | 1993 | 64       | 4 GB      | 3.1M         | 60-233    | 100      | 8K Instr + 8K Data
Pentium Pro | 1995 | 64       | 64 GB     | 5.5M + 15.5M | 150-200   | 440      | 8K + 8K + Level2
Pentium II  | 1997 | 64       | 64 GB     | 7M           | 266-450   | 466-     | 16K + 16K + L2
Pentium III | 1999 | 64       | 64 GB     | 8.2M         | 500-1000  | 1000-    | 16K + 16K + L2
Pentium 4   | 2001 | 64       | 64 GB     | 42M          | 1300-2000 |          | 8K + L2

x86: history

• Intel introduced microprocessors in 1971
  - 4-bit microprocessor 4004 (1971)
  - 8-bit microprocessors
    - 8008 (1972)
    - 8080 (1974)
    - 8085 (1975)
  - 16-bit processors
    - 8086 introduced in 1978 (first x86 CPU)
      - 20-bit address bus, 16-bit data bus
    - 8088 (1979)
      - a less expensive version of 8086
      - Uses 8-bit data bus
    - Can address up to 4 segments of 64 KB
    - Referred to as the real mode

x86: history

• 80186 (1982)
  - A faster version of 8086
  - 16-bit data bus and 20-bit address bus
  - Improved instruction set
• 80286 (1982)
  - 24-bit address bus
    - 16 MB address space
  - Enhanced with memory protection capabilities
  - Introduced protected mode
    - Segmentation in protected mode is different from the real mode
  - Backwards compatible

x86: history

• 80386 (1985) (first 32-bit CPU)
  - First 32-bit processor
  - 32-bit data bus and 32-bit address bus
  - 4 GB address space
  - Segmentation can be turned off (flat model)
  - Introduced paging
• 80486 (1989)
  - Improved version of 386
  - Combined coprocessor functions for performing floating-point arithmetic
  - Added parallel execution capability to instruction decode and execution units
    - Achieves scalar execution of 1 instruction/clock
  - Later versions introduced energy savings for laptops

x86: history

• Pentium (1993)
  - Similar to 486 but with 64-bit data bus
  - Wider internal datapaths
    - 128- and 256-bit wide
  - Added second execution pipeline
    - Superscalar performance
    - Two instructions/clock
  - Doubled on-chip L1 cache
    - 8 KB data
    - 8 KB instruction
  - Added branch prediction

x86: history

• Pentium Pro (1995)
  - Three-way superscalar
    - 3 instructions/clock
  - 36-bit address bus
    - 64 GB address space
  - Introduced dynamic execution
    - Out-of-order execution
    - Speculative execution
  - In addition to the L1 cache
    - Has 256 KB L2 cache

x86: history

• Pentium II (1997)
  - Introduced multimedia (MMX) instructions
  - Doubled on-chip L1 cache
    - 16 KB data
    - 16 KB instruction
  - Introduced comprehensive power management features
    - Sleep
    - Deep sleep
  - In addition to the L1 cache
    - Has 256 KB L2 cache
• Pentium III, Pentium 4, ...
• Pentium 4F (2005): first x86-64

IA-32: P6

• 3-way superscalar, 12-stage pipeline
• branch prediction
• out-of-order execution
• speculative execution
• modes of operation
  - real mode (emulates an 8086)
  - protected mode (32-bit environment)
  - system management mode

Example

• Core i7-3970X: Sandy Bridge-E (6 cores) 32 nm (2.27 billion transistors)
• Caches:
  - µ-ops cache: 1536 µ-ops per core
  - L1: 32 KB (I$) + 32 KB (D$) per core [8-way, line: 16-B]
  - L2: 256 KB per core [8-way, line: 64-B]
  - L3: 15 MB shared [16-way, line: 64-B]
• 3.5 GHz (memory bus: 800 MHz) - 150 W
• Pipeline: 19 stages
  - µ-op hit → 5 stages skipped
• 4 instruction decoders (instruction to µ-ops translators)
• SIMD instructions
  - MMX
  - SSE, SSE2, SSE3, SSE4
  - AES instructions
  - AVX: Advanced Vector Extensions
• EM64T: Extended Memory 64 technology
• NX / XD / Execute disable bit
• HT: Hyper-Threading technology (Hardware multithreading: factor 2)
• Virtualization support
  - VT-x: Virtualization technology
  - VT-d: Virtualization for directed I/O
• TBT: Turbo Boost technology
• Enhanced SpeedStep technology

Operating modes

• Real-address mode
  - Behaves as an 8086 (with a few extensions)
• Protected mode
  - Native operating mode
• System management mode
  - To handle power management and OEM variants
• Virtual-8086 mode
  - To emulate an 8086 inside the protected mode

GP registers

Register: special use

• EAX: accumulator for operands and results data
• EBX: pointer to data
• ECX: counter for string and loop operations
• EDX: I/O pointer
• ESI: pointer to data; source pointer for string operations
• EDI: pointer to data (ES segment); destination pointer for string operations
• ESP: stack pointer
• EBP: pointer to data on the stack

8086: registers

[Figure: 8086 and 8087 register sets]

• Eight 16-bit GP registers: AX (AH, AL), BX (BH, BL), CX (CH, CL), DX (DH, DL), SI, DI, BP, SP
• Four 16-bit segment registers: CS, DS, SS, ES
• FLAGS: 16-bit status-flags register
• IP: 16-bit instruction pointer
• 8087: eight 80-bit FP registers (R0-R7) and 16-bit FP control registers (CR, SR, TR)

I-fetch: MEM[CS<<4 + IP]

D-fetch: MEM[DS<<4 + address] (other segment selectors can be forced)

mov AX, [BX+4]

mov CX, CS:[SI+4]

stack access: MEM[SS<<4 + SP]

POP AX

PUSH BX
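The address formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `real_mode_address` and the register values are assumptions made for the example, and the wrap to 20 bits models the 20 address lines of the 8086.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # MEM[segment<<4 + offset]; the 8086 has only 20 address lines, so wrap.
    return ((segment << 4) + offset) & 0xFFFFF


# D-fetch for "mov AX, [BX+4]" with assumed values DS = 0x1234, BX = 0x0010
print(hex(real_mode_address(0x1234, 0x0010 + 4)))  # 0x12354
```

The same function covers I-fetch (pass CS and IP) and stack access (pass SS and SP), since all three use the same shift-and-add rule.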

IA-32: registers

[Figure: IA-32 register sets]

• Eight 32-bit GP registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
  - the low 16 bits are AX, BX, CX, DX, SI, DI, BP, SP
  - AX, BX, CX, DX are further split into AH/AL, BH/BL, CH/CL, DH/DL
• Six 16-bit segment registers: CS, DS, SS, ES, FS, GS
• EFLAGS: 32-bit status-flags register
• EIP: 32-bit instruction pointer
• Eight 80-bit FP registers: R0-R7
• FP control registers (16-bit): CR, SR, TR
• IPR (48 bits), DPR (48 bits), OPR (11 bits)
• Eight 64-bit MMX registers: MMX0-MMX7
• Eight 128-bit XMM registers: XMM0-XMM7

Status-flags Register (EFLAGS)

Bit  | 31-22 | 21 | 20  | 19  | 18 | 17 | 16 | 15 | 14 | 13-12 | 11 | 10 | 9  | 8  | 7  | 6  | 5 | 4  | 3 | 2  | 1 | 0
Flag | 0     | ID | VIP | VIF | AC | VM | RF | 0  | NT | IOPL  | OF | DF | IF | TF | SF | ZF | 0 | AF | 0 | PF | 1 | CF

User flags:

OF: Overflow Flag

DF: Direction Flag (set by sw to control string operations: MOVS, CMPS, SCAS, LODS, STOS)

SF: Sign Flag

ZF: Zero Flag

AF: Auxiliary Carry Flag (carry generated from bit 3; used for BCD operations)

PF: Parity Flag (set if the least significant byte of the result contains an even number of 1 bits)

CF: Carry Flag
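As a rough illustration of how the status flags follow from an arithmetic result, the sketch below computes them for an 8-bit addition. The function name `add8_flags` and the dictionary layout are assumptions made for this example; the flag rules are the standard ones for ADD.

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and return the result plus the six status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                      # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),   # even number of 1 bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),       # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                            # copy of the sign bit
        # signed overflow: both operands differ in sign from the result
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }


print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```

The first call overflows as a signed addition (127 + 1) without producing a carry; the second produces a carry and a zero result without signed overflow.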

Status-flags Register (EFLAGS)

System flags:

ID: ID Flag (if writable, CPUID instruction is supported)

VIP: Virtual Interrupt Pending (to record that a virtual interrupt is pending: only written by sw)

VIF: Virtual Interrupt Flag (1: virtual interrupt enabled)

AC: Alignment Check (1: alignment check exceptions enabled)

VM: Virtual-8086 Mode (set to enable virtual-8086 mode)

RF: Resume Flag (1: debug exceptions disabled, to allow resuming after a breakpoint)

NT: Nested Task (1: a CALL, an interrupt, or an exception caused a task switch)

IOPL: I/O Privilege Level (max privilege level required for accessing IO address space)

IF: Interrupt Enable Flag (1: interrupt enabled)

TF: Trap Flag (1: single-step mode for debugging)
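A minimal sketch of decoding an EFLAGS value into flag names, using the standard bit positions (CF in bit 0 up to ID in bit 21, IOPL in bits 13-12). The names `EFLAGS_BITS` and `decode_eflags` are hypothetical helpers, not part of any real API.

```python
# Single-bit flags, keyed by bit position in EFLAGS.
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}


def decode_eflags(value: int):
    """Return (list of set flag names in bit order, IOPL as an integer 0-3)."""
    flags = [name for bit, name in sorted(EFLAGS_BITS.items())
             if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11  # two-bit field in bits 13-12
    return flags, iopl


print(decode_eflags(0x246))
```

Bit 1 is always read as 1 and is deliberately absent from the table, so it never appears in the decoded list.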

IA-32: other registers

• Control registers
  - CR0, CR1, CR2, CR3, CR4
  - CR0 also specifies whether protected mode is active
  - 4 privilege levels (0-2: privileged modes, 3: user mode)
• Memory management registers
  - GDTR, IDTR, LDTR
  - for protected mode memory management
• Memory type range registers (MTRRs)
• Debug registers
  - DR0, ..., DR7
• Machine specific registers (MSRs)
• Machine check registers
• Performance monitoring counters

Memory model

• Segmented and paged memory
  - Segment and offset: logical address
  - Logical address → Linear address
    - Segment Descriptor Table
  - Linear address → Physical address
    - Page Table

8086 memory model

• 16 bit processor
• 20 bit address bus
  - max addressable memory: 1 MB
• 16 bit data bus
  - 8 bit for 8088

20-bit address (physical) = (16 bit segment register << 4) + 16 bit address

(16 bit segment register << 4) is the 20-bit base address

8086 memory model

[Figure: a segment inside the address space, which spans 0 to 2^20 - 1]

• seg base (20 bit) = seg. register (16 bit) << 4
• offset: 16 bit, so a segment spans 2^16 B = 64 KB
• only address translation
• no protection

IA-32: memory models

• Segmented memory model
• Flat memory model
  - 32 bit linear address space
• Real-address memory model
  - for 8086 emulation

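IA-32: segmented memory model

[Figure: translation of a logical address into a linear address]

• Logical address: seg selector (16 bit, held in a seg register) and offset (32 bit)
• Bits (15:3) of the selector (13 bits) index the Segment Descriptor Table (Global or Local), located through GDTR or LDTR
• The segment descriptor holds access, base address, and limit
• Linear address (32 bit) = base address + offset

segment_selector : offset → logical address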
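The translation from logical to linear address can be sketched as follows. The `Descriptor` class, the `to_linear` function, and the use of Python lists as stand-ins for the GDT and LDT are all assumptions made for illustration; access-rights checks are left out, and the limit is treated as a byte-granular highest valid offset.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int   # 32-bit segment base address
    limit: int  # highest valid offset in bytes (simplified: G=0 only)


def to_linear(selector: int, offset: int, gdt: list, ldt: list) -> int:
    """Translate selector:offset into a 32-bit linear address."""
    index = selector >> 3                        # bits 15:3 pick the table entry
    table = ldt if selector & 0b100 else gdt     # TI bit: 0 = GDT, 1 = LDT
    desc = table[index]
    if offset > desc.limit:
        raise ValueError("offset beyond segment limit")
    return (desc.base + offset) & 0xFFFFFFFF


# Hypothetical GDT: entry 0 unused, entry 1 is a 64 KB segment at 0x00400000.
gdt = [None, Descriptor(base=0x00400000, limit=0xFFFF)]
print(hex(to_linear(0x08, 0x1234, gdt, [])))  # 0x401234
```

In a flat model every descriptor would have base 0 and the largest limit, so the linear address equals the offset.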

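IA-32: segmented memory model

[Figure: code segment and data segment inside the linear address space (0 to 2^32 - 1)]

• Code segment: CS (16 bit; 13 bits index the Segment Descriptor Table) selects a descriptor (access, base address, limit); IP (32 bit) is the offset inside the code segment
• Data segment: DS (16 bit; 13 bits index the Segment Descriptor Table) selects a descriptor (access, base address, limit); address (32 bit) is the offset inside the data segment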

IA-32: flat memory model

[Figure: flat model inside the linear address space (0 to 2^32 - 1)]

• CS, DS, ES, FS, GS, SS (16 bit each; 13 bits index the Segment Descriptor Table) all select a descriptor (access, base address, limit) that maps the same linear address space

IA-32: segment registers

Segment selector:

Bits  | 15-3  | 2  | 1-0
Field | Index | TI | RPL

TI: 0 GDT - 1 LDT

RPL: requested privilege level

Segment registers:

Register | Visible portion  | Hidden portion (shadow register)
CS       | Segment selector | Access information, Limit, Base address
DS       | Segment selector | Access information, Limit, Base address
ES       | Segment selector | Access information, Limit, Base address
FS       | Segment selector | Access information, Limit, Base address
GS       | Segment selector | Access information, Limit, Base address
SS       | Segment selector | Access information, Limit, Base address

In CS, the 2 low-order bits of the selector hold the current privilege level (CPL)
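The selector layout (Index in bits 15-3, TI in bit 2, RPL in bits 1-0) can be unpacked with two shifts and a mask. The helper name `decode_selector` is hypothetical and the selector values are made up for the example.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": (selector >> 3) & 0x1FFF,            # 13-bit table index
        "table": "LDT" if selector & 0b100 else "GDT",  # TI bit
        "rpl": selector & 0b11,                       # requested privilege level
    }


print(decode_selector(0x23))  # {'index': 4, 'table': 'GDT', 'rpl': 3}
```

Because each descriptor is 8 bytes long, masking off the low three bits of a selector (`selector & ~7`) gives the byte offset of the descriptor inside its table.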

IA-32: Segment Descriptors

Bits  | 63-56      | 55 | 54  | 53 | 52  | 51-48       | 47 | 46-45 | 44 | 43-40 | 39-32
Field | BASE 31:24 | G  | D/B | L  | AVL | LIMIT 19:16 | P  | DPL   | S  | TYPE  | BASE 23:16

Bits  | 31-16     | 15-0
Field | BASE 15:0 | LIMIT 15:0

L: 64-bit code segment

AVL: available for system software

BASE: base address

D/B: code segment: Default operation size (16 bit (0) or 32 bit (1)); data segment: address size for stack access (16 bit (0) or 32 bit (1, Big))

DPL: descriptor privilege level

G: granularity (byte (0) or page (1))

LIMIT: segment size (bytes if G=0, 4 KB pages if G=1)

P: present

S: type (0=system; 1=code or data)

TYPE: segment type (data, code, stack, read/write, ...)
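The field positions listed above can be checked with a small decoder. This is a sketch under the assumption that the descriptor is given as one 64-bit integer; `decode_descriptor` is a hypothetical helper, and `0x00CF9A000000FFFF` is the value commonly used for a flat 4 GB ring-0 code segment.

```python
def decode_descriptor(raw: int) -> dict:
    """Unpack a 64-bit code/data segment descriptor into its fields."""
    low = raw & 0xFFFFFFFF           # bits 31-0
    high = (raw >> 32) & 0xFFFFFFFF  # bits 63-32
    base = (low >> 16) | ((high & 0xFF) << 16) | (high & 0xFF000000)
    limit = (low & 0xFFFF) | (high & 0x000F0000)
    g = (high >> 23) & 1
    return {
        "base": base,
        "limit": limit,
        # G=1 counts 4 KB pages, so the last valid byte offset is scaled
        "last_offset": ((limit << 12) | 0xFFF) if g else limit,
        "type": (high >> 8) & 0xF,
        "s": (high >> 12) & 1,
        "dpl": (high >> 13) & 0b11,
        "p": (high >> 15) & 1,
        "avl": (high >> 20) & 1,
        "l": (high >> 21) & 1,
        "db": (high >> 22) & 1,
        "g": g,
    }


d = decode_descriptor(0x00CF9A000000FFFF)
print(hex(d["base"]), hex(d["last_offset"]), d["dpl"])
```

For the flat descriptor, the 20-bit limit 0xFFFFF with page granularity yields a last valid offset of 0xFFFFFFFF, which is why one such segment covers the whole 4 GB space.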

IA-32: Segment Descriptors

• Type field

Data segment descriptor: 0 E W A
  - E: Expansion direction
    - 0: valid addresses from 0 to Limit
    - 1: valid addresses from Limit to maximum (for stacks)
  - W: Writable
  - A: Accessed

Code segment descriptor: 1 C R A
  - C: Conforming
  - R: Readable
  - A: Accessed

IA-32: Segment Descriptors

• System descriptors
  - System segment descriptors
    - LDT: Local Descriptor-Table segment descriptor
    - TSS: Task-state segment descriptor
  - Function pointers
    - Call-gate descriptor
    - Interrupt-gate descriptor
    - Trap-gate descriptor
  - Indirect (protected) reference to a TSS
    - Task-gate descriptor

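IA-32: Segment Tables

[Figure: descriptor tables and the structures they reference]

• GDTR locates the Global Descriptor Table, whose entries include seg. descriptors (global code/data segments), an LDT descriptor, a TSS descriptor, and a Task Gate
• LDTR locates the Local Descriptor Table, whose seg. descriptors refer to local code/data segments
• TR locates the Task State Struct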

IA-32: Segment Tables

[Figure: IDTR locates the Interrupt Descriptor Table, whose entries are Task Gates, Interrupt Gates, and Trap Gates]

• Task Gate: to a TSS selector in GDT or LDT
• Interrupt Gate: “pointer” and access rights
• Trap Gate: “pointer” and access rights

Interrupt Gate:

Bits  | 63-48        | 47 | 46-45 | 44 | 43-40          | 39-37 | 36-32    | 31-16            | 15-0
Field | Offset 31:16 | P  | DPL   | 0  | D 1 1 0 (type) | 0 0 0 | reserved | Segment selector | Offset 15:0

Trap Gate:

Bits  | 63-48        | 47 | 46-45 | 44 | 43-40          | 39-37 | 36-32    | 31-16            | 15-0
Field | Offset 31:16 | P  | DPL   | 0  | D 1 1 1 (type) | 0 0 0 | reserved | Segment selector | Offset 15:0

IA-32: Segment Tables

• First entry in GDT is not used
  - accessing the #0 segment → General Protection error
• GDT descriptors
  - Code/Data segment descriptors
  - System descriptors
    - LDT
    - TSS
    - Task/Call/Interrupt/Trap gates

IA-32: Segment Tables

• LDT descriptors
  - Code/Data segment descriptors
  - System descriptors
    - Task/Call/Interrupt/Trap gates
• IDT descriptors
  - System descriptors
    - Task/Interrupt/Trap gates

Segment Descriptor Tables location

GDTR and IDTR format:

Bits  | 47-16               | 15-0
Field | Linear Base Address | Table Limit

LDTR and TR format:

Field | Segment Selector | Linear Base Address | Table Limit
Bits  | 15-0             | 47-16               | 15-0

LDTR and TR are reloaded whenever a task switch happens

GDTR: Global Descriptor Table Register

IDTR: Interrupt Descriptor Table Register

LDTR: Local Descriptor Table Register

TR: Task Register

IDT: table of gate descriptors (“access entries” for interrupt management)

TR: info about current TSS (task state segment)

TSS: data structure with task information
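As an illustration of the GDTR/IDTR format, the sketch below parses the 6-byte little-endian image that SGDT or SIDT stores in 32-bit mode: the 16-bit table limit first, then the 32-bit linear base address. The function name `parse_table_register` and the sample values are assumptions for the example.

```python
import struct


def parse_table_register(image: bytes):
    """Return (base, limit, entry_count) from a 6-byte GDTR/IDTR image."""
    if len(image) != 6:
        raise ValueError("expected a 6-byte table register image")
    limit, base = struct.unpack("<HI", image)  # limit: bits 15-0, base: bits 47-16
    # The limit is the offset of the last valid byte; descriptors are 8 bytes.
    return base, limit, (limit + 1) // 8


image = struct.pack("<HI", 0x007F, 0x00100000)  # hypothetical 16-entry table
print(parse_table_register(image))
```

A limit of 0x7F therefore describes a table of 128 bytes, that is, 16 descriptors.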

Paging

• Enabled with control bits
  - CR0.PG=1 and CR0.PE=1
• 3 paging modes
  - 32-bit paging
    - Linear address: 32 bits
    - Physical address: 32 bits
      - Up to 40 bits when 4 MB pages are used and PSE-36 is supported
    - Page size: 4 KB or 4 MB
    - PSE-36: page-size extensions with 40-bit physical-address extension
  - PAE paging
    - Linear address: 32 bits
    - Physical address: up to 52 bits
    - Page size: 4 KB or 2 MB
  - IA-32e paging
    - Linear address: 48 bits
    - Physical address: up to 52 bits
    - Page size: 4 KB, 2 MB, or 1 GB

Paging

Control Register 3 (CR3) format:

Bits  | 31-12                  | 11-5    | 4   | 3   | 2-0
Field | Page Directory address | ignored | PCD | PWT | ignored

PCD: Page Cache Disable

PWT: Page Write Through

Caching attributes:

if PAT is not supported: use bits PCD and PWT in CR3

if PAT is supported: use bits PCD and PWT in PDEs and PTEs

PAT: Page Attribute Table (Pentium III and later)

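32-bit paging

[Figure: address translation through Page Directory and Page Table]

• Linear address: bits 31-22 (10 bits) index the Page Directory, bits 21-12 (10 bits) index the Page Table, bits 11-0 (12 bits) are the offset inside the page
• Physical address of page directory entry (PDE): table address from CR3 (20 bits), directory index (10 bits), 00 (2 bits)
• Physical address of page table entry (PTE): address from the PDE (20 bits), table index (10 bits), 00 (2 bits)
• Physical address: address from the PTE (20 bits), offset (12 bits)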

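32-bit paging

[Figure: address translation through the Page Directory only (large pages, no Page Table)]

• Linear address: bits 31-22 (10 bits) index the Page Directory, bits 21-0 (22 bits) are the offset inside the page
• Physical address of page directory entry (PDE): table address from CR3 (20 bits), directory index (10 bits), 00 (2 bits)
• Physical address: address from the PDE (10 bits), offset (22 bits)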
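The two translation schemes for 32-bit paging can be sketched as a table walk over a simulated memory. Everything here is an assumption for illustration: physical memory is a Python dict from address to 32-bit entry, `translate` is a hypothetical name, present-bit and protection checks are skipped, and the PSE-36 extension is ignored. The walk uses bit 7 (PS) of the PDE to recognise a 4 MB page.

```python
def translate(cr3: int, linear: int, memory: dict) -> int:
    """Walk the 32-bit paging structures and return the physical address."""
    directory = (linear >> 22) & 0x3FF   # bits 31-22
    table = (linear >> 12) & 0x3FF       # bits 21-12
    offset = linear & 0xFFF              # bits 11-0

    pde_addr = (cr3 & 0xFFFFF000) | (directory << 2)  # entries are 4 bytes
    pde = memory[pde_addr]
    if pde & (1 << 7):                   # PS bit set: 4 MB page, no page table
        return (pde & 0xFFC00000) | (linear & 0x003FFFFF)

    pte_addr = (pde & 0xFFFFF000) | (table << 2)
    pte = memory[pte_addr]
    return (pte & 0xFFFFF000) | offset


# Hypothetical layout: directory at 0x1000, one page table at 0x2000.
memory = {
    0x1004: 0x00002001,  # PDE 1 -> page table at 0x2000
    0x200C: 0x00ABC001,  # PTE 3 -> 4 KB page at 0x00ABC000
    0x1008: 0x04000081,  # PDE 2 -> 4 MB page at 0x04000000 (PS=1)
}
print(hex(translate(0x1000, 0x00403025, memory)))  # 0xabc025
print(hex(translate(0x1000, 0x00812345, memory)))  # 0x4012345
```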

Protection levels

Protection Rings

• Level 0: Operating System Kernel
• Level 1, Level 2: Operating System Services
• Level 3: Applications

Protection levels

• CPL: Current privilege level
  - stored in CS
  - privilege level of the current task
• DPL: Descriptor privilege level
  - stored in segment descriptor
  - Data segments, Call gates, TSS
    - maximum privilege level required to access a segment
  - Nonconforming code segment
    - exact privilege level required to access a segment
  - Conforming code segment
    - minimum privilege level required to access a segment
• RPL: Requested privilege level
  - stored in segment selector
  - overrides CPL if higher

RPL usage

[Figure: unprivileged code (CPL 3) uses a selector to reach a memory segment whose descriptor in the segment table has DPL 0]

• application tries to access a protected segment
• access denied: CPL > DPL

RPL usage

[Figure: unprivileged code (CPL 3) passes a selector through a syscall to privileged code (CPL 0), which accesses the memory segment (DPL 0)]

• application passes a pointer to a protected segment to kernel code
• access allowed: CPL = DPL
• Wrong: protection is violated

RPL usage

[Figure: same syscall; the selector leaves unprivileged code with RPL 0 and is used by privileged code (CPL 0) with RPL 3]

• application passes a pointer to a protected segment to kernel code, but operating system adjusts RPL to match the caller privilege level
• access denied: RPL > DPL
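The three scenarios above can be replayed with a small model of the data-segment check, where access is allowed only if both CPL and RPL are numerically no greater than DPL. The names `data_access_allowed` and `adjust_rpl` are hypothetical; `adjust_rpl` mimics what the ARPL instruction does (raise the RPL of a selector to at least the RPL of the caller's selector), and the selector values are made up.

```python
def data_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """Data segment check: the less privileged of CPL and RPL must pass DPL."""
    return max(cpl, rpl) <= dpl


def adjust_rpl(selector: int, caller_selector: int) -> int:
    """Raise the selector's RPL to at least the RPL of the caller's selector."""
    rpl = max(selector & 0b11, caller_selector & 0b11)
    return (selector & ~0b11) | rpl


# 1) Application (CPL 3) touches a DPL 0 segment directly: denied.
print(data_access_allowed(cpl=3, rpl=3, dpl=0))                 # False
# 2) Kernel (CPL 0) trusts a selector forged with RPL 0: allowed (the hole).
print(data_access_allowed(cpl=0, rpl=0, dpl=0))                 # True
# 3) Kernel first adjusts the RPL to the caller's level: denied again.
fixed = adjust_rpl(0x10, caller_selector=0x1B)
print(hex(fixed), data_access_allowed(cpl=0, rpl=fixed & 0b11, dpl=0))
```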

Exception Handling

• Exceptions
• Interrupts
  - External interrupts
    - Maskable interrupts
      - APIC (Advanced Programmable Interrupt Controller) receives interrupts (signals or messages) and maps them to interrupt vectors (integer 0-255)
    - NMI
      - NMI pin or APIC NMI message
  - Software-generated interrupts
    - Instruction: INT x (x: integer 0-255)
    - Actually a software exception
    - two-byte instruction: INT n: 0xCD n

Exception Handling

• Exceptions
  - Program-Error Exceptions
    - Faults
      - task can be restarted after exception handling
      - return address: faulting instruction
    - Traps
      - task can be restarted after exception handling
      - return address: instruction that follows the trapping one
    - Aborts
      - precise information missing
      - severe errors (hw errors, wrong values in system tables, ...)
      - program resuming not possible

Exception Handling

• Exceptions
  - Software-Generated Exceptions
    - Instructions: BOUND, INTO, INT3
    - single-byte instructions: INTO: 0xCE, INT3: 0xCC
  - Machine-Check Exceptions
    - Internal machine error, Bus error, Implementation dependent checks

Exception Handling

[Figure: IDTR locates the IDT (entries 0-255); entry i is an Interrupt or Trap Gate that selects a segment descriptor in the GDT or LDT; segment base + gate offset gives the exception procedure inside a conforming code segment]

• Exception procedure is executed at the current privilege level (CPL)
• DPL of gate is checked only for INT, INT3, or INTO instructions

Exception Handling

[Figure: same path through IDT and GDT or LDT, with the exception procedure inside a nonconforming code segment]

• Exception procedure is executed at the privilege level indicated in the segment descriptor (CPL takes the DPL value), with a stack switch if CPLold > CPLnew
• DPL of gate is checked only for INT, INT3, or INTO instructions

Exception Handling

[Figure: IDTR locates the IDT (entries 0-255); entry i is a Task Gate that selects a TSS descriptor in the GDT or LDT, which locates the Task State Struct]

• Exception causes a task switch
• DPL of gate is checked only for INT, INT3, or INTO instructions

Exception Handling

• Software generated interrupts/exceptions
  - INT n
    - push flags and return address on stack
      - return address: instruction following "INT n"
    - jump to interrupt vector n (use entry n of IDT)
  - INT3
    - similar to INT n
    - jump to interrupt vector 3
  - INTO
    - similar to INT n
    - generate exception if EFLAGS.OF=1
    - jump to interrupt vector 4

Exception Handling

• External interrupts
  - Maskable interrupts
    - APIC maps interrupt to interrupt vector x
    - push flags and return address on stack
      - return address: instruction following the interrupted one
      - interrupted instruction is completed
    - jump to interrupt vector x (use entry x of IDT)
    - HW should use vectors 32-255
  - NMI
    - similar to maskable interrupts
    - use interrupt vector 2

Exception Handling

• Faults
  - faulting instruction is aborted
  - push flags and return address on stack
    - return address: faulting instruction
  - Divide Error
    - a DIV or IDIV instruction attempted to divide by 0
    - jump to interrupt vector 0
  - Debug Exception
    - HW breakpoint on I-fetch
    - jump to interrupt vector 1

Exception Handling

• Faults
  - BOUND
    - a BOUND instruction revealed an index out of array bounds
    - jump to interrupt vector 5
  - Invalid Opcode
    - invalid instruction or operand
    - jump to interrupt vector 6
  - Device not available
    - access to x87 or SSE/SSE2/SSE3 unit while not ready
    - jump to interrupt vector 7

Exception Handling

• Faults
  - Invalid TSS
    - push an error code on stack (segment selector index)
    - jump to interrupt vector 10
  - Segment not present
    - push an error code on stack (segment selector index)
    - jump to interrupt vector 11
  - Stack fault
    - push an error code on stack (segment selector index or 0)
    - jump to interrupt vector 12

Exception Handling

• Faults
  - General protection
    - push an error code on stack (segment selector index or 0)
    - jump to interrupt vector 13
  - Page fault
    - push an error code on stack; the faulting address is stored in CR2
    - jump to interrupt vector 14
  - x87 floating point error
    - x87 stack overflow or IEEE-754 trap
    - jump to interrupt vector 16

Exception Handling

• Faults
  - Alignment check
    - alignment check enabled and unaligned memory reference
    - push an error code (0) on stack
    - jump to interrupt vector 17
  - SIMD floating point error
    - Invalid operation
    - Divide-by-zero, Denormal operand, Numeric overflow, Numeric underflow, Inexact result
    - jump to interrupt vector 19

Exception Handling

• Traps
  - Trapping instruction is executed
  - Push flags and return address on stack
    - return address: instruction that follows the trap
  - Debug Exception
    - HW breakpoint on memory or IO read or write
    - Single step debugging enabled
    - Task switch debug enabled
    - jump to interrupt vector 1

Exception Handling

• Traps
  - Breakpoint Exception
    - an INT3 instruction is executed
    - jump to interrupt vector 3
  - Overflow Exception
    - an INTO instruction is executed and EFLAGS.OF=1
    - jump to interrupt vector 4

Exception Handling

• Aborts
  - Push flags and return address on stack
    - return address: undefined
  - Double Fault Exception
    - An exception occurred inside an exception handler and serial exception management is not feasible
      - e.g., page fault when handling a page fault
    - push 0 (error code) on stack
    - jump to interrupt vector 8
  - Machine Check Exception
    - jump to interrupt vector 18

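IA-32: data types

• Integer
  - byte
  - word (16-bit)
  - dword (32-bit) (double word)
  - qword (64-bit) (quad word)
  - dqword (128-bit) (double quad word)
• BCD
• Floating point
  - single precision (32-bit: 1 sign + 8 exponent + 23 fraction, + 1 implicit integer bit)
  - double precision (64-bit: 1 sign + 11 exponent + 52 fraction, + 1 implicit integer bit)
  - double extended precision (80-bit: 1 sign + 15 exponent + 64 significand, integer bit explicit)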
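The single- and double-precision layouts can be inspected from Python by reinterpreting the bytes of a float as an integer. The helper names `split_single` and `split_double` are assumptions for this sketch; the 80-bit extended format has no direct counterpart in the `struct` module and is not covered.

```python
import struct


def split_single(x: float):
    """Return (sign, biased exponent, fraction) of a 32-bit float."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def split_double(x: float):
    """Return (sign, biased exponent, fraction) of a 64-bit float."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


print(split_single(-2.5))  # (1, 128, 2097152)
print(split_double(1.0))   # (0, 1023, 0)
```

For -2.5 = -1.25 x 2^1, the biased exponent is 127 + 1 = 128 and the stored fraction 0x200000 encodes the .25, while the leading 1 of the significand is the implicit bit that is not stored.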

IA-32: memory addressing

address = base + (index * scale) + displacement

• base: EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI
• index: EAX, EBX, ECX, EDX, EBP, ESI, EDI (ESP cannot be used as an index)
• scale: 1, 2, 4, 8
• displacement: none, 8-bit, 16-bit, 32-bit

e.g.: MOV EAX, [EBX+4*EDI+0xA000]
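The address formula can be evaluated for the example instruction with a short sketch. The function `effective_address`, the dictionary of register values, and the values themselves are assumptions for illustration; the result is wrapped to 32 bits.

```python
def effective_address(regs: dict, base=None, index=None, scale=1, disp=0) -> int:
    """Compute base + index * scale + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    if index == "ESP":
        raise ValueError("ESP cannot be used as an index register")
    address = disp
    if base is not None:
        address += regs[base]
    if index is not None:
        address += regs[index] * scale
    return address & 0xFFFFFFFF


# MOV EAX, [EBX+4*EDI+0xA000] with assumed EBX = 0x1000 and EDI = 3
regs = {"EBX": 0x1000, "EDI": 3}
print(hex(effective_address(regs, base="EBX", index="EDI", scale=4, disp=0xA000)))  # 0xb00c
```

This form maps naturally onto array accesses: the displacement is the array's address, the index register holds the element number, and the scale is the element size.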

IA-32: instruction format

Field        | prefix | opcode | ModR/M | SIB | displacement | immediate
Size (bytes) | 0-4    | 1-3    | 0-1    | 0-1 | 0-4          | 0-4

Prefixes: lock and repeat, segment override, operand size, address size

ModR/M byte:

Bits  | 7-6 | 5-3        | 2-0
Field | mod | reg/opcode | R/M

SIB byte:

Bits  | 7-6   | 5-3   | 2-0
Field | Scale | Index | Base

32-bit addressing formats with the mod and R/M fields

R/M | mod=00   | mod=01           | mod=10            | mod=11
000 | [EAX]    | [EAX] + disp8    | [EAX] + disp32    | EAX/AX/AL/MM0/XMM0
001 | [ECX]    | [ECX] + disp8    | [ECX] + disp32    | ECX/CX/CL/MM1/XMM1
010 | [EDX]    | [EDX] + disp8    | [EDX] + disp32    | EDX/DX/DL/MM2/XMM2
011 | [EBX]    | [EBX] + disp8    | [EBX] + disp32    | EBX/BX/BL/MM3/XMM3
100 | [--][--] | [--][--] + disp8 | [--][--] + disp32 | ESP/SP/AH/MM4/XMM4
101 | disp32   | [EBP] + disp8    | [EBP] + disp32    | EBP/BP/CH/MM5/XMM5
110 | [ESI]    | [ESI] + disp8    | [ESI] + disp32    | ESI/SI/DH/MM6/XMM6
111 | [EDI]    | [EDI] + disp8    | [EDI] + disp32    | EDI/DI/BH/MM7/XMM7

[--][--]: SIB

reg/opcode: 2nd operand (if any)

000: EAX/AX/AL/MM0/XMM0

001: ECX/CX/CL/MM1/XMM1

010: EDX/DX/DL/MM2/XMM2

011: EBX/BX/BL/MM3/XMM3

100: ESP/SP/AH/MM4/XMM4

101: EBP/BP/CH/MM5/XMM5

110: ESI/SI/DH/MM6/XMM6

111: EDI/DI/BH/MM7/XMM7
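A small sketch of how the ModR/M and SIB bytes split into fields under 32-bit addressing. The helper names are hypothetical. The sample bytes 0x84 and 0xBB are the ModR/M and SIB bytes of the usual encoding of MOV EAX, [EBX+4*EDI+0xA000] (8B 84 BB 00 A0 00 00).

```python
def decode_modrm(byte: int):
    """Return (mod, reg/opcode, R/M) from a ModR/M byte."""
    return byte >> 6, (byte >> 3) & 0b111, byte & 0b111


def decode_sib(byte: int):
    """Return (scale factor, index, base) from a SIB byte."""
    return 1 << (byte >> 6), (byte >> 3) & 0b111, byte & 0b111


def has_sib(mod: int, rm: int) -> bool:
    """In 32-bit addressing, R/M = 100 with mod != 11 announces a SIB byte."""
    return mod != 0b11 and rm == 0b100


def displacement_size(mod: int, rm: int, sib_base=None) -> int:
    """Number of displacement bytes that follow ModR/M (and SIB, if any)."""
    if mod == 0b01:
        return 1
    if mod == 0b10:
        return 4
    if mod == 0b00 and (rm == 0b101 or (rm == 0b100 and sib_base == 0b101)):
        return 4  # disp32 with no base register
    return 0


mod, reg, rm = decode_modrm(0x84)
print(mod, reg, rm, has_sib(mod, rm))       # 2 0 4 True
print(decode_sib(0xBB))                     # (4, 7, 3): scale 4, EDI, EBX
print(displacement_size(mod, rm))           # 4
```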

IA-32: instruction format

Field        | prefix | opcode | ModR/M | SIB | displacement | immediate
Size (bytes) | 0-4    | 1-3    | 0-1    | 0-1 | 0-4          | 0-4

Prefixes: lock and repeat, segment override, operand size, address size

ModR/M byte:

Bits  | 7-6 | 5-3        | 2-0
Field | mod | reg/opcode | R/M

SIB byte:

Bits  | 7-6   | 5-3   | 2-0
Field | Scale | Index | Base

16-bit addressing formats with the mod and R/M fields

R/M | mod=00  | mod=01          | mod=10           | mod=11
000 | [BX+SI] | [BX+SI] + disp8 | [BX+SI] + disp16 | EAX/AX/AL/MM0/XMM0
001 | [BX+DI] | [BX+DI] + disp8 | [BX+DI] + disp16 | ECX/CX/CL/MM1/XMM1
010 | [BP+SI] | [BP+SI] + disp8 | [BP+SI] + disp16 | EDX/DX/DL/MM2/XMM2
011 | [BP+DI] | [BP+DI] + disp8 | [BP+DI] + disp16 | EBX/BX/BL/MM3/XMM3
100 | [SI]    | [SI] + disp8    | [SI] + disp16    | ESP/SP/AH/MM4/XMM4
101 | [DI]    | [DI] + disp8    | [DI] + disp16    | EBP/BP/CH/MM5/XMM5
110 | disp16  | [BP] + disp8    | [BP] + disp16    | ESI/SI/DH/MM6/XMM6
111 | [BX]    | [BX] + disp8    | [BX] + disp16    | EDI/DI/BH/MM7/XMM7

reg/opcode: 2nd operand (if any)

000: EAX/AX/AL/MM0/XMM0

001: ECX/CX/CL/MM1/XMM1

010: EDX/DX/DL/MM2/XMM2

011: EBX/BX/BL/MM3/XMM3

100: ESP/SP/AH/MM4/XMM4

101: EBP/BP/CH/MM5/XMM5

110: ESI/SI/DH/MM6/XMM6

111: EDI/DI/BH/MM7/XMM7

IA-32: instruction set

• General-purpose instructions
  - Basic data movement, arithmetic, logic, program flow, and string operations
  - Operate on data contained in memory, in the GP registers (EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP) and in the EFLAGS register
• FPU instructions
  - FP and BCD operations
• SIMD extensions
  - MMX instructions
    - operate on packed byte, word, doubleword, or quadword integer operands in memory, MMX registers, and/or in GP registers.
  - SSE instructions
    - operate on packed and scalar single-precision floating-point values located in XMM registers and/or memory.
  - SSE2 instructions
    - operate on packed double-precision floating-point operands and on packed byte, word, doubleword, and quadword operands located in the XMM registers.
  - SSE3 instructions

IA-32: instruction set

• FPU and SIMD state management instructions
• System instructions
  - processor's functions needed to support operating systems and protection
• 64-bit mode instructions

IA-32: General-purpose instructions

• Data Transfer Instructions

MOV Move data between general-purpose registers; move data between memory and general-purpose or segment registers; move immediates to general-purpose registers

CMOVE/CMOVZ Conditional move if equal/Conditional move if zero

CMOVNE/CMOVNZ Conditional move if not equal/Conditional move if not zero

CMOVA/CMOVNBE Conditional move if above/Conditional move if not below or equal

CMOVAE/CMOVNB Conditional move if above or equal/Conditional move if not below

CMOVB/CMOVNAE Conditional move if below/Conditional move if not above or equal

CMOVBE/CMOVNA Conditional move if below or equal/Conditional move if not above

CMOVG/CMOVNLE Conditional move if greater/Conditional move if not less or equal

CMOVGE/CMOVNL Conditional move if greater or equal/Conditional move if not less

CMOVL/CMOVNGE Conditional move if less/Conditional move if not greater or equal

CMOVLE/CMOVNG Conditional move if less or equal/Conditional move if not greater

CMOVC Conditional move if carry

CMOVNC Conditional move if not carry

CMOVO Conditional move if overflow

CMOVNO Conditional move if not overflow

CMOVS Conditional move if sign (negative)

CMOVNS Conditional move if not sign (non-negative)

CMOVP/CMOVPE Conditional move if parity/Conditional move if parity even

CMOVNP/CMOVPO Conditional move if not parity/Conditional move if parity odd

IA-32: General-purpose instructions

• Data Transfer Instructions

XCHG Exchange

BSWAP Byte swap

XADD Exchange and add

CMPXCHG Compare and exchange

CMPXCHG8B Compare and exchange 8 bytes

PUSH Push onto stack

POP Pop off of stack

PUSHA/PUSHAD Push general-purpose registers onto stack

POPA/POPAD Pop general-purpose registers from stack

CWD/CDQ Convert word to doubleword/Convert doubleword to quadword

CBW/CWDE Convert byte to word/Convert word to doubleword in EAX register

MOVSX Move and sign extend

MOVZX Move and zero extend

IA-32: General-purpose instructions

• Binary Arithmetic Instructions

ADD Integer add

ADC Add with carry

SUB Subtract

SBB Subtract with borrow

IMUL Signed multiply

MUL Unsigned multiply

IDIV Signed divide

DIV Unsigned divide

INC Increment

DEC Decrement

NEG Negate

CMP Compare

IA-32: General-purpose instructions

• Decimal Arithmetic Instructions

DAA Decimal adjust after addition

DAS Decimal adjust after subtraction

AAA ASCII adjust after addition

AAS ASCII adjust after subtraction

AAM ASCII adjust after multiplication

AAD ASCII adjust before division

• Logical Instructions

AND Perform bitwise logical AND

OR Perform bitwise logical OR

XOR Perform bitwise logical exclusive OR

NOT Perform bitwise logical NOT

IA-32: General-purpose instructions

• Shift and Rotate Instructions

SAR Shift arithmetic right

SHR Shift logical right

SAL/SHL Shift arithmetic left/Shift logical left

SHRD Shift right double

SHLD Shift left double

ROR Rotate right

ROL Rotate left

RCR Rotate through carry right

RCL Rotate through carry left

IA-32: General-purpose instructions

• Bit and Byte Instructions

BT Bit test

BTS Bit test and set

BTR Bit test and reset

BTC Bit test and complement

BSF Bit scan forward

BSR Bit scan reverse

SETE/SETZ Set byte if equal/Set byte if zero

SETNE/SETNZ Set byte if not equal/Set byte if not zero

SETA/SETNBE Set byte if above/Set byte if not below or equal

SETAE/SETNB/SETNC Set byte if above or equal/Set byte if not below/Set byte if not carry

SETB/SETNAE/SETC Set byte if below/Set byte if not above or equal/Set byte if carry

SETBE/SETNA Set byte if below or equal/Set byte if not above

SETG/SETNLE Set byte if greater/Set byte if not less or equal

SETGE/SETNL Set byte if greater or equal/Set byte if not less

SETL/SETNGE Set byte if less/Set byte if not greater or equal

SETLE/SETNG Set byte if less or equal/Set byte if not greater

IA-32: General-purpose instructions

• Bit and Byte Instructions

SETS Set byte if sign (negative)

SETNS Set byte if not sign (non-negative)

SETO Set byte if overflow

SETNO Set byte if not overflow

SETPE/SETP Set byte if parity even/Set byte if parity

SETPO/ SETNP Set byte if parity odd/Set byte if not parity

TEST Logical compare

IA-32: General-purpose instructions

• Control Transfer Instructions

JMP Jump

JE/JZ Jump if equal/Jump if zero

JNE/JNZ Jump if not equal/Jump if not zero

JA/JNBE Jump if above/Jump if not below or equal

JAE/JNB Jump if above or equal/Jump if not below

JB/JNAE Jump if below/Jump if not above or equal

JBE/JNA Jump if below or equal/Jump if not above

JG/JNLE Jump if greater/Jump if not less or equal

JGE/JNL Jump if greater or equal/Jump if not less

JL/JNGE Jump if less/Jump if not greater or equal

JLE/JNG Jump if less or equal/Jump if not greater

JC Jump if carry

JNC Jump if not carry

JO Jump if overflow

JNO Jump if not overflow

IA-32: General-purpose instructions

• Control Transfer Instructions

JS Jump if sign (negative)

JNS Jump if not sign (non-negative)

JPO/JNP Jump if parity odd/Jump if not parity

JPE/JP Jump if parity even/Jump if parity

JCXZ/JECXZ Jump register CX zero/Jump register ECX zero

LOOP Loop with ECX counter

LOOPZ/LOOPE Loop with ECX and zero/Loop with ECX and equal

LOOPNZ/LOOPNE Loop with ECX and not zero/Loop with ECX and not equal

CALL Call procedure

RET Return

IRET Return from interrupt

INT Software interrupt

INTO Interrupt on overflow

BOUND Detect value out of range

ENTER High-level procedure entry

LEAVE High-level procedure exit

IA-32: General-purpose instructions

• String Instructions

MOVS/MOVSB Move string/Move byte string

MOVS/MOVSW Move string/Move word string

MOVS/MOVSD Move string/Move doubleword string

CMPS/CMPSB Compare string/Compare byte string

CMPS/CMPSW Compare string/Compare word string

CMPS/CMPSD Compare string/Compare doubleword string

SCAS/SCASB Scan string/Scan byte string

SCAS/SCASW Scan string/Scan word string

SCAS/SCASD Scan string/Scan doubleword string

LODS/LODSB Load string/Load byte string

LODS/LODSW Load string/Load word string

LODS/LODSD Load string/Load doubleword string

STOS/STOSB Store string/Store byte string

STOS/STOSW Store string/Store word string

STOS/STOSD Store string/Store doubleword string

REP Repeat while ECX not zero

REPE/REPZ Repeat while equal/Repeat while zero

REPNE/REPNZ Repeat while not equal/Repeat while not zero

IA-32: General-purpose instructions

• I/O Instructions

IN Read from a port

OUT Write to a port

INS/INSB Input string from port/Input byte string from port

INS/INSW Input string from port/Input word string from port

INS/INSD Input string from port/Input doubleword string from port

OUTS/OUTSB Output string to port/Output byte string to port

OUTS/OUTSW Output string to port/Output word string to port

OUTS/OUTSD Output string to port/Output doubleword string to port

• Segment Register Instructions

LDS Load far pointer using DS

LES Load far pointer using ES

LFS Load far pointer using FS

LGS Load far pointer using GS

LSS Load far pointer using SS

IA-32: General-purpose instructions

• Flag Control (EFLAG) Instructions

STC Set carry flag

CLC Clear the carry flag

CMC Complement the carry flag

CLD Clear the direction flag

STD Set direction flag

LAHF Load flags into AH

SAHF Store AH register into flags

PUSHF/PUSHFD Push EFLAGS onto stack

POPF/POPFD Pop EFLAGS from stack

STI Set interrupt flag

CLI Clear the interrupt flag

• Miscellaneous Instructions

LEA Load effective address

NOP No operation

UD2 Undefined instruction

XLAT/XLATB Table lookup translation

CPUID Processor Identification

IA-32: System instructions

LGDT Load global descriptor table (GDT) register

SGDT Store global descriptor table (GDT) register

LLDT Load local descriptor table (LDT) register

SLDT Store local descriptor table (LDT) register

LTR Load task register

STR Store task register

LIDT Load interrupt descriptor table (IDT) register

SIDT Store interrupt descriptor table (IDT) register

MOV Load and store control registers

LMSW Load machine status word

SMSW Store machine status word

CLTS Clear the task-switched flag

ARPL Adjust requested privilege level

LAR Load access rights

LSL Load segment limit

VERR Verify segment for reading

VERW Verify segment for writing

MOV Load and store debug registers

INVD Invalidate cache, no writeback

WBINVD Invalidate cache, with writeback

IA-32: System instructions

INVLPG Invalidate TLB Entry

LOCK (prefix) Lock Bus

HLT Halt processor

RSM Return from system management mode (SMM)

RDMSR Read model-specific register

WRMSR Write model-specific register

RDPMC Read performance monitoring counters

RDTSC Read time stamp counter

SYSENTER Fast System Call, transfers to a flat protected mode kernel at CPL = 0

SYSEXIT Fast System Call, transfers to flat protected mode user code at CPL = 3

IA-32: info

• IA-32 Intel Architecture Software Developer's Manual
  - Volume 1: Basic Architecture
  - Volume 2A: Instruction Set Reference, A-M
  - Volume 2B: Instruction Set Reference, N-Z
  - Volume 3: System Programming Guide