A Historical Background


Page 1: A Historical Background

BBM 3622- Microprocessors 1

A Historical Background

• The idea of calculating with a machine dates to before 500 B.C. when the Babylonians invented the abacus, the first mechanical calculator.

Page 2: A Historical Background

Blaise Pascal (1623-1662)

• The abacus was not improved until 1642, when Blaise Pascal invented a calculator constructed of gears and wheels.

Page 3: A Historical Background

Charles Babbage

• One early pioneer of mechanical computing machinery was Charles Babbage, who set out in 1823 to produce a programmable calculating machine. He designed the “Analytical Engine”, a mechanical computer that stored 1000 20-digit decimal numbers and a variable program that could modify the function of the machine.

Page 4: A Historical Background

Herman Hollerith

• In 1889, Herman Hollerith developed the punched card for storing data and also developed a mechanical machine driven by one of the new electric motors. He founded the company that later became part of IBM Corporation.

Page 5: A Historical Background

Konrad Zuse

• The first modern electromechanical calculating machine was invented in 1941 by Konrad Zuse. It was the first programmable computer designed to solve complex engineering equations, and the first machine to work on the binary system, as opposed to the more familiar decimal system. His calculating computer was used in aircraft and missile design during World War II for the German war effort.

Page 6: A Historical Background


Binary System

Page 7: A Historical Background

Alan Turing

• The first truly electronic computer was placed into operation in 1943 to break secret German military codes. This first electronic computer system, which used vacuum tubes, was invented by Alan Turing, a British mathematician. Turing called the machine Colossus, most likely because of its size. A problem with Colossus was that although its design allowed it to break secret German military codes generated by the mechanical Enigma machine, it could not solve other problems. Colossus was not programmable: it was a fixed-program computer system.

Page 8: A Historical Background


Sample Turing Machine

Page 9: A Historical Background

ENIAC

• The first general-purpose programmable computer system was developed in 1946 and called ENIAC. The ENIAC was a huge machine (30 tons) and performed about 100 000 operations per second.

Page 10: A Historical Background

John von Neumann

• In 1945, von Neumann contributed a new understanding of how practical fast computers should be organized and built; these ideas, often referred to as the stored-program technique, became fundamental for future generations of high-speed digital computers and were universally adopted.

Page 11: A Historical Background


Transistor

Page 12: A Historical Background

INTEL 4004

• The transistor was developed in 1948, followed by the invention of the integrated circuit in 1958. In 1971, the first microprocessor, the Intel 4004, was developed. The 4004 was a 4-bit microprocessor with an instruction set of 45 instructions. It performed about 50 000 instructions per second.

Page 13: A Historical Background

Intel 8086

• In 1978, Intel released the 8086, a 16-bit microprocessor that performed 2.5 million instructions per second.
• Microprocessors of this kind are called CISC (Complex Instruction Set Computers) because of the number and complexity of their instructions.
• The popularity of the Intel family was ensured in 1981 when IBM Corp. decided to use 8088/8086 microprocessors in its personal computers.

Page 14: A Historical Background

Intel 8086/8088 Microprocessors

• The Intel 8086 and 8088 microprocessors are the basis of all IBM-PC compatible computers (8086 introduced in 1978, first IBM PC released in 1981)
• All Intel, AMD and other advanced microprocessors in this family are based on and are compatible with the original 8086/8
• At power-up and reset time, Pentiums, Athlons etc. all look like 8086 processors

Page 15: A Historical Background

Intel 8086/8088 Microprocessors

• The Intel 8086 is a 16-bit microprocessor
• 16-bit data registers
• 16- or 8-bit external data bus
• Some techniques to optimise the CPU performance when it is executing programs
• Segment:Offset memory model
• Little-endian data format
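To make "little-endian data format" concrete, here is a minimal Python sketch. The helper names `store_word` and `load_word` and the 4-byte `memory` array are illustrative assumptions, not part of the course material. It shows that a 16-bit word is stored with its low byte at the lower address.

```python
import struct

def store_word(memory: bytearray, address: int, value: int) -> None:
    """Store a 16-bit value the way an 8086 would: low byte first."""
    # "<" selects little-endian byte order, "H" an unsigned 16-bit integer.
    memory[address:address + 2] = struct.pack("<H", value)

def load_word(memory: bytearray, address: int) -> int:
    """Read a 16-bit little-endian value back from memory."""
    return struct.unpack_from("<H", memory, address)[0]

memory = bytearray(4)          # hypothetical 4-byte memory, all zeros
store_word(memory, 0, 0x1234)
print(memory.hex(" "))         # low byte 34 is at address 0, high byte 12 at address 1
print(hex(load_word(memory, 0)))
```

The same rule applies to the offset and segment words that the 8086 stores in memory, for example in interrupt vectors.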

Page 16: A Historical Background

8086/8088 (1)

• The original IBM PC used the 8088 microprocessor
• The 8088 is similar to the 8086 microprocessor, but it has an external 8-bit bus and only a 4-deep queue
• This was done for cost reduction reasons
• We can consider the 8086 and 8088 together
• PC clones often used the 8086 for better performance
• The 8-bit bus reduces performance, but meant cheaper computers

Page 17: A Historical Background

8086/8088 (2)

• Remember the Fetch-Decode-Execute cycle?
• Fetching from EXTERNAL MEMORY is SLOW
• The 8086/8 used an instruction queue to speed up performance
• While the processor is decoding and executing an instruction, its bus interface can be reading new instructions, since at that time the bus is not actually in use

Page 18: A Historical Background

8086/8088 Functional Units

[Figure: the 8086/8088 MPU drawn as two blocks, the Execution Unit (EU) and the Bus Interface Unit (BIU); the BIU fetches opcodes, reads operands and writes data]

Page 19: A Historical Background

8086/8088 (3)

• The 8086/8088 consists of two internal units
• The execution unit (EU) executes the instructions
• The bus interface unit (BIU) fetches instructions, reads operands and writes results
• The 8086 has a 6-byte prefetch queue
• The 8088 has a 4-byte prefetch queue

Page 20: A Historical Background

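8086/8088 Internal Organisation

[Figure: internal block diagram of the 8088, divided into EU and BIU]

• EU: general registers AH/AL, BH/BL, CH/CL, DH/DL, SP, BP, DI and SI; temporary registers; ALU; flags; EU control
• BIU: segment registers CS, DS, SS and ES; IP; internal communications registers; summation block that forms the 20-bit address bus; data bus; bus control; instruction queue (4 bytes, numbered 1 to 4); connection to the 8088 bus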

Page 21: A Historical Background

BIU Elements

• Instruction Queue: the next instructions or data can be fetched from memory while the processor is executing the current instruction
• The memory interface is slower than the processor execution time, so this speeds up overall performance
• Segment Registers: CS, DS, SS and ES are 16-bit registers
• Used with the 16-bit base registers to generate the 20-bit address
• Allow the 8086/8088 to address 1 MB of memory
• Changed under program control to point to different segments as a program executes
• Instruction Pointer (IP) contains the offset address of the next instruction, the distance in bytes from the address given by the current CS register

Page 22: A Historical Background

8086/8088 20-bit Addresses

[Figure: forming a 20-bit physical address from CS and IP]

• 16-bit segment base address (CS) with 0000 (four binary zeros) appended
• plus the 16-bit offset address (IP)
• gives the 20-bit physical address

Page 23: A Historical Background

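Exercise: 20-bit Addressing

[Figure: memory map starting at 00000h, showing the range of the code segment]

• CS=123Ah: the code segment runs from 123A0h to 2239Fh (223A0h is the first address past it)
• IP=341Bh: offset of the next instruction within the segment
• Physical address of the next instruction: 123A0h + 341Bh = 157BBh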
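The address calculation can be sketched in a few lines of Python. The function name `physical_address` is an illustrative assumption. The final mask is also an assumption of this sketch: it models a 20-bit address bus on which sums above FFFFFh wrap around.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shifting left by 4 appends four binary zeros (multiplies by 16).
    # The mask keeps only the 20 bits that fit on the address bus.
    return ((segment << 4) + offset) & 0xFFFFF

cs, ip = 0x123A, 0x341B
print(f"next instruction: {physical_address(cs, ip):05X}h")      # 157BBh
print(f"segment start:    {physical_address(cs, 0x0000):05X}h")  # 123A0h
print(f"segment end:      {physical_address(cs, 0xFFFF):05X}h")  # 2239Fh
```

The three printed values match the addresses marked on the memory map for CS=123Ah and IP=341Bh.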

Page 24: A Historical Background

Exercise: 20-bit Addressing

1. CS contains 0A820h, IP contains 0CE24h. What is the resulting physical address?

2. CS contains 0B500h, IP contains 0024h. What is the resulting physical address?

Page 25: A Historical Background

Segment Registers

• The use of the segment registers essentially divides the memory space into overlapping segments, with each segment being 64K bytes long and beginning at an address that is divisible by 16.
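Because segments start every 16 bytes but are 64K bytes long, many different segment values cover the same physical address. The sketch below, with the hypothetical helper `segments_covering`, lists them. It assumes no wrap-around above FFFFFh, which is a simplification.

```python
def segments_covering(physical: int) -> range:
    """Segment values whose 64K-byte window contains the physical address.

    Simplifying assumption: wrap-around above FFFFFh is ignored.
    """
    if not 0 <= physical <= 0xFFFFF:
        raise ValueError("physical address must fit in 20 bits")
    highest = physical >> 4                            # segment starting at or just below the address
    lowest = max(0, (physical - 0xFFFF + 15) >> 4)     # first segment whose last byte still reaches it
    return range(lowest, highest + 1)

target = 0x157BB
covering = segments_covering(target)
print(len(covering), "segments contain", f"{target:05X}h")
for segment in (0x123A, 0x157B):
    offset = target - (segment << 4)
    print(f"{segment:04X}h:{offset:04X}h")
```

For an address well inside memory, such as 157BBh, the count is 4096, so 123Ah:341Bh and 157Bh:000Bh are two of many names for the same byte.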

Page 26: A Historical Background

The advantages of using segment registers

• Allow the memory capacity to be 1 MByte even though the addresses associated with the individual instructions are only 16 bits wide.
• Allow the instruction, data or stack portion of a program to be more than 64K bytes long by allowing more than one code, data or stack segment.
• Facilitate the use of separate memory areas for a program, its data and the stack.
• Permit a program and/or its data to be put into different areas of memory each time the program is executed.

Page 27: A Historical Background

8086/8 In Circuit (1)

• 8086/8 microprocessors need support circuits in a microcomputer system
• The 8086/8 multiplex the address and data buses on the same pins
• This saves pins but at a price: demultiplexing logic is needed to build up separate address and data buses to interface with RAMs and ROMs

Page 28: A Historical Background

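[Figure: 8086 pin diagram, 40-pin package, pins 1 to 20 on one side and pins 40 to 21 on the other, with MAXIMUM MODE and MINIMUM MODE pin names]

• Pin 1 to pin 20: GND, AD14 down to AD0, NMI, INTR, CLK, GND
• Opposite side, same in both modes: Vcc, AD15, A16,S3, A17,S4, A18,S5, A19,S6, /BHE,S7, MN,/MX, /RD, /TEST, READY, RESET
• Opposite side, maximum mode names: /RQ,/GT0, /RQ,/GT1, /LOCK, /S2, /S1, /S0, QS0, QS1
• Opposite side, minimum mode names: HOLD, HLDA, /WR, IO/M, DT/R, /DEN, ALE, /INTA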

Page 29: A Historical Background

Pin Connections

AD15-AD0: (I/O-3)
The 8086 address/data bus lines compose the multiplexed address/data bus on the 8086. These lines contain address bits whenever ALE is logic 1. These pins enter a high-impedance state whenever a hold acknowledge occurs.

Page 30: A Historical Background

Pin Connections

A19/S6-A16/S3: (O-3)
The address/status bus bits are multiplexed to provide address signals A19-A16 and also status bits S6-S3. The pins also attain a high-impedance state during the hold acknowledge. S4 and S3 show which segment is accessed during the current bus cycle.

Page 31: A Historical Background


Pin Connections

S4 S3 Function

0 0 Extra segment

0 1 Stack segment

1 0 Code or no segment

1 1 Data segment

Page 32: A Historical Background

Pin Connections

/RD: (O-3)
Whenever the read signal is logic 0, the data bus is receptive to data from the memory or I/O devices connected to the system.

READY: (I)
This input is controlled to insert wait states into the timing of the microprocessor.
READY=0: the µP enters wait states and remains idle
READY=1: it has no effect on the operation of the µP

Page 33: A Historical Background

Pin Connections

/TEST: (I)
The test pin is an input that is tested by the WAIT instruction.

NMI: (I)
The non-maskable interrupt input is similar to INTR except that the NMI does not check to see if the IF flag bit is a logic 1. This interrupt input uses interrupt vector 2.

Page 34: A Historical Background

Pin Connections

RESET: (I)
The reset input causes the µP to reset itself if this pin is held high for a minimum of four clocking periods. It begins executing instructions at memory location FFFF0H and disables future interrupts by clearing the IF flag bit.

MN,/MX: (I)
Minimum/maximum mode pin select.

/BHE,S7: (O-3)
The BHE pin is used to enable the most significant data bus bits (D15-D8) during a read or write operation.

Page 35: A Historical Background

Minimum mode Pins

IO/M: (O-3)
This pin selects memory or I/O. It indicates whether the microprocessor address bus contains a memory address or an I/O port address.

/WR: (O-3)
This line indicates that the 8086 is outputting data to a memory or I/O device.

Page 36: A Historical Background

Minimum mode Pins

/INTA: (O-3)
The interrupt acknowledge signal is a response to the INTR input pin. This pin is normally used to gate the interrupt vector number onto the data bus in response to an interrupt request.

ALE: (O)
Address latch enable shows that the 8086 address/data bus contains address information. This address can be a memory address or an I/O port number.

Page 37: A Historical Background

Minimum mode Pins

DT/R: (O-3)
The data transmit/receive signal shows that the microprocessor data bus is transmitting or receiving data.

/DEN: (O-3)
Data bus enable activates external data bus buffers.

Page 38: A Historical Background

Minimum mode Pins

HOLD: (I)
The hold input requests a direct memory access (DMA). If the HOLD signal is logic 1, the microprocessor stops executing software and places its address, data and control bus in the high-impedance state.

HLDA: (O)
Hold acknowledge indicates that the 8086 microprocessor has entered the hold state.

Page 39: A Historical Background

Maximum mode Pins

Maximum mode, for use with external coprocessors or multiprocessing applications, is selected by connecting the MN,/MX pin to ground.

/S2, /S1 and /S0: (O)
The status bits indicate the function of the current bus cycle. These signals are normally decoded by the 8288 bus controller.

Page 40: A Historical Background

Status bits /S2, /S1 and /S0

S2 S1 S0  Function
0  0  0   Interrupt acknowledge
0  0  1   I/O read
0  1  0   I/O write
0  1  1   Halt
1  0  0   Opcode fetch
1  0  1   Memory read
1  1  0   Memory write
1  1  1   Passive
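The status table can be turned into a small lookup, which is roughly what the 8288 bus controller does in hardware. The function names and the dictionary below are illustrative assumptions. The command names (INTA#, IORC#, IOWC#, AIOWC#, MRDC#, MWTC#, AMWC#) are the 8288 outputs that appear in the maximum-mode circuit. The pairing of each status code with its command lines follows the usual 8288 behaviour and is not spelled out on the slide.

```python
# Status code (S2, S1, S0) -> (bus cycle type, 8288 command outputs that go active)
BUS_CYCLES = {
    0b000: ("Interrupt acknowledge", ("INTA#",)),
    0b001: ("I/O read", ("IORC#",)),
    0b010: ("I/O write", ("IOWC#", "AIOWC#")),
    0b011: ("Halt", ()),
    0b100: ("Opcode fetch", ("MRDC#",)),
    0b101: ("Memory read", ("MRDC#",)),
    0b110: ("Memory write", ("MWTC#", "AMWC#")),
    0b111: ("Passive", ()),
}

def decode_status(s2: int, s1: int, s0: int):
    """Return (description, command outputs) for one status combination."""
    code = (s2 << 2) | (s1 << 1) | s0
    return BUS_CYCLES[code]

def is_memory_cycle(s2: int, s1: int, s0: int) -> bool:
    """Memory cycles are the three codes that drive MRDC# or MWTC#."""
    _, commands = decode_status(s2, s1, s0)
    return any(name in ("MRDC#", "MWTC#") for name in commands)

for code in range(8):
    bits = (code >> 2) & 1, (code >> 1) & 1, code & 1
    print(bits, *decode_status(*bits))
```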

Page 41: A Historical Background

[Figure: 8088 and 8086 pin diagrams side by side, 40-pin packages, each with MAXIMUM MODE and MINIMUM MODE pin names]

• 8088, pin 1 to pin 20: GND, A14 down to A8, AD7 down to AD0, NMI, INTR, CLK, GND
• 8088, opposite side, same in both modes: Vcc, A15, A16,S3, A17,S4, A18,S5, A19,S6, MN,/MX, /RD, /TEST, READY, RESET
• 8088, maximum mode names: /RQ,/GT0, /RQ,/GT1, /LOCK, /S2, /S1, /S0, QS0, QS1, plus one pin held high
• 8088, minimum mode names: HOLD, HLDA, /WR, IO/M, DT/R, /DEN, ALE, /INTA, /SS0 (on the pin that is held high in maximum mode)
• 8086: the same pin diagram as shown earlier, with AD15 down to AD0 multiplexed and /BHE,S7 where the 8088 has /SS0

Page 42: A Historical Background

8086/8 In Circuit (2)

• In maximum mode the 8086/8 needs at least the following: 8288 Bus Controller, 8284A Clock Generator, 74HC373s and 74HC245s
• With the aid of these devices the 8086 begins to look like the ideal microprocessor we looked at earlier

Page 43: A Historical Background

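i8086 Circuit - Maximum Mode

[Figure: maximum-mode 8086 system. An 8284A clock generator (RDY input) supplies CLK, READY and RESET to the 8086 CPU. The CPU outputs S0#, S1#, S2# feed an 8288 bus controller, which produces MRDC#, MWTC#, AMWC#, IORC#, IOWC#, AIOWC# and INTA# as well as ALE, DEN and DT/R#. Three 74LS373 latches (LE, OE#) take AD15:AD0, A19:A16 and BHE# and output A19:A0, BHE#. Two 74LS245 transceivers (EN#, DIR) buffer the data lines to D15:D0. Also shown: MN/MX# and INTR.]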

Page 44: A Historical Background

8086/8 Maximum Mode

• In maximum mode, the 8288 uses a set of status signals (S0, S1, S2) to rebuild the normal bus control signals of the microprocessor
• MRDC#, MWTC#, IORC#, IOWC# etc.
• Equivalent to MEMR# etc.
• Look at some special signals briefly

Page 45: A Historical Background


74LS373 Octal Transparent Latch with 3-state Outputs

Page 46: A Historical Background

74LS245 Octal Bus Transceiver

Page 47: A Historical Background

RESET Signal

• The RESET signal, active high at the 8086/8 pin, puts the 8086/8 into a defined state
• Clears the flags register, IP and the DS, SS and ES segment registers
• Sets the effective program address to 0FFFF0h (CS=0FFFFh, IP=0000h)
• 8086/8 programs always start at FFFF0H after reset has been asserted and removed
• This start-up behaviour continues into the latest generation of CPUs

Page 48: A Historical Background

BHE# Signal (8086 Only)

• The 8086 processor can address memory a byte at a time
• Its data bus is 16 bits wide
• It uses the BHE# signal and A0 (sometimes called BLE#) to address bytes using its 16-bit bus

Page 49: A Historical Background

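Use of BHE#/A0(BLE#)

[Figure: 8086 memory drawn as two byte-wide banks, both addressed by A19..A1]

• ODD addresses (8086): 00001, 00003, 00005, ... FFFF9, FFFFB, FFFFD, FFFFF, connected to D15:D8 and selected by BHE#
• EVEN addresses (8086): 00000, 00002, 00004, ... FFFF8, FFFFA, FFFFC, FFFFE, connected to D7:D0 and selected by A0/BLE#
• Byte-wide addressing (8088): a single bank holding 00000, 00001, 00002, ... FFFFC, FFFFD, FFFFE, FFFFF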

Page 50: A Historical Background


Use of BHE#/BLE#

BHE# A0/BLE# Selection

0 0 Whole word (16-bits)

0 1 High byte to/from odd address

1 0 Low byte to/from even address

1 1 No selection
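The selection table can be exercised with a short sketch that works out BHE# and A0 for a given access. The function `bus_cycles` and its tuple layout are illustrative assumptions. Splitting a word at an odd address into two bus cycles is standard 8086 behaviour, but it is not stated on the slide, so treat that branch as an added assumption.

```python
SELECTION = {
    (0, 0): "Whole word (16 bits)",
    (0, 1): "High byte to/from odd address",
    (1, 0): "Low byte to/from even address",
    (1, 1): "No selection",
}

def bus_cycles(address: int, size: int):
    """Return one (address, BHE#, A0, selection) tuple per bus cycle.

    size is 1 for a byte access or 2 for a word access.
    """
    if size not in (1, 2):
        raise ValueError("size must be 1 (byte) or 2 (word)")
    if size == 2 and address % 2 == 0:
        accesses = [(address, 0, 0)]                       # aligned word: both banks at once
    elif size == 2:
        accesses = [(address, 0, 1), (address + 1, 1, 0)]  # misaligned word: odd byte, then even byte
    else:
        a0 = address & 1
        accesses = [(address, 0 if a0 else 1, a0)]         # single byte: one bank only
    return [(addr, bhe, a0, SELECTION[(bhe, a0)]) for addr, bhe, a0 in accesses]

for addr, size in [(0x1000, 2), (0x1001, 1), (0x1000, 1), (0x1001, 2)]:
    print(f"{addr:05X}h size {size}:", bus_cycles(addr, size))
```

The last case shows why aligning words on even addresses matters on the 8086: the misaligned word costs two bus cycles instead of one.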

Page 51: A Historical Background

ALE and Address/data Bus Multiplexing

• The 8086/8 multiplexes the address and data signals onto the same set of pins
• Off-chip logic is needed to separate the signals
• Transparent latches are designed just for address demultiplexing

Page 52: A Historical Background

ALE and 74HC373 Transparent Latch

[Figure, timing: Clock; Address/Data Bus, with address time followed by data time; ALE; output of the 74HC373, which is the microcomputer address bus]

[Figure, circuit: the Address/Data Bus drives In0:In7 of a 74HC373 or equivalent, ALE drives LE, and Q0:Q7 drive the System Address Bus. The tri-state control signal, OE#, is shown connected to GND for simplicity.]

Page 53: A Historical Background

Use of ALE (Address Latch Enable)

• ALE is used with an external latch (74HC373) to demultiplex the address and data lines
• The 74HC373 is transparent when its LE input (connected to ALE) is high
• When ALE goes low, the ‘373 holds the last data until ALE goes high again
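The latch behaviour described above can be modelled in a few lines. The class `TransparentLatch`, the function `demultiplex` and the sample bus values are hypothetical and only illustrate the idea; the model ignores OE# and all timing details.

```python
class TransparentLatch:
    """Behavioural model of one 8-bit transparent latch (74HC373 style)."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, le: int, d: int) -> int:
        # While LE is high the output follows the input (transparent).
        # While LE is low the output keeps the last value seen.
        if le:
            self.q = d & 0xFF
        return self.q

def demultiplex(samples):
    """Feed (ALE, AD7..AD0) samples through a latch, return the address bus values."""
    latch = TransparentLatch()
    return [latch.update(ale, ad) for ale, ad in samples]

# One hypothetical bus cycle: address BBh during T1 (ALE high), then data 5Ah.
cycle = [(1, 0xBB), (0, 0x00), (0, 0x5A), (0, 0x5A)]
print([f"{value:02X}h" for value in demultiplex(cycle)])
```

The latch output stays at BBh for the whole cycle, so memory sees a stable address even after the pins have switched to carrying data.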

Page 54: A Historical Background

8288 Bus Controller and Bus Transceivers

[Figure: the enable and direction outputs of the 8288 bus controller (DEN, DT/R#) drive the EN# and DIR inputs of two 74HC245 transceivers. One buffers CPU [D15:D8] to Buffered [D15:D8], the other buffers CPU [D7:D0] to Buffered [D7:D0], which go to the memory and I/O systems.]

• The 8288 Bus Controller also generates direction and enable signals for bi-directional transceivers
• Supports buffering the system data bus

Page 55: A Historical Background

8086 Read Cycle

[Figure: timing diagram over clock states T1, T2, T3, T4]

• CLK
• /S0, /S1, /S2: status 001 or 101
• A16..A19, /BHE: address, then status S3..S6
• ALE
• AD0..AD15: address, float, valid data, float
• A0..A19: valid address
• DT/R
• DEN
• /MRDC or /IORC

Page 56: A Historical Background

8086 Write Cycle

[Figure: timing diagram over clock states T1, T2, T3, T4]

• CLK
• /S0, /S1, /S2: status 010 or 110
• A16..A19, /BHE: address, then status S3..S6
• ALE
• AD0..AD15: address, then valid data
• A0..A19: valid address
• DT/R
• DEN
• /MWTC or /IOWC

Page 57: A Historical Background

8086 Read Cycle (1 Wait State)

[Figure: timing diagram over clock states T1, T2, T3, Tw, T4]

• CLK
• /S0, /S1, /S2: status 001 or 101
• A16..A19, /BHE: address, then status S3..S6
• ALE
• AD0..AD15: address, float, valid data, float
• A0..A19: valid address
• DT/R
• DEN
• /MRDC or /IORC
• 8284 RDY
• READY
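The timing diagrams show a bus cycle of four clock states, T1 to T4, with each wait state adding one Tw. A rough sketch of what that means for access time and peak transfer rate follows. The function names are illustrative, and the 5 MHz clock used in the example is an assumption made only for round numbers.

```python
def bus_cycle_ns(clock_hz: float, wait_states: int = 0) -> float:
    """Length of one bus cycle in nanoseconds: T1..T4 plus any Tw states."""
    t_state_ns = 1e9 / clock_hz
    return (4 + wait_states) * t_state_ns

def peak_bytes_per_second(clock_hz: float, bus_bytes: int, wait_states: int = 0) -> float:
    """Upper bound on transfer rate if every bus cycle moves a full bus width."""
    return bus_bytes * clock_hz / (4 + wait_states)

clock = 5_000_000  # assumed 5 MHz clock
for waits in (0, 1):
    print(f"{waits} wait state(s): {bus_cycle_ns(clock, waits):.0f} ns per bus cycle, "
          f"8086 peak {peak_bytes_per_second(clock, 2, waits) / 1e6:.2f} MB/s, "
          f"8088 peak {peak_bytes_per_second(clock, 1, waits) / 1e6:.2f} MB/s")
```

Under these assumptions a single wait state stretches the cycle from 800 ns to 1000 ns, a 20% drop in peak rate, which is why slow memory is costly.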

Page 58: A Historical Background

8086/8088 Summary

• First generation (introduced June 1978)
• One of the first 16-bit processors on the market
• 16-bit internal registers
• 16/8-bit external data bus
• 20-bit address bus (1 MB addressable)
• Used in 1st generation IBM PCs (1981)

Page 59: A Historical Background

80186/80188

• Evolution of the 8086/8088
• Increased instruction set
• On-chip system components (clock generator, DMA, interrupt controller, timers…)
• Unsuccessful in PCs
• Popular in embedded systems…

Page 60: A Historical Background

2nd Generation Processor 286

• P2 (286) = 2nd generation processor
• Introduced in 1982
• CPU behind the IBM AT
• Throughput of the original IBM AT (6 MHz) was about 500% of the IBM PC (4.77 MHz)
• Level of integration: 134k transistors (vs 29k in the 8086)
• Still a 16-bit processor…
• Available in higher clock frequencies: up to 25 MHz

Page 61: A Historical Background

2nd Generation Processors 286

• Fully backwards compatible with the 8086: the 80286 runs 8086 software without modification
• Improved instruction execution: the average instruction takes 4.5 cycles vs. 12 cycles (8086)
• Improved instruction set
• Real mode and protected mode
• Multitasking support. What happens in one area of memory doesn’t affect other programs. Protected mode is supported by Windows 3.0.
• 16 MB addressable physical memory
• On-chip MMU (1 GB virtual memory)
• Non-multiplexed address bus and data bus

Page 62: A Historical Background


Improving Computer Performance

We’ve seen how 16-bit computer technology based on the 8086 and 80286 processors developed

These computers are not powerful enough for today’s applications

How do you improve the performance of your computer?

Let’s start with the CPU

Page 63: A Historical Background

CPU Performance (1)

• MOST OBVIOUS: processor clock frequency
• Increased frequency – increased execution rate
• State of the art: >2 GHz (Jan 2002)
• Memory and I/O access times can be a performance bottleneck – unless you take some special measures

Page 64: A Historical Background

CPU Performance (2)

• ALU register width
• A processor is an N-bit processor, where N represents the precision of the ALU – N can be 4, 8, 16, 32, or 64
• The wider the registers – the more processing per clock
• Data bus width
• The wider the data bus the faster we can transfer data
• Since the memory and I/O device access times are finite, the more bits transferred per cycle the better

Page 65: A Historical Background

CPU Performance (3)

• Address bus width
• Increased address width doesn’t provide a ‘speed’ increase as such
• The CPU can directly address more memory
• PCs use big programs, which would not fit in a smaller address space
• Overcoming a small address space takes time
• Impacts on overall system performance
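The link between address bus width and directly addressable memory is simply a power of two. The sketch below uses two illustrative helper names. The widths in the loop are the 20 bits of the 8086/8088, the 24 bits that correspond to a 16 MB physical space such as the 286's, and the 32 bits of the 386.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an n-bit address bus can select."""
    return 1 << address_bits

def human(n: int) -> str:
    """Format a power-of-two byte count using 1024-based units."""
    units = ["bytes", "KB", "MB", "GB", "TB"]
    index = 0
    while n >= 1024 and index < len(units) - 1:
        n //= 1024
        index += 1
    return f"{n} {units[index]}"

for bits in (16, 20, 24, 32):
    print(f"{bits}-bit addresses: {human(addressable_bytes(bits))}")
```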

Page 66: A Historical Background

3rd Generation Processor 386

• P3 (386) = 3rd generation processor
• Introduced: 10/1985
• Full 32-bit processor (32-bit registers, 32-bit internal and external data bus, 32-bit address bus)
• 275k transistors. CMOS. 132-pin PGA package. (Supply current Icc = 400 mA, roughly the same as the 8086!)
• Clock speeds: 16-33 MHz
• P3 processors were far ahead of their time: it took 10 years before 32-bit operating systems became mainstream!
• First 386 PCs early 1987 (COMPAQ)

Page 67: A Historical Background

3rd Generation Processor 386

• Modes of operation: Real, Protected, Virtual Real.
• The protected mode of the 386 is fully compatible with the 286. Protected mode is the native mode of operation. The chips are designed for advanced operating systems such as Windows NT.
• New virtual real mode: the processor can run with hardware memory protection while simulating the 8086’s real-mode operation. Multiple copies of e.g. DOS can run simultaneously, each in a protected area of memory. If a program in one memory area crashes, the rest of the system is protected.

Page 68: A Historical Background

Intel 32-bit Architecture: IA-32

[Figure: block diagram showing the Bus Unit (BU) with its Address and Data connections, the Prefetch Queue, the Instruction Unit (IU), the Execution Unit (EU) containing Registers, Control Unit (CU) and ALU, and the Addressing Unit (AU)]

The 80386 includes a Bus Interface Unit for reading and providing data and instructions, with a Prefetch Queue, an IU for controlling the EU with its registers, as well as an AU for generating memory and I/O addresses

Page 69: A Historical Background

80386 Features

• 32-bit general and offset registers
• 16-byte prefetch queue
• Memory management unit with segmentation unit and paging unit
• 32-bit address and data bus
• 4-Gbyte physical address space
• 64-Tbyte virtual address space
• i387 numerical coprocessor
• Implementation of real, protected and virtual 8086 modes

Page 70: A Historical Background

80386 Operating Modes

• Protected Mode for multitasking support
• Real Mode (native 8086 mode)
• Processor powers up in Real Mode
• System Management Mode
• Power management or system security
• Processor switches to a separate address space, while saving the entire context of the currently running program or task

Page 71: A Historical Background

80386 Register Set

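[Figure: 80386 register set, 32-bit registers drawn with bit positions 31, 16, 15 and 0]

• Instruction Pointer: EIP (32 bits), with IP as its low 16 bits
• EFLAG Register: EFLAG (32 bits), with FLAG as its low 16 bits
• General-Purpose Registers: EAX (AH, AL), EBX (BH, BL), ECX (CH, CL), EDX (DH, DL), ESI (SI), EDI (DI), EBP (BP), ESP (SP). The low 16 bits of EAX to EDX are further split into 8-bit halves at bits 15..8 and 7..0.
• Segment Registers (bits 15..0): CS, SS, DS, ES, FS, GS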

Page 72: A Historical Background

80386 Prefetch Queue

[Figure: the Bus Interface Unit fills a 16-byte deep instruction queue over the 32-bit data bus, and the Execution Unit reads from the queue]

• Fetching from the on-chip queue is fast
• Reading from off-chip memory is slow

Page 73: A Historical Background


80386 Prefetch Queue

80386 Prefetch queue is 16-bytes deep

1. The instruction fetch can read from the prefetch queue faster than from memory

2. The prefetcher can do some work while the execution unit is doing other tasks in parallel

Page 74: A Historical Background


Coprocessor: i387

The hardware implementation of floating point processing in the i387 means floating point operations run at much higher speed.

The i386 can execute all mathematical expressions using software emulation of the i387.

Page 75: A Historical Background

80386: Classic CISC Processor

• CISC = Complex Instruction Set Computer
• Complex instructions ...but code-size efficient
• Micro-encoding of the machine instructions
• Extensive addressing capabilities for memory operations
• Few, but very useful CPU registers

Page 76: A Historical Background

80386 Execution Sequence

[Figure: CISC processor block diagram with Bus Interface, Prefetch Queue, Decoding Unit, Control Unit with Microcode ROM and Microcode Queue, Execution Unit with Registers and ALU, and a Coprocessor]

In a microprogrammed CISC the processor fetches the instructions via the bus interface into a prefetch queue, which transfers them to a decoding unit. The decoding unit breaks the machine instruction into many elementary micro-instructions and applies them to a microcode queue. The micro-instructions are transferred from the microcode queue to the control and execution unit, which drives the ALU and the registers.

Page 77: A Historical Background

80386 Complex Instructions

• CISC drawback: most instructions are so complicated, they have to be broken into a sequence of micro-steps
• These steps are called micro-code
• Stored in a ROM in the processor core
• Micro-code ROM: access time and size...
• They require extra ROM and decode logic

Page 78: A Historical Background

RISC: Less is More

• RISC = Reduced Instruction Set Computer
• 20/80 rule: 20% of the instructions take up 80% of the time
• Sometimes executing a sequence of simple instructions runs quicker than a single complex machine instruction that has the same effect

Page 79: A Historical Background

RISC Ideas (1)

• Reduce the instruction set to simplify the decoding
• Smaller instruction set -> simpler logic -> smaller logic -> faster execution
• Eliminate microcode – hardwire all instruction execution
• Pipeline instruction decoding and executing – do more operations in parallel

Page 80: A Historical Background

RISC Ideas (2)

• Load/store architecture – only the load and store instructions can access memory
• All other instructions work with the processor internal registers
• This is necessary for single-cycle execution – the execution unit can’t wait for data to be read/written

Page 81: A Historical Background

RISC Ideas (3)

• Increase the number of internal registers due to the load/store architecture
• Also, registers are more general purpose and less associated with specific functions
• The compiler is designed along with the RISC processor design. The compiler has to be aware of the processor architecture to produce code that can be executed efficiently

Page 82: A Historical Background

Instruction Pipelining - Operations Can Be Carried Out in Parallel

• Read the instruction from memory or the prefetch queue (instruction fetch phase)
• Decode the instruction (decode phase)
• Where necessary, fetch the operands (operand fetch phase)
• Execute the instruction (execute phase)
• Write back the result (write-back phase)
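The benefit of overlapping the five phases can be estimated with a small model. It assumes an ideal pipeline in which every phase takes exactly one clock cycle and there are no stalls; real processors fall short of this. The function names are illustrative.

```python
STAGES = ["Instruction fetch", "Decode", "Operand fetch", "Execute", "Write-back"]

def sequential_cycles(instructions: int) -> int:
    """Cycles needed if each instruction finishes all phases before the next starts."""
    return instructions * len(STAGES)

def pipelined_cycles(instructions: int) -> int:
    """Cycles needed in an ideal pipeline: fill once, then one result per cycle."""
    if instructions == 0:
        return 0
    return len(STAGES) + instructions - 1

def occupancy(cycle: int, instructions: int) -> dict:
    """Which instruction (0-based) sits in each stage during the given cycle (0-based)."""
    state = {}
    for stage_index, name in enumerate(STAGES):
        instr = cycle - stage_index
        state[name] = instr if 0 <= instr < instructions else None
    return state

n = 100
print(sequential_cycles(n), "cycles without pipelining")
print(pipelined_cycles(n), "cycles with an ideal 5-stage pipeline")
print(occupancy(4, n))
```

For long instruction streams the ratio approaches the number of stages, which is the best case that the pipelining problems discussed later eat into.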

Page 83: A Historical Background

Pipelined Execution

Cycle      Instruction Fetch  Decode           Operand Fetch    Execution        Write-back       Result
Cycle n    Instruction k      Instruction k-1  Instruction k-2  Instruction k-3  Instruction k-4  Result k-4
Cycle n+1  Instruction k+1    Instruction k    Instruction k-1  Instruction k-2  Instruction k-3  Result k-3
Cycle n+2  Instruction k+2    Instruction k+1  Instruction k    Instruction k-1  Instruction k-2  Result k-2
Cycle n+3  Instruction k+3    Instruction k+2  Instruction k+1  Instruction k    Instruction k-1  Result k-1
Cycle n+4  Instruction k+4    Instruction k+3  Instruction k+2  Instruction k+1  Instruction k    Result k

Page 84: A Historical Background

Superscalar Architecture:

• The processor may have more than one pipeline (Pentium…)
• Where possible each pipeline works independently
• Not always possible
• May achieve average completed execution of more than one instruction per clock cycle

Page 85: A Historical Background

Pipelining problems

• More logic per pipeline stage – same resource can’t be used twice
• E.g. can’t re-use ALU for computing implied addresses
• Synchronisation problems
• Delayed jump/branch
• Data and register dependency, e.g.
ADD reg1, reg2, reg7
AND reg6, reg1, reg3
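The register dependency in the ADD/AND pair can be detected mechanically. The sketch below assumes that the first operand of each instruction is the destination and the remaining operands are sources; the slide does not state the operand order, so this is an assumption. Under it, AND reads reg1 before ADD has written it back, a read-after-write dependency. The function names are illustrative.

```python
def parse(instruction: str):
    """Split 'OP dest, src1, src2' into (op, dest, [sources]).

    Assumption: the first operand is the destination register.
    """
    op, operands = instruction.split(None, 1)
    registers = [name.strip() for name in operands.split(",")]
    return op, registers[0], registers[1:]

def read_after_write(first: str, second: str) -> bool:
    """True if the second instruction reads the register the first one writes."""
    _, destination, _ = parse(first)
    _, _, sources = parse(second)
    return destination in sources

first = "ADD reg1, reg2, reg7"
second = "AND reg6, reg1, reg3"
print(read_after_write(first, second))                   # dependency on reg1
print(read_after_write(first, "AND reg6, reg4, reg3"))   # no shared register
```

When such a dependency is found, the pipeline has to delay the second instruction until the result is available.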

Page 86: A Historical Background

Getting the Benefits of Pipelining

• Simplified instruction decoding
• Simpler, faster logic
• On-chip cache memories
• Local memory on-chip to avoid memory access bottlenecks
• Floating point pipeline for FP coprocessor
• Speculative execution to get around pipeline flushes

Page 87: A Historical Background

Software Implications of RISCs

• An optimising compiler must know how the pipeline works (the compiler must be aware of pipeline delays, and insert NOPs if need be)
• Lower code density in RISC because instructions are less efficient
• PowerPC programs take up to 30% more code to do the same tasks as on an x86 CPU
• More memory accesses, potential performance impact...

Page 88: A Historical Background

80486: IA-32 with RISC elements

• Introduced 04/89
• Greatly improved 80386 CPU
• Hard-wired implementation of frequently used instructions (as in RISCs). On average 2 clock cycles/instruction.
• 5-stage instruction pipeline
• Internal L1 cache memory (8 kB) + cache controller
• On-chip floating point coprocessor (FPU)
• Longer prefetch queue (32 bytes as opposed to 16 on the 80386)
• Higher frequency operation: up to 120 MHz
• >1.2M transistors, 0.8 µm CMOS. 168-pin PGA.

Page 89: A Historical Background

80486 Block Diagram

[Figure: i486 CPU block diagram with Bus Interface (A31-A0, D31-D0, control and status signals), Cache (8K bytes), Prefetcher (32-byte queue), Decoding Unit, Control Unit, Register and ALU, Floating Point Unit, Segmentation Unit and Paging Unit]

Page 90: A Historical Background

80486 Pipeline

Example: ADD eax,mem32

Cycle      Stage                     Action
Cycle n    Instruction Fetch         ADD eax,mem32
Cycle n+1  Decode 1 (memory access)  Decode ADD, fetch mem32
Cycle n+2  Decode 2                  Decode ADD (continued)
Cycle n+3  Execution                 Add eax and mem32
Cycle n+4  Write-back                Write result into eax