THE UNIVERSITY OF TEXAS AT DALLAS Erik Jonsson School of Engineering and Computer Science
© C. D. Cantrell (05/1999)
INPUT/OUTPUT (I/O) SUBSYSTEMS
• Overview of I/O performance measurement and analysis
• Processor interface issues
• Buses
• Types and characteristics of I/O devices
. Hard disk storage
. Network interfaces
• I/O system design
MOTIVATION FOR STUDYING I/O
• CPU performance improves by 50% to 100% per year
• I/O systems’ performance improvements are limited by physics (in some cases)
. Mechanical delays (disk drives): Latency improvement is of order 5% per year
. Electrical and optical phenomena (dispersion, attenuation, crosstalk): Improvement is 5% to 25% per year
• Amdahl’s law implies that, sooner or later, most of the latency will be due to the part that is hardest to improve
. Given: 10% of instructions perform I/O, CPU is 10× faster
. Improvement is only about 5× (1/(0.1 + 0.9/10) ≈ 5.3) ⇒ lose about 50% of improvement
• I/O bottleneck lowers the value of CPU improvements
. As technology evolves, a diminishing fraction of total latency is due to the CPU
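The Amdahl's-law figures above can be checked with a few lines of Python. This is a minimal sketch, assuming that "10% of instructions perform I/O" means 10% of the original execution time is I/O that the faster CPU does not speed up; the function name `amdahl_speedup` is illustrative only.

```python
def amdahl_speedup(io_fraction: float, cpu_speedup: float) -> float:
    """Overall speedup when only the non-I/O fraction of the time gets faster."""
    cpu_fraction = 1.0 - io_fraction
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    # The I/O fraction stays at 10% while the CPU keeps getting faster.
    for s in (10, 100, 1000):
        print(f"CPU {s:>4}x faster -> overall {amdahl_speedup(0.10, s):.2f}x")
```

However fast the CPU becomes, the overall speedup in this model stays below 1/0.10 = 10, which is the I/O bottleneck the slide describes.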
I/O PERFORMANCE METRICS
• Bandwidth (bits or bytes per second):
. Peak
. Sustained
. Useful for buses and networks
• Throughput (I/O processes per second)
. Useful for file serving and transaction processing
• Latency = total time for an I/O process from start to finish
. Most important to users
◦ Latency too great ⇒ user loses train of thought
◦ Latency = controller time + wait time + (no. bytes)/(bandwidth) + CPU time − overlap
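The latency formula can be written as a small function. This is a sketch only; the parameter names and the example numbers are hypothetical and are not taken from the slides.

```python
def io_latency(controller_s: float, wait_s: float, n_bytes: int,
               bandwidth_bytes_per_s: float, cpu_s: float,
               overlap_s: float = 0.0) -> float:
    """Latency = controller time + wait time + bytes/bandwidth + CPU time - overlap."""
    transfer_s = n_bytes / bandwidth_bytes_per_s
    return controller_s + wait_s + transfer_s + cpu_s - overlap_s


if __name__ == "__main__":
    # Hypothetical request: 64 KiB over a 10 MB/s link.
    latency = io_latency(controller_s=2e-3, wait_s=8e-3, n_bytes=64 * 1024,
                         bandwidth_bytes_per_s=10e6, cpu_s=0.5e-3,
                         overlap_s=0.2e-3)
    print(f"{latency * 1e3:.2f} ms")
```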
PROCESSOR INTERFACE ISSUES
• Interconnections
. Buses
• Processor interface
. Interrupts
. Memory-mapped I/O
• I/O control structures
. Polling
. Interrupts
. DMA
. I/O controllers
. I/O processors
• Capacity, access time, bandwidth, cost
BUSES
• Bus: A communication link shared by multiple subsystems
. Physically: Parallel conductors (traces on die or PC board; cable)
. Advantages:
◦ Low cost (compared to point-to-point wiring)
◦ Versatility of interconnections
. Disadvantages:
◦ Electrical problems ⇒ short length
¦ Bus skew
¦ Dispersion
¦ Crosstalk
◦ Shared resource ⇒ contention
. Organization:
◦ Control lines to signal & acknowledge requests
◦ Data lines to carry addresses, data or commands
PROCESSOR–I/O INTERFACE BUS TYPES
• Backplane bus
. Processor, memory and I/O devices coexist on the same bus
. In olden times, often built into the backplane of a computer
◦ An interconnection structure that was part of the chassis
. Processor architecture includes explicit I/O instructions (IN, OUT)
. Standard backplane buses: VMEbus, Multibus, NuBus, PCI, ISA (Industry Standard Architecture) bus
• I/O bus
. Examples: IDE, SCSI
[Figure: three bus organizations]
a. Processor, memory and I/O devices on the same backplane bus
b. Processor and memory are on a backplane bus; bus adapters provide interfaces for various I/O buses
c. Processor and memory are on a fast synchronous processor-memory bus. A bus adapter interfaces the processor-memory bus to the backplane bus; further bus adapters connect I/O buses to the backplane bus
I/O SYSTEM USING ONLY A BACKPLANE BUS
[Figure: processor with cache, main memory and I/O controllers (disks, graphics output, network) all attached to a single memory–I/O bus; the I/O controllers send interrupts to the processor.]
I/O SYSTEM USING AN I/O BUS
[Figure: CPU with cache and main memory on a CPU-memory bus; a bus adapter connects the CPU-memory bus to an I/O bus that carries the I/O controllers (disks, graphics output, network).]
MACINTOSH 72xx I/O SYSTEM
[Figure: processor and main memory connect through a PCI interface/memory controller to PCI, which serves as the backplane bus. I/O controllers and bus adapters serve graphics output, Ethernet, a SCSI bus (CD-ROM, disk, tape), stereo input/output, serial ports and the Apple desktop bus.]
PENTIUM II I/O SYSTEM
[Figure: the CPU uses a cache bus to the level 2 cache and a local bus to the PCI bridge; a memory bus leads to main memory. The PCI bus is the backplane bus; the ISA bridge, acting as a bus adapter, leads to the ISA bus. Devices shown: SCSI, USB, mouse, keyboard, IDE disk, graphics adaptor with monitor, modem, sound card, printer, and available PCI and ISA slots. Source: Tanenbaum, Structured Computer Organization]
Enterprise 10000 hardware architecture
[Figure: Gigaplane-XB, which includes the XB-Interconnect, 4 address buses and bulk power distribution; local power converters.]
• Data is packet-switched using a crossbar
• Addresses are broadcast
3-STATE BUFFER
• A 3-state buffer has 2 inputs and 1 output
. Enable asserted: Output = input (state is either 0 or 1)
. Enable deasserted: High-impedance state (denoted × or Z)
◦ Output can be driven by another device
. Equivalent to a mechanical switch
EXCITATION TABLE FOR 3-STATE BUFFER
• A tristate buffer has 3 possible output values:
. Asserted
. Deasserted
. High impedance (floating)
enable  in  out
0       0   Z
0       1   Z
1       0   0
1       1   1
USE OF TRISTATES TO ENABLE/DISABLE BUS ACCESS
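A minimal sketch of how 3-state buffers let several devices share one bus line. It models the excitation table directly; representing the high-impedance state Z as Python's `None` and raising an error on contention are modeling choices of this sketch, not something the slides specify.

```python
from typing import Optional, Sequence

Z = None  # high-impedance: the buffer is not driving the line


def tristate(enable: int, data_in: int) -> Optional[int]:
    """Excitation table: the output follows the input only when enable is asserted."""
    return data_in if enable else Z


def bus_value(outputs: Sequence[Optional[int]]) -> Optional[int]:
    """Resolve one bus line from the outputs of all buffers attached to it."""
    drivers = [v for v in outputs if v is not Z]
    if len(drivers) > 1:
        raise ValueError("bus contention: more than one buffer enabled")
    return drivers[0] if drivers else Z


if __name__ == "__main__":
    data = [1, 0, 1]      # what each of three devices wants to send
    enables = [0, 1, 0]   # only the second device has bus access
    line = bus_value([tristate(e, d) for e, d in zip(enables, data)])
    print(line)
```

With exactly one enable asserted, the line takes the value driven by that device; with none asserted, it floats.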
BUS DESIGN CONSTRAINTS
• Laws of physics limit bus speeds
. Transmission speed ≤ speed of light
. Crosstalk
◦ Occurs because:
¦ A time-varying voltage on a conductor induces a charge $q_2 = C_{12} v_1$ on another, parallel conductor
¦ A time-varying current in a conductor induces a voltage $v_2 = L_{12}\, di_1/dt$ in another, parallel conductor
◦ Limits bus clock frequency
◦ Can be reduced by:
¦ Grounding alternate conductors
¦ Abandoning the bus concept and using twisted-pair, point-to-point connections (Seymour Cray)
◦ EMI & reflections limit number of devices connected to bus
• Real estate on die or PC board limits number of lines
COMPLEX ULTRA-SCSI CHAIN
[Figure: Ultra-SCSI chain with a terminator at each end and device positions 7 through 0, driven from position 7. 3 meters (10 feet) overall length. Individual segment lengths in centimeters: 7.62, 30.48, 30.48, 10.16, 10.16, 10.16, 10.16, 10.16, 210.82. Five 7.37-cm stubs and three 12.45-cm stubs, 25 pF each.]
ACK SIGNALS ON COMPLEX ULTRA-SCSI CHAIN
[Figure: oscilloscope traces of the ACK signals by device position: driver input (the logic signal driving the SCSI driver), driver output at position 7, and the signals at positions 6, 5, 4 and 0. Vertical scale: 2 volts per division. Horizontal scale: 10 nanoseconds per division.]
ACK SIGNALS ON POINT-TO-POINT ULTRA-SCSI BUS
[Figure: oscilloscope traces of the ACK signal: driver input, driver output, and receiver input after 25 m. Vertical scale: 2 volts per division. Horizontal scale: 10 nanoseconds per division. Test setup: driver and receiver joined by a shielded 34-pair external cable, 25 meters (82 feet) overall length, with a terminator at each end and only end loads for this test.]
ASYNCHRONOUS vs. SYNCHRONOUS BUSES
• Bus communication protocol: Specification of sequence of events and timing requirements for transferring information on a bus
• Asynchronous bus transfers:
. Certain conductors on the bus are control lines
. Signals on the control lines control the sequence of events
• Synchronous bus transfers:
. Events are sequenced relative to a master clock signal
. Once a certain kind of transfer has been initiated, no further command signaling is necessary to control the transfer
SYNCHRONOUS BUSES
• Bus clock is phase-locked to processor clock
. Bus clock frequency = (1/n) × processor clock frequency (n = 1 to 6)
. Clock signal is carried on a control line
. Communications protocol defined with reference to bus clock signal
. Local bus (e.g., VESA Local Bus):
◦ Extends the processor’s bus control signals
◦ May connect processor to L2 cache
◦ May connect processor and memory to high-speed I/O devices
• Advantages:
. Fast & wide
. Simple logic (finite state machine)
• Disadvantages:
. Must be short (bus skew; attenuation; crosstalk)
. All devices must run at same frequency
80286 – PENTIUM I/O
• Separate I/O and memory address spaces
. Since the 8086, I/O or memory access is signaled by M/IO# (memory access if high, I/O if low)
◦ For MOV (memory–CPU copy), M/IO# is high
◦ For IN or OUT (I/O), M/IO# is low
◦ M/IO# is a processor signal that does not appear on the ISA bus
◦ Instead, M/IO# is an input to the bus controller
. I/O address space is 0x0000 to 0xffff
80286 SIGNALS
[Figure: 80286 pin diagram with pin numbers. Signals shown: CLK, A0–A23, D0–D15, COD/INTA, M/IO, BHE, S0, S1, HLDA, HOLD, PEREQ, PEACK, INTR, NMI, RESET, READY, BUSY, ERROR, LOCK, CAP. The address lines feed an address latch; D0–D7 and D8–D15 connect to the lower and upper data bus transceivers.]
ISA BUS
• ISA ≡ Industry Standard Architecture
. Synchronous
. Industry response to IBM’s MicroChannel architecture
. Uses both the PC/AT and the IBM PC bus standards
◦ Interface cards have 2 sets of connectors
◦ PC bus: 8 data lines, 20 address lines
◦ ISA bus: 16 data lines, 24 address lines; bus frequency 8.33 MHz
¦ Maximum possible throughput: 2 bytes × 8.33 MHz = 16.67 MB/s
. Separate I/O and memory address spaces
◦ Since the 8085, I/O or memory access is signaled by IO/M# (I/O if high, memory access if low)
¦ For MOV (memory–CPU copy), IO/M# is low
¦ For IN or OUT (I/O), IO/M# is high
◦ I/O address space is 0x0000 to 0xffff
ISA BUS CONNECTORS
[Figure: motherboard carrying the CPU and other chips, with PC bus connectors and the new connector added for the PC/AT; a plug-in board with chips mates with them through the contacts of its edge connector. Source: Tanenbaum, Structured Computer Organization]
PCI BUS
• PCI ≡ Peripheral Component Interconnect
. Synchronous
. PCI 1.0: Clock frequency 33 MHz, 32-bit-wide data path
. PCI 2.1: Clock frequency 66 MHz, 64-bit-wide data path
◦ Maximum theoretical bandwidth: 8 bytes × 66 MHz = 528 MB/s
. Transactions are negative-edge-triggered
. Address and data lines are multiplexed
. Bus arbiter usually built into the chipset
. Every PCI device has a 256-byte configuration address space that is readable by other devices ⇒ Plug ’n Play
• PCI cards
. Options include voltage (5 V vs. 3.3 V), width (32 bits/120 pins vs. 64 bits/184 pins) and frequency (33 vs. 66 MHz)
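The peak-bandwidth figures quoted for ISA and PCI follow from data-path width times clock frequency. A small sketch, assuming one transfer per bus clock and 1 MB = 10^6 bytes; the function name is illustrative.

```python
def peak_bandwidth_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in MB/s, assuming one transfer per clock cycle."""
    return (width_bits / 8) * clock_mhz


if __name__ == "__main__":
    buses = {
        "ISA (16 bits, 8.33 MHz)": (16, 8.33),
        "PCI 1.0 (32 bits, 33 MHz)": (32, 33),
        "PCI 2.1 (64 bits, 66 MHz)": (64, 66),
    }
    for name, (width, clock) in buses.items():
        print(f"{name}: {peak_bandwidth_mb_per_s(width, clock):.2f} MB/s")
```

The ISA result comes out as 16.66 here because the clock is entered as the rounded value 8.33 MHz; using 8.333 MHz gives the 16.67 MB/s quoted on the ISA slide. These are upper bounds; sustained bandwidth is lower once arbitration and address cycles are counted.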
PCI BUS ARBITER
[Figure: a PCI arbiter connected to four PCI devices, each with its own REQ# and GNT# lines. Source: Tanenbaum, Structured Computer Organization]
PCI BUS TIMING FOR READ AND WRITE CYCLES
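[Figure: timing diagram over bus clock (Φ) cycles T1–T7, showing a read bus cycle, an idle cycle and a write bus cycle. Traces: AD (address, turnaround and data for the read; address and data for the write), C/BE# (read cmd, enable; write cmd, enable), FRAME#, IRDY#, DEVSEL#, TRDY#. Source: Tanenbaum, Structured Computer Organization]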
PCI BUS SIGNALS

MANDATORY PCI BUS SIGNALS
Signal    Lines  Master  Slave  Description
CLK       1                     Clock (33 MHz or 66 MHz)
AD        32     ×       ×      Multiplexed address and data lines
PAR       1      ×              Address or data parity bit
C/BE      4      ×              Bus command/bit map for bytes enabled
FRAME#    1      ×              Indicates that AD and C/BE are asserted
IRDY#     1      ×              Read: master will accept; write: data present
IDSEL     1      ×              Select configuration space instead of memory
DEVSEL#   1              ×      Slave has decoded its address and is listening
TRDY#     1              ×      Read: data present; write: slave will accept
STOP#     1              ×      Slave wants to stop transaction immediately
PERR#     1                     Data parity error detected by receiver
SERR#     1                     Address parity error or system error detected
REQ#      1                     Bus arbitration: request for bus ownership
GNT#      1                     Bus arbitration: grant of bus ownership
RST#      1                     Reset the system and all devices

OPTIONAL PCI BUS SIGNALS
Signal    Lines  Master  Slave  Description
REQ64#    1      ×              Request to run a 64-bit transaction
ACK64#    1              ×      Permission is granted for a 64-bit transaction
AD        32     ×              Additional 32 bits of address or data
PAR64     1      ×              Parity for the extra 32 address/data bits
C/BE#     4      ×              Additional 4 bits for byte enables
LOCK      1      ×              Lock the bus to allow multiple transactions
SBO#      1                     Hit on a remote cache (for a multiprocessor)
SDONE     1                     Snooping done (for a multiprocessor)
INTx      4                     Request an interrupt
JTAG      5                     IEEE 1149.1 JTAG test signals
M66EN     1                     Wired to power or ground (66 MHz or 33 MHz)

Tanenbaum, Structured Computer Organization
OBTAINING BUS ACCESS
• Goal: Give every device fair access
• Method: Use bus masters
. A master enables bus access for one or more devices (by enabling/disabling tristate buffers)
. Single bus master can be a bottleneck
. Multiple masters require arbitration
◦ Every device has a priority (IRQ number, SCSI ID, . . . )
◦ Extra control lines needed for bus request/access
. Arbitration methods:
◦ Centralized & parallel (SCSI)
◦ Daisy chain (VMEbus)
◦ Distributed arbitration using self-selection (NuBus)
◦ Distributed arbitration using collision detection (Ethernet)
A BUS TRANSACTION WITH A SINGLE MASTER
[Figure: three panels, each showing memory, processor and disks connected by a bus and bus request lines]
a. Device generates bus request
b. Master (processor) responds by generating control signals (for read, etc.)
c. Processor notifies I/O device that its request is being processed; device then puts address for the request on the bus
DAISY CHAIN
[Figure: a bus arbiter and devices 1 (highest priority) through n (lowest priority). The grant line runs from the arbiter through each device in turn; the request and release lines connect the devices to the arbiter.]
A daisy chain bus uses a bus grant line that chains through each device from highest to lowest priority. The protocol is:
1. Signal on the request line
2. Wait for a low-to-high transition on the grant line (indicates reassignment)
3. Intercept the grant signal and stop asserting the request line
4. Use the bus
5. Signal that the bus is no longer required by asserting the release line
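A minimal sketch of who wins a daisy-chain arbitration, assuming the grant signal enters the chain at the highest-priority device and is passed along only by devices that are not requesting the bus. The function name and the list-of-booleans representation are illustrative.

```python
from typing import Optional, Sequence


def daisy_chain_grant(requesting: Sequence[bool]) -> Optional[int]:
    """Return the chain position that intercepts the grant (0 = highest priority).

    Returns None when no device is requesting, so no grant is issued.
    """
    for position, wants_bus in enumerate(requesting):
        if wants_bus:
            # This device intercepts the grant; devices further down never see it.
            return position
    return None


if __name__ == "__main__":
    print(daisy_chain_grant([False, True, True]))   # position 1 wins over position 2
    print(daisy_chain_grant([False, False, False]))
```

The sketch also shows the fairness problem of the scheme: a device low in the chain is granted the bus only when no device above it is requesting.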
ASYNCHRONOUS BUSES
• Not clocked
. Can accommodate many kinds of devices (disk, tape, scanner, . . . )
• Data transfer controlled with handshaking protocol on dedicated control lines; represent with a finite state machine for each device
• Example (SCSI-1 bus):
. Bus controller asserts Sel (select device) and transmits device ID
. Selected device responds with Ack
. Controller asserts Cmd (command), Msg (message), and Req (request a data transfer) signals, then transmits command bytes
. Device responds to each byte with Ack
. Controller deasserts Cmd, asserts I/O, then transmits data bytes
. Device responds to each byte with Ack
STEPS OF AN ASYNCHRONOUS OUTPUT OPERATION
[Figure: three panels, each showing memory, processor and disks connected by control lines and data lines]
a. Initiation of a read operation from memory. Control lines: Read command; Data lines: Address
b. Memory access
c. Memory puts the data on the data lines of the bus and uses the control lines to signal the I/O device that the data is available
STEPS OF AN ASYNCHRONOUS INPUT OPERATION
[Figure: two panels, each showing memory, processor and disks connected by control lines and data lines]
a. Control lines: Write request to memory; Data lines: Address
b. Memory signals the device that it is ready; Data is transferred
ASYNCHRONOUS BUS HANDSHAKING PROTOCOL
[Figure: timing diagram of the ReadReq, Data, Ack and DataRdy lines, with transitions numbered 1–7 as in the steps below]
1. When memory sees ReadReq asserted, it reads the address from the data bus and asserts Ack
2. I/O device sees Ack asserted, releases ReadReq and data lines
3. Memory sees ReadReq deasserted, drops Ack to acknowledge ReadReq
4. Memory puts requested data on the data lines, asserts DataRdy
5. I/O device sees DataRdy, reads data, signals that it has seen the data by asserting Ack
6. Memory sees Ack, drops DataRdy, releases data lines
7. I/O device sees DataRdy deasserted, drops Ack to signal end of transmission
[Figure: finite state machines for the I/O device and for memory that implement steps 1–7. The I/O device starts by putting the address on the data lines and asserting ReadReq; each state waits on the ReadReq, Ack or DataRdy signal from the other side; after step 7 a new I/O request can begin.]
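The seven handshake steps can be traced in a short simulation. This is a sketch only: it runs the two sides sequentially in one function instead of as two concurrent finite state machines, and the dictionary standing in for memory and the name `handshake_read` are hypothetical.

```python
def handshake_read(memory: dict, address: int):
    """Walk through the handshake; return (data, trace of control-line states)."""
    sig = {"ReadReq": 0, "Ack": 0, "DataRdy": 0}
    trace = []

    def record(step: int) -> None:
        trace.append((step, dict(sig)))

    # Before step 1: the I/O device puts the address on the data lines, asserts ReadReq.
    data_lines = address
    sig["ReadReq"] = 1
    record(0)
    # 1. Memory sees ReadReq, reads the address from the data lines, asserts Ack.
    latched_address = data_lines
    sig["Ack"] = 1
    record(1)
    # 2. I/O device sees Ack, releases ReadReq and the data lines.
    sig["ReadReq"] = 0
    data_lines = None
    record(2)
    # 3. Memory sees ReadReq deasserted, drops Ack.
    sig["Ack"] = 0
    record(3)
    # 4. Memory puts the requested data on the data lines, asserts DataRdy.
    data_lines = memory[latched_address]
    sig["DataRdy"] = 1
    record(4)
    # 5. I/O device sees DataRdy, reads the data, asserts Ack.
    received = data_lines
    sig["Ack"] = 1
    record(5)
    # 6. Memory sees Ack, drops DataRdy, releases the data lines.
    sig["DataRdy"] = 0
    data_lines = None
    record(6)
    # 7. I/O device sees DataRdy deasserted, drops Ack: end of transmission.
    sig["Ack"] = 0
    record(7)
    return received, trace


if __name__ == "__main__":
    data, trace = handshake_read({0x10: 0xAB}, 0x10)
    print(hex(data))
    for step, lines in trace:
        print(step, lines)
```

Every change on a control line is a response to a change the other side made, which is why no shared clock is needed.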
SCSI-1: AN ASYNCHRONOUS BUS (1)
• SCSI := Small Computer System Interface
. Many “standard” implementations
. Can connect many different kinds of devices:
◦ Logic board
◦ Hard drive
◦ CD-ROM drive
◦ Tape drive
◦ Scanner
. Controller chip on logic board or plug-in
. Controller is connected by cable to internal or peripheral devices
. Devices are daisy-chained
. Device ID is set by hardware switches
SCSI-1: AN ASYNCHRONOUS BUS (2)
• SCSI-1 bus configuration
. Peripheral SCSI-1 devices are connected by cable
. Each bit of a data byte is transferred on a separate wire (line) of the cable
. Each device must have a unique ID number between 0 and 7
◦ The ID is signaled by asserting one of the lines DB(0) – DB(7)
◦ In case of contention, the device with the highest ID wins
◦ The logic board has ID 7, so it always wins
http://scitexdv.com/SCSI2/
SCSI ID BITS
SCSI-1: AN ASYNCHRONOUS BUS (3)
• SCSI signaling sequence for data transfer
. Controller broadcasts SEL (select) signal on pin 44 and the ID number on one of the data lines
. Device selected responds with ACK (acknowledge) signal on pin 38 (handshake)
. Controller sends REQ (request) signal on pin 48 to order device to perform a task (such as transferring a data byte)
. Command bytes are transferred on the data bus
. A handshake must take place for each data byte transferred
SCSI Bus Signals
Signal    Driven By         Signal Explanation
DB0–DB7   Initiator/Target  8-Bit Bidirectional Data Bus.
DBP       Initiator/Target  Data-Bus Parity Line. Optional.
ATN       Initiator         Attention. Used to send a message to the target when it controls the bus.
BSY       Initiator/Target  Busy. Indicates that the bus is unavailable for use.
ACK       Initiator         Acknowledge. Used by the initiator for handshaking.
RST       Any Device        Reset. Used to initiate a bus-free phase.
MSG       Target            Driven by the target to indicate that the current transfer is a message.
SEL       Initiator         Select. Used by the initiator to select a target before command execution. Also used by the target to reconnect when the reselection phase is implemented.
C/D       Target            Control/Data. Used during the information transfer phases to transfer commands, status, data or messages over the bus.
REQ       Target            Request. Used by the target during information transfer phases.
I/O       Target            Input/Output. Determines the direction of the transfer.
Phase Sequences of the SCSI Bus
[Figure: diagram of the transitions among the SCSI bus phases: ARBITRATION (optional), SELECTION, RESELECTION (optional), COMMAND, DATA, STATUS, MESSAGE (optional), BUS FREE.]
SCSI Information Transfer Phases (signals SEL, BSY, MSG, C/D, I/O)
SEL BSY MSG C/D I/O Direction Phase
0 1 0 0 0 To Target Data Out
0 1 0 0 1 From Target Data In
0 1 0 1 0 To Target Command
0 1 0 1 1 From Target Status
0 1 1 0 0 — Reserved
0 1 1 0 1 — Reserved
0 1 1 1 0 To Target Message Out
0 1 1 1 1 From Target Message In
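The phase table is a three-input lookup once SEL = 0 and BSY = 1. A sketch of that decoding; the names `PHASES` and `transfer_phase` are illustrative.

```python
# Keys are (MSG, C/D, I/O); values are (phase, direction) from the table above.
PHASES = {
    (0, 0, 0): ("Data Out", "To Target"),
    (0, 0, 1): ("Data In", "From Target"),
    (0, 1, 0): ("Command", "To Target"),
    (0, 1, 1): ("Status", "From Target"),
    (1, 0, 0): ("Reserved", None),
    (1, 0, 1): ("Reserved", None),
    (1, 1, 0): ("Message Out", "To Target"),
    (1, 1, 1): ("Message In", "From Target"),
}


def transfer_phase(sel: int, bsy: int, msg: int, cd: int, io: int):
    """Decode the information transfer phase from the five control signals."""
    if sel != 0 or bsy != 1:
        raise ValueError("not an information transfer phase (needs SEL=0, BSY=1)")
    return PHASES[(msg, cd, io)]


if __name__ == "__main__":
    print(transfer_phase(0, 1, 0, 1, 0))
    print(transfer_phase(0, 1, 1, 1, 1))
```

In every defined phase the I/O signal alone gives the direction: 0 is toward the target, 1 is from the target.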
http://homebrew.cs.ubc.ca/415/project-submissions/group9/notes/scsi-2.html
SCSI BUS TOPOLOGY
I/O AS SYNCHRONIZATION OF DATA TRANSFERS
• Fundamental problems of communication between devices, or between the CPU and peripheral devices:
. Detection that a data transfer is necessary
◦ Dedicated polling
◦ Interrupts
◦ Periodic polling
. Synchronization of two devices, or a device and a CPU, with different speeds
◦ Wait state insertion
◦ DMA
◦ Dual-ported memory
◦ FIFO buffers
◦ Caches
POLLED I/O
[Figure: polling flowchart with decision boxes "Is data ready?" (no/yes) and "Read data; done?" (no/yes), and a Wait step.]
A polling loop is not an efficient way to use a CPU unless the device is very fast. If the device is fast, then “data ready” checks can be interspersed among useful instructions.
In most cases it is more efficient for the I/O device to tell the CPU when data is ready, or when a transfer is complete, than for the CPU to check the device frequently. An I/O device can use interrupts to tell the CPU that a data transfer should be started, or is finished.
DEDICATED vs. PERIODIC POLLING
• Periodic polling means that the CPU periodically interrogates the I/O device (e.g., via an oscillator–counter–decoder combination) to see whether data is ready
• Dedicated polling (spin waiting) means that the I/O device controller sets or clears bits in a status register that is read in a tight loop by the CPU
. When a system call for keyboard input is issued, and dedicated polling is in use, the CPU executes code somewhat like this:
get_loop: lw   $a0, Device_Status   # read the device status word
          bgez $a0, get_loop        # sign bit clear: keep polling
          lb   $2, Device_Data      # load the data byte into $2 ($v0)
          rfe                       # restore the pre-exception status bits
. This operation transfers only a single byte; data may be missed
. A different approach is necessary for block transfers
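A sketch of dedicated polling against a simulated device, mirroring the loop above: the CPU spins on a status word until its sign bit is set, then reads one byte. The `SimulatedDevice` class, its timing and the placement of the ready flag in bit 31 are assumptions made for illustration.

```python
class SimulatedDevice:
    """Hypothetical device that needs a few status reads before each byte is ready."""

    READY = 1 << 31  # assumed ready flag: the sign bit of a 32-bit status word

    def __init__(self, data: bytes, polls_until_ready: int = 3):
        self._data = list(data)
        self._delay = polls_until_ready
        self._countdown = polls_until_ready

    def read_status(self) -> int:
        if self._countdown > 0:
            self._countdown -= 1
            return 0
        return self.READY

    def read_data(self) -> int:
        self._countdown = self._delay  # device is busy again until the next byte
        return self._data.pop(0)


def get_byte(dev: SimulatedDevice):
    """Spin until the device is ready; return (byte, number of status reads)."""
    polls = 0
    while True:
        polls += 1
        if dev.read_status() & SimulatedDevice.READY:
            return dev.read_data(), polls


if __name__ == "__main__":
    dev = SimulatedDevice(b"A")
    byte, polls = get_byte(dev)
    print(chr(byte), polls)
```

Every status read that finds the device not ready is CPU time spent doing nothing useful, which is the cost of spin waiting.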
INTERRUPT-DRIVEN I/O (1)
• An interrupt is an event that occurs outside the execution cycle and that causes processing of the current thread to stop
. Interrupts can be used to give I/O devices a means to signal the CPU that an event has occurred that requires action by the CPU (data is ready, etc.)
. An interrupt causes an exception, which results in a jump to the appropriate exception handling code (MIPS: address 0x80000080)
. There are (at least) two principal methods for detecting interrupts in hardware:
◦ Connect the interrupt request output of an I/O device to one of the inputs of an interrupt controller
¦ Interrupts may be level-triggered or edge-triggered
◦ Connect one interrupt line to an OR of inputs from several devices that are periodically strobed for data ready
¦ Device that caused the interrupt can be detected by reading a status word formed from inputs from the devices
INTERRUPT-DRIVEN I/O (2)
• On a RISC machine, an interrupt causes a jump to the general exception handling code (with a few special cases such as Reset and UTLB Miss)
. Method of P&H Chapter 5: Execution is suspended immediately
◦ This method is required for some exceptions (TLB miss, page fault) unless execution can be undone
◦ Restarting is hard in ISAs where memory is accessed at multiple times during execution of an instruction
. Method of choice: The instruction that caused the exception is allowed to finish; subsequent instructions are suspended
. Pending interrupts must be handled before next instruction is fetched
. The exception handler determines the code to execute, based on the Cause register contents
. The operating system determines what state needs to be saved (if any) besides the EPC and Cause registers
INTERRUPT-DRIVEN I/O (3)
• MIPS R2000 interrupt handler:
. Saves $a0 and $v0 in special locations
◦ save0 is at address 0x90000250; save1 is at address 0x90000254
◦ $a0 and $v0 can’t be pushed onto the stack, because the cause of the exception may be a bad stack pointer!
. Copies coprocessor 0 Cause and EPC registers into $k0 and $k1
. Pushes current Kernel/User mode and Interrupt Enable Mode bits onto the stack in the Status register (see next slide)
. The kernel’s exception handler uses a jump table (or a sequence of beq’s) to determine the right code to execute (see SPIM kernel text)
. The operating system clears the interrupts, if any
. After executing an rfe instruction, the processor may restart execution at the address in the EPC
MIPS R2000 STATUS REGISTER
[Figure: Status register layout. Bits 31–28: CU. Bits 22–16: BEV, TS, PE, CM, PZ, SwC, IsC. Bits 15–8: interrupt mask. Bits 5–0: kernel/user and interrupt enable bit pairs for the old (bits 5–4), previous (bits 3–2) and current (bits 1–0) state.]
The stack for kernel/user and interrupt enable bits lets the processor respond to two levels of exceptions before software must save the Status register.
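A sketch of the three-deep kernel/user and interrupt-enable stack in Status register bits 5–0: an exception pushes the bit pairs, and `rfe` pops them. The encoding used here (kernel/user bit 0 = kernel mode, interrupt enable bit 1 = enabled) is an assumption of the sketch, as are the function names.

```python
KU_IE_MASK = 0x3F  # bits 5..0: old, previous, current (kernel/user, interrupt enable)


def enter_exception(status: int) -> int:
    """Push: previous -> old, current -> previous, current := kernel, interrupts off."""
    stack = status & KU_IE_MASK
    pushed = (stack << 2) & KU_IE_MASK  # the two low bits become 00
    return (status & ~KU_IE_MASK) | pushed


def rfe(status: int) -> int:
    """Pop: previous -> current, old -> previous; the old pair is left unchanged."""
    stack = status & KU_IE_MASK
    popped = (stack & 0x30) | (stack >> 2)
    return (status & ~KU_IE_MASK) | popped


if __name__ == "__main__":
    status = 0xFF03  # all mask bits set; user mode with interrupts enabled
    for _ in range(3):
        status = enter_exception(status)
        print(f"{status:#06x}")
```

After two nested exceptions the original user-mode pair sits in the old position; a third exception shifts it out, which is why software must save the Status register before allowing a third level.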
MIPS R2000 CAUSE REGISTER
[Figure: Cause register layout. Bits 15–10: pending interrupts. Bits 5–2: exception code (ExcCode).]
EXCEPTION CODES IN THE MIPS R2000 ISA
ExcCode  Name  Description
0        Int   External interrupt
1        MOD   TLB modification exception
2        TLBL  TLB miss exception (Load or instruction fetch)
3        TLBS  TLB miss exception (Store)
4        AdEL  Address error exception (Load or instruction fetch)
5        AdES  Address error exception (Store)
6        IBE   Instruction fetch bus error exception
7        DBE   Data load or store bus error exception
8        Sys   System call exception
9        Bp    Breakpoint exception
10       RI    Reserved or undefined instruction exception
11       CpU   Coprocessor unusable exception
12       Ovf   Arithmetic overflow exception
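A sketch of what an exception handler does first: extract the exception code from the Cause register and look up which handler applies. The field positions follow the Cause register figure in these slides (pending interrupts in bits 15–10, ExcCode in bits 5–2); the function name is illustrative.

```python
EXC_NAMES = ["Int", "MOD", "TLBL", "TLBS", "AdEL", "AdES", "IBE",
             "DBE", "Sys", "Bp", "RI", "CpU", "Ovf"]


def decode_cause(cause: int):
    """Return (ExcCode, name, pending-interrupt bits) from a Cause register value."""
    exc_code = (cause >> 2) & 0xF      # bits 5..2
    pending = (cause >> 10) & 0x3F     # bits 15..10
    name = EXC_NAMES[exc_code] if exc_code < len(EXC_NAMES) else "unknown"
    return exc_code, name, pending


if __name__ == "__main__":
    print(decode_cause(8 << 2))    # a system call
    print(decode_cause(1 << 10))   # an external interrupt with one line pending
```

A real handler would use the code as an index into a jump table, as the interrupt handler slide describes.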
SINGLE- AND MULTIPLE-LINE INTERRUPT SYSTEMS
SINGLE-LINE INTERRUPT SYSTEM
[Figure: CPU with a single interrupt flip-flop.]
MULTIPLE-LINE INTERRUPT SYSTEM
[Figure: CPU with an interrupt register; interrupt request numbers 0, 1, 2, 3.]
VECTORED INTERRUPT SYSTEM
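[Figure: interrupt request lines feed an interrupt register; the interrupt register and an interrupt mask register feed a priority encoder, which outputs the interrupt number to the CPU and an "input active" (interrupt pending) signal.]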
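A sketch of the mask-then-encode logic of a vectored interrupt system. Two conventions here are assumptions, since the figure does not fix them: a mask bit of 1 enables a line, and the highest-numbered active line has the highest priority.

```python
def highest_pending(request_lines: int, mask: int, n_lines: int = 8):
    """Return (interrupt_pending, interrupt_number) as a priority encoder would."""
    active = request_lines & mask & ((1 << n_lines) - 1)
    if active == 0:
        return False, None
    # Assumption: the highest-numbered active line wins.
    return True, active.bit_length() - 1


if __name__ == "__main__":
    # Lines 1, 4 and 6 are requesting; the mask enables only lines 0-5.
    print(highest_pending(0b0101_0010, 0b0011_1111))
```

Line 6 is requesting but masked off, so the encoder reports line 4.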
INTERRUPT-DRIVEN I/O (4)
• In the Motorola 68000 series, the CPU checks for pending interrupts after execution of each instruction
. CPU saves status register (SR) and enters supervisor mode
. After determining the interrupt number N, the CPU saves state information and executes M[4N] → PC, causing a branch to the text at the location pointed to by M[4N]
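The transfer M[4N] → PC is a table lookup: entry N of the vector table occupies the four bytes starting at address 4N. A sketch, using a `bytearray` as a stand-in for memory and reading the entry as a big-endian 32-bit word (the 68000 is big-endian); the vector number and handler address in the example are hypothetical.

```python
def vector_address(n: int) -> int:
    """Address of vector table entry N."""
    return 4 * n


def dispatch(memory: bytes, n: int) -> int:
    """Return the new PC: the 32-bit big-endian word stored at M[4N]."""
    offset = vector_address(n)
    return int.from_bytes(memory[offset:offset + 4], "big")


if __name__ == "__main__":
    memory = bytearray(1024)
    # Hypothetical setup: the handler for interrupt number 25 starts at 0x2000.
    memory[vector_address(25):vector_address(25) + 4] = (0x2000).to_bytes(4, "big")
    print(hex(dispatch(memory, 25)))
```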
VECTORED INTERRUPTS IN THE IBM PC
[Figure: two 8259A interrupt controllers. One takes IRQ0–IRQ7 and the other takes IRQ8–IRQ15; each has INT, INTA, RD, WR, A0, CS and D0–D7 pins, and the INT output goes to the CPU.]
MEMORY-MAPPED I/O
• Instead of having multiple address spaces for memory, I/O, etc., have a single address space
. Loading from a memory location that is mapped to an I/O device reads a data byte or word from the device
. Storing to a memory location that is mapped to an I/O device writes a data byte or word to the device
. Used in Motorola 68000 series
• In order to synchronize I/O properly, additional memory locations may be mapped to status words for the I/O devices
SPIM’s MEMORY-MAPPED I/O REGISTERS
Receiver control (0xffff0000): 1-bit Interrupt enable, 1-bit Ready; remaining bits unused
Receiver data (0xffff0004): 8-bit Received byte; remaining bits unused
Transmitter control (0xffff0008): 1-bit Interrupt enable, 1-bit Ready; remaining bits unused
Transmitter data (0xffff000c): 8-bit Transmitted byte; remaining bits unused
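A sketch of memory-mapped I/O using the four SPIM register addresses: ordinary loads and stores to these addresses reach the device instead of memory. The `Console` class is a hypothetical stand-in for the terminal, and placing the Ready flag in bit 0 of each control register is an assumption of this sketch.

```python
RECEIVER_CONTROL = 0xFFFF0000
RECEIVER_DATA = 0xFFFF0004
TRANSMITTER_CONTROL = 0xFFFF0008
TRANSMITTER_DATA = 0xFFFF000C
READY = 0x1  # assumption: Ready is bit 0 of each control register


class Console:
    """Hypothetical terminal reached only through loads and stores."""

    def __init__(self, keystrokes: bytes):
        self._input = list(keystrokes)
        self.output = bytearray()

    def load(self, address: int) -> int:
        if address == RECEIVER_CONTROL:
            return READY if self._input else 0
        if address == RECEIVER_DATA:
            return self._input.pop(0)
        if address == TRANSMITTER_CONTROL:
            return READY  # the transmitter is always ready in this sketch
        raise ValueError(f"no readable device register at {address:#x}")

    def store(self, address: int, value: int) -> None:
        if address != TRANSMITTER_DATA:
            raise ValueError(f"no writable device register at {address:#x}")
        self.output.append(value & 0xFF)


def echo(console: Console) -> None:
    """Copy every received byte to the transmitter, checking Ready before each access."""
    while console.load(RECEIVER_CONTROL) & READY:
        byte = console.load(RECEIVER_DATA)
        while not console.load(TRANSMITTER_CONTROL) & READY:
            pass  # spin until the transmitter can accept a byte
        console.store(TRANSMITTER_DATA, byte)


if __name__ == "__main__":
    c = Console(b"hi")
    echo(c)
    print(bytes(c.output))
```

The control registers play the role of the status words mentioned above: the program synchronizes with the device by reading them before touching the data registers.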
NETWORK INTERFACE CARD
[Figure: block diagram of a network interface card: bus interface, communication controller (framing, bus interface), Ethernet interface adapter (signaling) and jack. Signals shown: TCLK, TE, TXD, CD, RXD, COL.]
I/O PROCESSORS
• An I/O processor (IOP) is a processor with (usually) a more restricted instruction set than the CPU
. Purpose: Offload I/O processing from the CPU
◦ Used in CDC 6600, IBM S/360–370, ...
. I/O instructions executed by an IOP are called channel command words in the IBM world
. A CPU and its IOPs are really a shared-memory multiprocessor
RELATION OF I/O TO PROCESSOR ARCHITECTURE
• I/O instructions and buses have disappeared
• Interrupt vectors have been replaced by jump tables
• Interrupt stack replaced by shadow registers
. Handler saves registers and re-enables higher-priority interrupts
• Interrupt types reduced in number
. Handler must query interrupt controller
• Caches cause problems for I/O
. Flushing degrades performance heavily
. Solution: “snooping” (borrowed from shared-memory multiprocessors)
• Virtual memory frustrates DMA
• Load-store architecture inconsistent with atomic I/O operations
• Stateful processors hard to context switch