CPE555A: Real-Time Embedded Systems
Lecture 4
Ali Zaringhalam
Stevens Institute of Technology
Fall 2015, arz
CS555A – Real-Time Embedded Systems, Stevens Institute of Technology
Outline
• Procedure Calls
• I/O
• Exception Handling
Example
Let’s compile the following procedure:
int leaf_example(int g, int h, int i, int j)
{
    int f;
    f = (g + h) - (i + j);
    return f;
}
To compile this procedure with what we have learned so far, the compiler must decide how to pass the arguments g, h, i and j to the procedure and where the procedure returns its result to the caller.
Argument Passing & Return Value
The MIPS ISA does not specify how arguments are passed or how values are returned. This is settled by a compiler/assembler convention for MIPS:
• Registers 4-7 are used for argument passing. By convention MIPS refers to these registers as $a0-$a3.
• Registers 2-3 are used to return values from procedures. By convention MIPS refers to these registers as $v0-$v1.
In our example, assume that the arguments g, h, i and j are passed to the procedure in $a0-$a3 and the result is returned in $v0. The compiler chooses reg16 for the local variable f. In addition, the compiler uses reg8 and reg9 for temporary storage of (g+h) and (i+j).
Register assignments for leaf_example:
• g, h, i, j: $a0, $a1, $a2, $a3 = R4-R7
• f: R16
• return value: $v0 = R2
• temporaries (g+h) and (i+j): R8, R9
What Could Go Wrong?
What if the caller also happens to be using the same registers reg8, reg9 and reg16? In that case the callee overwrites the caller’s values, and the caller cannot execute correctly after the callee returns. This problem cannot be solved by simply using a different set of registers in the caller and the callee. For one thing, we don’t know how deep the nested procedure calls are: if n procedure calls are nested, we need O(n) registers, whereas we only have 32. For another, this scheme clearly will not work if the procedures are compiled separately and later linked (as in a library). The solution is to spill registers into main memory. We associate a frame (or activation record) with each call of a procedure f(x_1, x_2, ..., x_n). The frame contains the registers that the callee plans to use during execution.
Call Stack
• The natural data structure for spilling registers into memory is a call stack (a last-in, first-out structure).
• Register values are pushed and saved on the stack when the procedure is called and popped from the stack into the original registers at return.
• Historically, call stacks “grow” from high addresses to low addresses.
• A stack pointer addresses the top of the stack. In the code that follows, $sp points at the most recently pushed word.
• MIPS software uses register 29 for the stack pointer and refers to it as $sp.
• Other machines (e.g., 80x86) may use a special-purpose stack pointer.
[Figure: call stack holding the frames of main, Proc1, Proc2, Proc3 and Proc4 between the high and low addresses, with $sp marking the top.]
Pushing a Register on the Stack
Suppose the called procedure wants to use reg16. It must first push reg16 to save it:
    addi $sp, $sp, -4         # make room for a 4-byte word on the stack
    sw   reg16, 0($sp)        # store reg16 into stack memory
Now the called procedure can use reg16.
[Figure: the stack grows down, toward lower addresses, from the stack “bottom”; the push moves the stack pointer $sp, which marks the stack “top”, by -4.]
Popping a Register From the Stack
Before the procedure returns, it must restore reg16 to its original value by popping the stack into reg16:
    lw   reg16, 0($sp)        # load reg16 from stack memory
    addi $sp, $sp, 4          # pop the stack
Now the caller can use reg16 as before.
[Figure: the stack grows down, toward lower addresses, from the stack “bottom”; the pop moves the stack pointer $sp, which marks the stack “top”, by +4.]
Example - Continued
leaf_example:                 # address in memory where the procedure is stored
    # The compiler decides to use reg8, reg9 and reg16. Because the caller
    # may also be using these registers, they must be saved on the stack.
    addi $sp, $sp, -12        # make room on the stack for three registers
    sw   reg8, 8($sp)         # push reg8
    sw   reg9, 4($sp)         # push reg9
    sw   reg16, 0($sp)        # push reg16
    # The old values are saved; the registers can now be overwritten.
    add  reg8, $a0, $a1       # compute g+h
    add  reg9, $a2, $a3       # compute i+j
    sub  reg16, reg8, reg9    # compute (g+h)-(i+j)
    add  $v0, reg16, $zero    # return result in $v0
    # We are done. Before returning, restore the registers we decided to use.
    lw   reg8, 8($sp)         # restore reg8
    lw   reg9, 4($sp)         # restore reg9
    lw   reg16, 0($sp)        # restore reg16
    addi $sp, $sp, 12         # pop the stack
    jr   $ra                  # now return
Compiler assignments: g = $a0 (R4), h = $a1 (R5), i = $a2 (R6), j = $a3 (R7)
$sn and $tm
In the example the procedure saved and restored every register it intended to use without knowing whether the caller was using them. When too many registers are spilled, performance suffers.
The alternative is to define and adhere to a protocol under which all procedures assume that certain registers need not be saved and restored across a procedure call. The MIPS assembler conventions are:
• 10 registers (8-15 and 24-25) are designated as temporary registers that need not be preserved by the callee. They are referred to as $t0-$t9.
  If the caller needs the values in $t0-$t9 after the call, it must save them before the call and restore them on return (caller saved).
• 8 registers (16-23) are designated as saved registers that must be preserved by the callee. They are referred to as $s0-$s7.
  If the callee uses $s0-$s7, it must save them when the procedure is entered and restore them before it returns (callee saved). It doesn’t bother with $t0-$t9.
• The caller will not save $s0-$s7, and the callee will not save $t0-$t9.
Recompilation Using Register Spilling Rules
In the example we do not need to save reg8 and reg9, but we do have to save reg16.
leaf_example:                 # address in memory where the procedure is stored
    # The procedure plans to use reg8 ($t0), reg9 ($t1) and reg16 ($s0).
    # Following MIPS assembler conventions, the callee must save only reg16.
    # This reduces register spilling, improving code size and performance.
    addi $sp, $sp, -4         # make room on the stack for ONE register
    sw   $s0, 0($sp)          # push $s0
    add  $t0, $a0, $a1        # compute g+h; no need to save $t0 (R8)
    add  $t1, $a2, $a3        # compute i+j; no need to save $t1 (R9)
    sub  $s0, $t0, $t1        # compute (g+h)-(i+j)
    add  $v0, $s0, $zero      # return result in $v0
    # We are done. Before returning, restore the register we saved.
    lw   $s0, 0($sp)          # restore $s0
    addi $sp, $sp, 4          # pop the stack
    jr   $ra                  # now return
Compiler assignments: f = $s0 (R16), g = $a0 (R4), h = $a1 (R5), i = $a2 (R6), j = $a3 (R7)
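The callee-saved rule can also be checked with a small C model of the machine. This is only a sketch for illustration: machine_t, push, pop and the array-based stack are hypothetical stand-ins for the register file and memory, not part of any MIPS toolchain. The model stack is indexed in words, so one step of sp corresponds to 4 bytes of $sp.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model of the machine state, for illustration only. */
typedef struct {
    int32_t a[4];                 /* $a0-$a3: arguments */
    int32_t t[10];                /* $t0-$t9: caller saved */
    int32_t s[8];                 /* $s0-$s7: callee saved */
    int32_t v0;                   /* $v0: return value */
    int32_t stack[STACK_WORDS];   /* model of the call stack */
    int sp;                       /* index of the top word; the stack grows down */
} machine_t;

static void push(machine_t *m, int32_t value) {
    assert(m->sp > 0);               /* no room left: stack overflow */
    m->sp -= 1;                      /* addi $sp, $sp, -4 */
    m->stack[m->sp] = value;         /* sw   reg, 0($sp)  */
}

static int32_t pop(machine_t *m) {
    assert(m->sp < STACK_WORDS);     /* nothing to pop */
    int32_t value = m->stack[m->sp]; /* lw   reg, 0($sp)  */
    m->sp += 1;                      /* addi $sp, $sp, 4  */
    return value;
}

/* Mirrors the recompiled leaf_example: only $s0 is saved and restored. */
void leaf_example(machine_t *m) {
    push(m, m->s[0]);                /* callee saves $s0 */
    m->t[0] = m->a[0] + m->a[1];     /* g + h, in $t0 (not saved) */
    m->t[1] = m->a[2] + m->a[3];     /* i + j, in $t1 (not saved) */
    m->s[0] = m->t[0] - m->t[1];     /* f = (g+h) - (i+j) */
    m->v0 = m->s[0];                 /* return value in $v0 */
    m->s[0] = pop(m);                /* restore $s0 before returning */
}
```

To use it, zero a machine_t, set sp to STACK_WORDS (empty stack), load the arguments into a[0..3], and call leaf_example. Afterwards s[0] and sp hold their original values, while t[0] and t[1] have been overwritten, which is exactly what the convention allows.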
But There is More to Spill!
All is well if the called procedure is a leaf procedure. But when there is a sequence of nested calls, or recursion, we need to spill more registers. Consider compiling the following procedure, which computes n! recursively:
// fact(4) = 4*fact(3) = 4*3*fact(2) = 4*3*2*fact(1)
int fact(int n)
{
    if (n < 1)
        return 1;
    else
        return n * fact(n-1);
}
Compiled Code
Let’s assume that the compiler decides to use $a0 to pass the argument and $v0 for the returned value.
fact:                         # no need to push saved registers on the stack; none is used
    slti $t0, $a0, 1          # is n<1?
    beq  $t0, $zero, L1       # if n>=1 go to L1
    addi $v0, $zero, 1        # return 1
    jr   $ra
L1: addi $a0, $a0, -1         # n=n-1
    jal  fact                 # compute (n-1)!; set $ra to return address
    mul  $v0, $v0, $a0        # compute n!; pretend there is a multiply instruction
    jr   $ra                  # return to where? The initial point of entry is lost,
                              # and with what value of $a0?
Now consider calling this from main() for n=3:
    addi $a0, $zero, 3
    jal  fact
The Problem & Solution
This is obviously not going to work, because we are changing the registers $a0 and $ra across recursive calls. In addition to overwriting argument registers that may be needed later, we overwrite $ra, thus losing track of the address to which the calling procedure must eventually return. The solution, again, is to spill these registers by saving them on the call stack:
• The caller saves all argument registers that it needs by pushing them on the stack before a procedure call (caller saved). It restores them after the called procedure returns.
• The callee saves the return address ($ra) when the procedure is entered by pushing it on the stack (callee saved). It restores it before returning to the calling procedure.
Compiled Recursive Procedure
fact:                         # address in memory where the procedure is stored
    addi $sp, $sp, -8         # make room on the stack for two registers
    sw   $ra, 4($sp)          # push return address $ra
    sw   $a0, 0($sp)          # push argument $a0
    slti $t0, $a0, 1          # is n<1?
    beq  $t0, $zero, L1       # if n>=1 then go to L1
    addi $v0, $zero, 1        # n<1 so return 1
    addi $sp, $sp, 8          # pop the stack; $ra and $a0 are unchanged, so no restore is needed
    jr   $ra                  # now return
L1: addi $a0, $a0, -1         # n>=1: decrement n and make the recursive call with (n-1)
    jal  fact
    lw   $a0, 0($sp)          # restore original n
    mul  $v0, $v0, $a0        # compute n*(n-1)!
    lw   $ra, 4($sp)          # before returning, restore the return address register
    addi $sp, $sp, 8          # pop the stack
    jr   $ra                  # now return
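The way the saved copies of n pile up on the stack and are consumed on the way back can be sketched in C with an explicit stack. This is an illustrative model, not the compiler's output: stack_mem, push, pop and fact_explicit_stack are hypothetical names, and return addresses are left out because the C loops take over the role of $ra. It assumes n is small enough for the 32-word model stack and for the result to fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 32

static uint32_t stack_mem[STACK_WORDS];
static int sp = STACK_WORDS;          /* index of the top word; grows down */

static void push(uint32_t value) {
    assert(sp > 0);                   /* stack overflow */
    stack_mem[--sp] = value;
}

static uint32_t pop(void) {
    assert(sp < STACK_WORDS);         /* stack underflow */
    return stack_mem[sp++];
}

/* Factorial with the recursion unrolled onto an explicit stack. */
uint32_t fact_explicit_stack(uint32_t n) {
    int frames = 0;

    /* Descend: every "call" pushes its own argument, as fact does with $a0. */
    for (;;) {
        push(n);
        frames++;
        if (n < 1) {
            break;                    /* base case reached */
        }
        n = n - 1;
    }

    uint32_t v0 = 1;                  /* the base case returns 1 */
    (void)pop();                      /* base frame: popped without a restore */
    frames--;

    /* Unwind: each return restores its own n and multiplies. */
    while (frames > 0) {
        uint32_t a0 = pop();          /* lw  $a0, 0($sp)   */
        v0 = v0 * a0;                 /* mul $v0, $v0, $a0 */
        frames--;
    }
    return v0;
}
```

For n = 4 the descent pushes 4, 3, 2, 1, 0, and the unwind multiplies 1 by 1, 2, 3 and 4 in that order, so each multiplication sees the value of n that belonged to its own call.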
Stack Frame Pattern
What Else?
In our examples so far we assumed that the stack pointer $sp does not change during execution of the procedure. The stack pointer is adjusted to save registers on procedure entry and readjusted to its original value on the procedure’s return. But what if the procedure has local variables? In particular, what if local variables can be declared anywhere in the body of the procedure? All of these are also created and maintained on the call stack. However, as new storage is allocated on the stack, the stack pointer keeps getting readjusted. If all local variables were referred to in terms of the stack pointer, each reference would have to be readjusted every time new storage is allocated on the stack. This makes code generation cumbersome and, more importantly, makes the compiled code difficult to understand (the offsets keep changing).
The Frame Pointer
The solution is to use a frame pointer ($fp). The frame pointer points to the location of the first variable saved on the stack on procedure entry (so that this variable has a zero offset with respect to $fp). The frame pointer remains fixed during the procedure’s execution, and all local variables are referred to in terms of $fp. Here’s the protocol:
• On entry, the procedure saves the current value of $fp on the stack (callee saved).
• On entry, the procedure sets the value of $fp to the value of the stack pointer $sp.
• Before the procedure returns, the stack pointer is adjusted back to $sp = $fp. The procedure also restores the original value of the frame pointer $fp from the value saved on the stack.
Frame & Stack Pointers
[Figure: $fp and $sp before the call, during the call and after the call.]
The MIPS assembler uses register 30 for the frame pointer $fp.
Stack Frames
• Contents
  Local variables
  Return information
  Temporary space
• Management
  Space is allocated when the procedure is entered (“set-up” code)
  Space is deallocated on return (“finish” code)
[Figure: the frame for proc follows the previous frame on the stack; the frame pointer $fp and the stack pointer $sp delimit the current frame, with $sp at the stack “top”.]
Example
yoo(…) { … who(); … }
who(…) { … amI(); … amI(); … }
amI(…) { … amI(); … }
Call chain: yoo calls who; who calls amI, which calls itself recursively twice more; after those calls return, who calls amI a second time.
[Figure sequence: contents of the stack as execution proceeds. In each step $fp and $sp delimit the frame of the most recently called procedure.]
1. yoo
2. yoo, who
3. yoo, who, amI
4. yoo, who, amI, amI
5. yoo, who, amI, amI, amI
6. yoo, who, amI, amI
7. yoo, who, amI
8. yoo, who
9. yoo, who, amI
10. yoo, who
11. yoo
Typical Microcontroller Board
[Figure: Stellaris® LM3S8962 evaluation board]
Interface Types
• Parallel: multiple data lines for data
  Speed
  Short distance
  Examples: PCI (Peripheral Component Interconnect), ATA (Advanced Technology Attachment)
• Serial: single data line for data
  Longer range than parallel
  Examples: USB, RS232, I2C, SPI, PCI-Express
• Synchronous: there is a clock signal between transmitter and receiver
  Examples: USB, I2C, SPI
• Asynchronous: no clock between transmitter and receiver
  Uses START/STOP bits
  Examples: RS232, UART
General-Purpose I/O (GPIO)
• Open collector circuits are used for GPIO pins
  The same pin can be used for input and output
  Multiple controllers can be connected to the same bus
• When the processor writes 1 to the register, the transistor is turned on and the GPIO pin is pulled low
• When the processor writes 0 to the register, the transistor is turned off and the GPIO pin is pulled high
• Wired NOR
RS-232 Standard
• RS-232 is a common interface and supports asynchronous serial connections
• RS-232 is being replaced by USB
[Figure: voltage levels for an ASCII "K" character (0x4B) with 1 start bit, 8 data bits and 1 stop bit. RS-232 transmits the least significant bit first, so the data bits appear on the line, left to right, as 1101 0010; read in reverse order they give 0100 1011 = 0x4B.]
[Figure: DB9 connector]
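The framing of one character can be sketched in C. This is a minimal illustration assuming the usual UART convention of 1 start bit, 8 data bits sent least significant bit first, no parity and 1 stop bit; uart_frame_8n1 is a hypothetical helper name. The values are logical levels (0 or 1), not RS-232 line voltages.

```c
#include <stddef.h>
#include <stdint.h>

#define UART_FRAME_BITS 10   /* 1 start + 8 data + 1 stop */

/* Fill bits[0..9] with the logical levels of one frame,
 * in the order in which they are transmitted. */
void uart_frame_8n1(uint8_t data, uint8_t bits[UART_FRAME_BITS]) {
    bits[0] = 0;                                   /* start bit */
    for (size_t i = 0; i < 8; i++) {
        bits[1 + i] = (uint8_t)((data >> i) & 1u); /* data bits, LSB first */
    }
    bits[9] = 1;                                   /* stop bit */
}
```

For the character "K" (0x4B) this yields start bit 0, data bits 1 1 0 1 0 0 1 0, and stop bit 1.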
Universal Asynchronous Receiver Transmitter (UART)
• Converts 8-bit parallel data to serial data and vice versa
• The UART provides hardware support for
  Parallel-to-serial and serial-to-parallel conversion
  Start and stop bit framing
  Parity generation
  Baud-rate generation (2400 bps to 115.2 kbps)
• UART supports interrupts
  Transmit Complete
  Transmit Data Register Empty
  Receive Complete
• Serial interface specification (RS232C)
  Start bit
  6, 7, 8 or 9 data bits
  Parity bit optional
  Stop bit
UART Register Interface
CPU uses UART registers to interact with the UART:
• UDR (UART Data Register)
  CPU writes the byte to transmit
  CPU reads the byte received
• USR (UART Status Register)
  Rx/Tx complete signal bits
• UCR (UART Control Register)
  Interrupt enable bits
  Rx/Tx enable bits
  Data format control bits (e.g., optional parity bit)
• UBRR (UART Baud Rate Register)
  Baud rate generator division ratio
UART Transmission
• Send a byte by writing to the UDR register
• UART sets the TXC bit in USR when the final bit has finished transmitting
• UART triggers the Tx Complete interrupt if enabled in the UCR
• CPU must wait for the current byte to finish transmitting before sending the next one
UART Receive
• How does the CPU know a byte has arrived? Two methods are available:
  Polling: poll the RXC bit in USR, or
  Interrupt: enable the Rx Complete interrupt and write an ISR to handle it
• Read received bytes from the UDR register
UART Baud Rate
• Set by UBRR (Baud Rate Register)
• UBRR takes a value in the range 0-255
• BAUD = fCK / (16 × (UBRR + 1))
• fCK is the crystal clock frequency
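The baud-rate formula can be worked in both directions with a few lines of C. This is a sketch under stated assumptions: the function names are hypothetical, baud must be nonzero, and the 16 MHz clock in the usage note is only an example value, not one given in the lecture.

```c
#include <stdint.h>

/* Baud rate produced by a divisor setting: BAUD = fCK / (16 * (UBRR + 1)).
 * Integer division, so the result is truncated to whole bits per second. */
uint32_t uart_baud_from_ubrr(uint32_t fck_hz, uint8_t ubrr) {
    return fck_hz / (16u * ((uint32_t)ubrr + 1u));
}

/* UBRR setting for a requested baud rate: fCK / (16 * BAUD) rounded to
 * the nearest integer, minus 1, clamped to the 0-255 register range. */
uint8_t uart_ubrr_for_baud(uint32_t fck_hz, uint32_t baud) {
    uint32_t step = 16u * baud;
    uint32_t divisor = (fck_hz + step / 2u) / step;
    if (divisor < 1u) {
        divisor = 1u;
    }
    if (divisor > 256u) {
        divisor = 256u;
    }
    return (uint8_t)(divisor - 1u);
}
```

With an assumed 16 MHz clock, a request for 9600 baud gives UBRR = 103, and that setting actually produces about 9615 baud. The small mismatch is why crystal frequencies such as 18.432 MHz, which divide evenly into standard baud rates, are popular.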
Interfacing I/O Devices
• How does the CPU interface with I/O devices?
• How is data transferred to/from memory?
• Role of the operating system (OS)
  provides system calls for accessing devices (e.g., read, write, seek)
  protects one user’s data from another
  handles interrupts from devices
  provides fair I/O access to users and maximizes throughput
I/O Instructions
• Addressing: the CPU must be able to address each device’s registers
• Command interface: the CPU must use instructions to send commands to I/O devices
• Two techniques
  Isolated I/O: special instructions for I/O (e.g., IN, OUT)
  Memory-mapped I/O: same instruction set as for memory references (e.g., LOAD, STORE)
Isolated I/O
• Separate instructions for memory and I/O references
  e.g., IN R1, device_Address
• Separate memory and I/O address spaces
  either a physically separate bus (shown in the diagram)
  or the same physical bus with a signal to indicate memory or I/O
• Used in Intel 80x86
[Figure: the CPU connects to memory over the processor-memory bus and to the I/O peripherals, through their interfaces, over an independent I/O bus.]
Memory-Mapped I/O
• Common memory & I/O bus
• Same instruction set for memory access & I/O
  e.g., LOAD R1, 0(R5), where R5 holds an address that maps to an external I/O register
• Same address space for memory & I/O
• More prevalent than isolated I/O: used in RISC processors
[Figure: the CPU, memory and the I/O peripherals (through their interfaces) share a common memory & I/O bus; the single address space is divided into ROM, RAM and I/O regions.]
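In C, memory-mapped I/O shows up as ordinary pointer accesses through a volatile pointer. The sketch below is illustrative only: mmio_write32, mmio_read32 and simulated_device_reg are hypothetical names, and a plain variable stands in for the device register so that the code runs on a host machine.

```c
#include <stdint.h>

/* Store to a device register: compiles to an ordinary STORE instruction. */
static inline void mmio_write32(volatile uint32_t *reg, uint32_t value) {
    *reg = value;
}

/* Load from a device register: compiles to an ordinary LOAD instruction. */
static inline uint32_t mmio_read32(volatile uint32_t *reg) {
    return *reg;
}

/* Stand-in for a device register so the sketch runs on a host machine.
 * On real hardware the pointer would instead hold the register's
 * address taken from the device data sheet. */
static volatile uint32_t simulated_device_reg;
```

A call such as mmio_write32(&simulated_device_reg, 0xCAFE) followed by mmio_read32(&simulated_device_reg) behaves like any memory write and read; no special I/O instruction is involved.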
Polling
• Main loop uses each I/O device periodically
• If output is to be produced, produce it
• If input is ready, read it
• Example: USR (UART Status Register)
  Rx/Tx complete signal bits
Send on a Polled UART I/O
• Loop until the TX buffer is empty (the 6th bit of the Status register is set to 1)
• Write the Data register with your data.
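The two steps can be sketched in C as follows. The register layout uart_regs_t, the mask UART_TX_EMPTY and the function name are assumptions made for illustration; a real UART has its own register map. The "6th bit" is taken to mean bit 5 when counting from bit 0.

```c
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
typedef struct {
    volatile uint8_t usr;   /* status register */
    volatile uint8_t udr;   /* data register */
} uart_regs_t;

#define UART_TX_EMPTY (1u << 5)   /* the 6th bit, counting from 1 */

/* Send one byte by polling the status register. */
void uart_send_polled(uart_regs_t *uart, uint8_t byte) {
    while ((uart->usr & UART_TX_EMPTY) == 0) {
        /* busy-wait: the CPU does no useful work here */
    }
    uart->udr = byte;       /* TX buffer is empty: hand over the byte */
}
```

On real hardware the pointer would refer to the UART's memory-mapped registers. In a host-side test, a uart_regs_t variable with the TX-empty bit already set lets the loop fall through immediately.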
Send a Byte Sequence
• The lower the I/O speed, the more CPU cycles are wasted waiting.
• As the CPU clock rate increases, the polling penalty (counted in cycles) grows.
• Example: sending 8 bits at 57600 bits/s with an 18 MHz CPU clock takes (8/57600)*(18000000) = 2500 cycles.
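The same arithmetic as a small C helper, so that other baud rates and clock rates can be tried. polling_wait_cycles is a hypothetical name; the 18 MHz and 57600 figures are the ones used in the calculation above.

```c
#include <stdint.h>

/* CPU cycles that elapse while `bits` bits are shifted out at `baud` bits/s.
 * The multiplication is done in 64 bits to avoid overflow. */
uint64_t polling_wait_cycles(uint32_t bits, uint32_t baud, uint32_t cpu_hz) {
    return ((uint64_t)bits * cpu_hz) / baud;
}
```

polling_wait_cycles(8, 57600, 18000000) gives 2500. Counting the start and stop bits as well (10 bits per character) raises this to 3125 cycles spent waiting per byte.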
Receive With Polling
Loop until RX buffer is full (8th bit of Status register is set to 1)
Why?
Interrupt-Driven I/O
• External hardware alerts the processor that input is ready
• Processor suspends what it is doing
• Processor invokes an interrupt service routine (ISR)
• ISR interacts with the application concurrently
Control Flow in Absence of Interrupts
• Processors do only one thing: from startup to shutdown, a CPU simply reads and executes (interprets) a sequence of instructions, one at a time
• This sequence is the CPU’s control flow
[Figure: physical control flow over time: <startup>, inst1, inst2, inst3, …, instn, <shutdown>]
Altering the Control Flow
• Up to now: two mechanisms for changing control flow:
  Jumps and branches
  Call and return
  Both react to changes in state within the program
• Insufficient for a useful system: a useful system must also react to changes in system state (CPU + peripherals)
  Data arrives from a disk or a network adapter
  User hits Ctrl-C at the keyboard
  System timer expires
• System needs mechanisms for “exceptional control flow”
Exceptional Control Flow
• Exists at all levels of a computer system
• Low level mechanisms
  Change in control flow in response to a system event (e.g., divide by zero)
  Hardware interrupts
• Higher level mechanisms
  Process context switch
  OS system calls
• Exception categories
  Asynchronous
  Synchronous
Asynchronous Exceptions (Interrupts)
• Caused by events external to the processor
  Indicated by asserting the processor’s interrupt pin, which the processor examines after executing each instruction
  Processor completes execution of the “current” instruction
  Interrupt handler returns to the “next” instruction in the original code
• Examples:
  I/O interrupts: arrival of a packet from the network, arrival of data from a disk
  Hard reset interrupt: hitting the reset button
External Exception Interface
[Figure: seven interrupt request lines, Level 1 to Level 7, feed a priority encoder whose 3-bit output (I2, I1, I0) goes to the processor. 000: no interrupt; 010: level 2 interrupt.]
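The behavior of the priority encoder can be sketched in C. One assumption is not stated on the slide: when several lines are asserted at once, the highest-numbered level wins. priority_encode is a hypothetical name.

```c
#include <stdint.h>

/* requests: bit (k - 1) is set when the Level k line is asserted, k = 1..7.
 * Returns the 3-bit code: 0 for no interrupt, otherwise the highest
 * asserted level (assumed here to have the highest priority). */
uint8_t priority_encode(uint8_t requests) {
    for (uint8_t level = 7; level >= 1; level--) {
        if (requests & (1u << (level - 1))) {
            return level;
        }
    }
    return 0;
}
```

With only the Level 2 line asserted the function returns 2 (binary 010), and with no line asserted it returns 0 (binary 000), matching the two codes shown for the encoder output.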
Synchronous Exceptions
Caused by events that occur as a result of executing an instruction:
• Traps
  Intentional and planted by design
  Examples: system calls, breakpoint traps, special instructions
  After handling, control must return to the “next” instruction
• Faults
  Unintentional but possibly recoverable
  Examples: page faults (recoverable), protection faults (unrecoverable), floating point exceptions
  Either re-executes the faulting (“current”) instruction or aborts
• Aborts
  Unintentional and unrecoverable
  Examples: memory parity error
  Aborts the current program
Exception Handling
• Triggers
  A level change on an interrupt request pin
  Software writing to an interrupt pin configured as an output (“software interrupt”)
  Executing a special “SysCall” instruction
• Responses
  Disable interrupts
  Push the program counter onto the stack so the program can resume where it was interrupted
  Execute the instruction at a designated address in memory
• Design of interrupt service routine
  Save and restore any registers it uses
  Re-enable interrupts before returning from interrupt
Saving the Processor State
Address   Instruction
0x1230    add  R0, R1, R1
0x1234    div  R3, R2, R0
0x1238    add  R1, R1, R2
0x123C    addi R2, R2, 1
PC: 0x1234
Exception PC (EPC): 0x1234
Cause: Value
• All actions are performed by the processor before entering the exception service routine
• Interrupts are disabled
Invoking Exception Service Routine
[Figure: instruction memory holds the exception handler at address 0x80000080.]
Address   Instruction
0x1230    add  R0, R1, R1
0x1234    div  R3, R2, R0
0x1238    add  R1, R1, R2
0x123C    addi R2, R2, 1
EPC: 0x1234
PC: 0x80000080
Cause: Value
Exception Service Routine
ExceptionHandler()
{
    if (cause == ArithmeticOverflow)
        ArithmeticOverflowHandler();
    else if (cause == DivideByZero)
        DivideByZeroHandler();
    else if (cause == IllegalInstruction)
        IllegalInstructionHandler();
    else if (cause == ExternalInterrupt)
        InterruptHandler();
    ...
}
Example Cause Values (cause field of the Cause Register)
Exception            Cause
Overflow             12
Breakpoint           9
Store addr. error    5
Load addr. error     4
Ext. interrupt       0
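A dispatcher over the example cause values might look like this in C. It is a sketch only: the enum, the function name and most of the handler names are hypothetical, and instead of calling real handlers it returns the name of the handler that would run, so that the selection logic can be checked on its own.

```c
#include <stdint.h>

/* Example cause values, as listed for the Cause Register. */
enum cause {
    CAUSE_EXT_INTERRUPT    = 0,
    CAUSE_LOAD_ADDR_ERROR  = 4,
    CAUSE_STORE_ADDR_ERROR = 5,
    CAUSE_BREAKPOINT       = 9,
    CAUSE_OVERFLOW         = 12
};

/* Return the name of the handler selected for a given cause value.
 * Handler names are illustrative, not from a real kernel. */
const char *exception_dispatch(uint32_t cause) {
    switch (cause) {
    case CAUSE_EXT_INTERRUPT:    return "InterruptHandler";
    case CAUSE_LOAD_ADDR_ERROR:  return "LoadAddressErrorHandler";
    case CAUSE_STORE_ADDR_ERROR: return "StoreAddressErrorHandler";
    case CAUSE_BREAKPOINT:       return "BreakpointHandler";
    case CAUSE_OVERFLOW:         return "ArithmeticOverflowHandler";
    default:                     return "UnhandledException";
    }
}
```

A switch on the cause value does the same job as the if/else chain, and a compiler will typically turn it into a jump table when the cases are dense enough.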
Trap Example: Opening File
• User calls: int open(filename, options)
• Function open executes the system call instruction via __libc_open
• OS must find the file and get it ready for reading or writing
• Returns an integer file descriptor as a handle to the user
[Figure: the user process executes the int instruction, which raises an exception into the OS; the OS opens the file and returns to the user process at the following pop instruction.]
Fault Example: Page Fault
• User attempts a write to a memory location
• That portion (page) of the user’s memory is currently on disk
• Page handler must load the page into physical memory
• Returns to the faulting instruction
• Successful on second try
int a[1000];
main()
{
    a[500] = 13;
}
[Figure: the store in the user process raises a page fault exception; the OS creates the page, loads it into memory and returns to the store instruction.]
Abort Example: Invalid Memory Reference
• Page handler detects an invalid address
• Sends the SIGSEGV signal to the user process
• User process exits with “segmentation fault”
int a[1000];
main()
{
    a[5000] = 13;
}
Suppose the address of a[5000] has not been mapped.
[Figure: the store in the user process raises a page fault exception; the OS detects the invalid address and signals the process.]
External Timers
• Programmable Interval Timer (PIT)
  Counts down from some value to zero and then triggers an interrupt
  The initial timer value is set by writing to a memory-mapped register
  It can be configured to trigger repeatedly in hardware, without the software ISR restarting it
Volatile Keyword Use
• An optimizing compiler decides that no one in the body of the program is changing foo.
• So it transforms the program into an infinite loop.
• But foo may be a memory-mapped I/O register or may be changed by an interrupt routine. It may change external to the program.
The volatile keyword tells the compiler that foo may change outside the program’s control, so every access to foo in the source must actually be performed. The compiler does not optimize these accesses away.
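A minimal C sketch of the situation, assuming a flag named foo as in the discussion. wait_for_foo and the spin limit are additions made here so that the example terminates when run on a host, where nothing external changes foo.

```c
#include <stdint.h>

/* Hypothetical flag; in a real program it would be set by an interrupt
 * routine or be a memory-mapped device register. */
volatile int foo = 0;

/* Spin until foo becomes nonzero or max_spins polls have been made.
 * Because foo is volatile, the compiler must re-read it on every
 * iteration instead of hoisting the load out of the loop. */
uint32_t wait_for_foo(uint32_t max_spins) {
    uint32_t spins = 0;
    while (foo == 0 && spins < max_spins) {
        spins++;
    }
    return spins;
}
```

Without the volatile qualifier, an optimizing compiler that sees no assignment to foo inside the loop is allowed to read it once and then never check it again.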
User Mode vs. System (aka Privileged or Kernel) Mode
• Operating system kernel executes in the privileged mode
  has unrestricted access to all system resources
  protects user programs from each other (e.g., memory protection)
  protects system against malicious use (all user access to system resources is via system calls)
• User programs run in user mode with controlled access to system resources via system calls
• Exception handling is done in system mode because unrestricted access is required
The Path Of I/O Transfer
• In both polled I/O and interrupt-driven I/O, the path for data transfer is through the processor registers
• For high-performance systems and high-bandwidth I/O peripherals, both techniques are inefficient
• Alternative: Direct Memory Access (DMA)
  removes the processor from the data transfer path
  a limited form of multiprocessing (the DMA controller is a specialized processor)
[Figure: with memory-mapped I/O, data moves between the I/O region and RAM over the common memory & I/O bus by passing through the processor registers via LOAD and STORE instructions.]
I/O Using DMA
• CPU sends device name, address, length and transfer direction to the DMA controller (via memory-mapped I/O)
• CPU issues a start command to the DMA controller
• DMA controller provides handshake signals to the I/O device and memory, including addresses
• DMA controller interrupts the processor when the transfer is complete
[Figure: the CPU, memory, the I/O peripherals (through their interfaces) and the DMA controller share the bus. The CPU controls the DMA controller through memory-mapped I/O; the DMA data transfer runs directly between the I/O devices and memory.]