Improving IPC by Kernel Design
By Jochen Liedtke
German National Research Center for Computer Science
Presented by Srinivas Sundaravaradan
MACH
µ-kernel system based on message passing
Over 5000 cycles to transfer a short message
Buffered IPC
L3
Similar to MACH
Hardware interrupts delivered through messages
No ports
Design Philosophy
Focus on IPC: Any feature that will increase cost must be closely evaluated. When in doubt, design in favor of IPC.
Design for Performance: A poorly performing technique is unacceptable. Evaluate feature cost compared to a concrete baseline. Aim for a concrete performance goal.
Comprehensive Design: Consider synergistic effects of all methods and techniques. Cover all levels of implementation, from design to code.
Making IPC faster
Fewer system calls and messages: Call / Reply & Receive Next, combining messages
Faster: 15 other optimizations
Architectural level: use the redesign of L3 as an opportunity to change the kernel design
Methodology
Theoretical minimum: a null message between address spaces, where the receiver is ready to receive it
107 cycles to enter and leave the kernel
45 cycles for TLB misses
172 cycles in total, including the remaining instructions on the path
Goal: 350 cycles
Achieved: 250 cycles = T
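The cycle budget on this slide can be checked with a few lines of arithmetic. This is only a sketch: the constants are the figures quoted above, and attributing the cycles not covered by the two listed items to the rest of the instruction path is an assumption, as are all variable names.

```python
# Cycle figures quoted on the slide.
KERNEL_ENTER_LEAVE = 107   # enter and leave the kernel
TLB_MISSES = 45            # TLB miss penalty
THEORETICAL_MIN = 172      # stated lower bound for a null message
GOAL = 350                 # design goal
ACHIEVED = 250             # measured result, called T on the slides

# Assumption: cycles not covered by the two listed items belong to the
# remaining instructions on the IPC path.
other_instructions = THEORETICAL_MIN - KERNEL_ENTER_LEAVE - TLB_MISSES


def ratio(a, b):
    """Return a / b rounded to two decimals."""
    return round(a / b, 2)


print(other_instructions)                 # 20
print(ratio(ACHIEVED, THEORETICAL_MIN))   # 1.45
print(ratio(GOAL, ACHIEVED))              # 1.4
```

Read this way, the achieved T sits at roughly 1.45 times the theoretical minimum and comfortably under the 350-cycle goal.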
Minimize system calls
Why minimize system calls? 60% of T
Traditional IPC: 4 system calls
Solution: Call, Reply & Receive Next
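Figure: Minimize system calls. Client and server timelines alternate between blocked and unblocked. Traditional IPC uses Send and Receive (reply) on the client, and Receive, Send (reply) and Receive (next) on the server; the combined interface uses Call on the client and Reply and Receive Next on the server.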
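The saving from the combined calls can be illustrated by counting kernel entries for one request/reply round trip. This is a toy model, not L3 code: the `Kernel` class and the call names are hypothetical, and the only thing modelled is how often the kernel is entered.

```python
class Kernel:
    """Toy kernel that only counts kernel entries (hypothetical model)."""

    def __init__(self):
        self.entries = 0

    def syscall(self, name):
        # Every system call costs one kernel entry and exit.
        self.entries += 1
        return name


def traditional_round_trip(kernel):
    """One request/reply exchange using four separate system calls."""
    kernel.syscall("send")             # client: send request
    kernel.syscall("receive(reply)")   # client: wait for the reply
    kernel.syscall("send(reply)")      # server: send the reply
    kernel.syscall("receive(next)")    # server: wait for the next request
    return kernel.entries


def combined_round_trip(kernel):
    """The same exchange using the two combined system calls."""
    kernel.syscall("call")                     # client: send + wait for reply
    kernel.syscall("reply_and_receive_next")   # server: reply + wait for next
    return kernel.entries


print(traditional_round_trip(Kernel()))  # 4
print(combined_round_trip(Kernel()))     # 2
```

Halving the number of kernel entries matters because entering and leaving the kernel is one of the largest fixed costs in the cycle budget.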
Complex Message
Direct String: data to be transferred directly from send buffer to receive buffer
Indirect String: location and size of data to be transferred by reference
Memory Object: description of a region of memory to be mapped into the receiver's address space (shared memory)
Figure: A complex message
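The three parts of a complex message can be sketched as a small data structure. The class and field names below are illustrative assumptions, not L3's actual message layout; the sketch only shows that direct and indirect strings are copied while memory objects are mapped.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndirectString:
    address: int   # location of the data in the sender's address space
    size: int      # number of bytes to transfer


@dataclass
class MemoryObject:
    base: int      # start of the region to map into the receiver
    size: int      # size of the region in bytes


@dataclass
class ComplexMessage:
    direct_string: bytes = b""
    indirect_strings: List[IndirectString] = field(default_factory=list)
    memory_objects: List[MemoryObject] = field(default_factory=list)

    def bytes_to_copy(self) -> int:
        """Bytes the kernel must copy; memory objects are mapped instead."""
        return len(self.direct_string) + sum(s.size for s in self.indirect_strings)


msg = ComplexMessage(
    direct_string=b"open",
    indirect_strings=[IndirectString(address=0x4000, size=512)],
    memory_objects=[MemoryObject(base=0x800000, size=4096)],
)
print(msg.bytes_to_copy())  # 516
```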
Ways of Message Transfer
Twofold message copy: user space A -> kernel space -> user space B
LRPC mechanism: shares user-level memory; is it secure? Does not support variable-to-variable transfer
Temporary Mapping
A two-copy message transfer costs 20 + 0.75n cycles
L3 copies the data once, through a special communication window in kernel space
The window is mapped to the receiver for the duration of the call (one page directory entry)
Figure: Temporary mapping (labels: kernel, copy, mapped with kernel-only permission, add mapping to space B)
Figure: Temporary mapping through the page tables (labels: top-level page table, 2nd-level tables, frames in memory)
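The temporary-mapping idea can be sketched as a small simulation. Everything here is a simplification under stated assumptions: an address space is modelled as a tiny page directory whose entries point at bytearrays, and `PAGE_DIR_SLOTS`, `WINDOW_SLOT` and `transfer` are hypothetical names. The point is only that sharing one page directory entry lets the data be copied once.

```python
PAGE_DIR_SLOTS = 8   # hypothetical, tiny page directory
WINDOW_SLOT = 7      # hypothetical slot reserved for the communication window


class AddressSpace:
    def __init__(self, name):
        self.name = name
        # Each page directory entry refers to a memory region (a bytearray).
        self.page_dir = [None] * PAGE_DIR_SLOTS


def transfer(sender, src_slot, receiver, dst_slot, length):
    """Copy `length` bytes once, through a temporarily mapped window."""
    copies = 0
    # Map: share one page directory entry of the receiver as the window.
    sender.page_dir[WINDOW_SLOT] = receiver.page_dir[dst_slot]
    # Single copy: writing to the window writes the receiver's memory.
    window = sender.page_dir[WINDOW_SLOT]
    window[:length] = sender.page_dir[src_slot][:length]
    copies += 1
    # Unmap: the window is only valid for the duration of the call.
    sender.page_dir[WINDOW_SLOT] = None
    return copies


a = AddressSpace("A")
b = AddressSpace("B")
a.page_dir[0] = bytearray(b"hello world")   # send buffer in space A
b.page_dir[2] = bytearray(16)               # receive buffer in space B

copies = transfer(a, 0, b, 2, 5)
print(copies)                     # 1
print(bytes(b.page_dir[2][:5]))   # b'hello'
```

A two-copy scheme would instead write the data into a kernel buffer and then copy it again into space B, doubling the per-byte work.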
Lazy Scheduling
Scheduler overhead is a significant component of IPC cost
Threads doing IPC are often moved to a wait queue only to be inserted back onto the ready queue
Lazy scheduling avoids locking of queues
Queue manipulation is avoided, saving instruction execution and TLB misses
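A minimal sketch of lazy scheduling, assuming the thread state lives in the thread control block and the ready queue is allowed to contain stale entries that are only cleaned up when the scheduler scans it. The class names and the `queue_ops` counter are illustrative, not taken from L3.

```python
from collections import deque


class Thread:
    def __init__(self, name):
        self.name = name
        self.ready = True   # state is kept in the thread control block


class LazyScheduler:
    def __init__(self, threads):
        self.ready_queue = deque(threads)
        self.queue_ops = 0   # counts insertions into and removals from the queue

    def block(self, thread):
        # Lazy: only flip the state, leave the thread in the ready queue.
        thread.ready = False

    def unblock(self, thread):
        thread.ready = True
        # Membership test is linear here; a real kernel would keep a flag.
        if thread not in self.ready_queue:
            self.ready_queue.append(thread)
            self.queue_ops += 1

    def pick_next(self):
        # Stale (blocked) entries are dropped only when the queue is scanned.
        while self.ready_queue:
            head = self.ready_queue[0]
            if head.ready:
                self.ready_queue.rotate(-1)
                return head
            self.ready_queue.popleft()
            self.queue_ops += 1
        return None


client, server = Thread("client"), Thread("server")
sched = LazyScheduler([client, server])

# A burst of IPC: the client blocks and is woken again many times.
for _ in range(1000):
    sched.block(client)
    sched.unblock(client)
print(sched.queue_ops)   # 0

sched.block(client)
print(sched.pick_next().name)   # server
print(sched.queue_ops)          # 1
```

The burst of 1000 block/unblock pairs touches the queue zero times; the single removal happens only when the scheduler finds the stale entry.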
Use registers for short messages
Messages are usually short: ack/error replies from drivers, hardware interrupt messages
Intel 486 processor: 7 general-purpose registers hold sender info and data
May not work for CPUs with fewer registers
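The register optimization amounts to a size check on the message. The sketch below assumes 32-bit registers and, purely for illustration, that two of the seven registers are left for payload after sender info and other parameters; the slide does not state that split, and `transfer_path` is a hypothetical helper.

```python
GENERAL_PURPOSE_REGISTERS = 7   # from the slide (Intel 486)
REGISTER_BYTES = 4              # the 486 has 32-bit registers
# Assumption: registers left for payload once sender info and other
# parameters are placed; the slide does not give this number.
DATA_REGISTERS = 2


def transfer_path(message_len):
    """Pick the transfer path for a message of `message_len` bytes."""
    if message_len <= DATA_REGISTERS * REGISTER_BYTES:
        return "registers"     # no memory copy needed
    return "memory copy"


print(transfer_path(8))    # registers
print(transfer_path(9))    # memory copy
```

On a CPU with fewer general-purpose registers, `DATA_REGISTERS` could drop to zero, which is the caveat noted on the slide.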
Summary of Optimizations
Architectural: System Calls, Messages, Direct Transfer, Strict Process Orientation, Thread Control Blocks
Algorithmic: Thread Identifier, Virtual Queues, Timeouts/Wakeups, Lazy Scheduling, Direct Process Switch, Short Messages
Interface: Unnecessary Copies, Parameter Passing
Coding: Cache Misses, TLB Misses, Segment Registers, General Registers, Jumps and Checks, Process Switch
Results
Conclusions
L3's message passing was 22 times faster than that of MACH
Kernel redesign focused mainly on IPC
Caveats: ports and buffering; specific to the architecture
Thank you!