Improving IPC by Kernel Design

By Jochen Liedtke, German National Research Center for Computer Science

Presented by Srinivas Sundaravaradan

MACH: µ-kernel system based on message passing; over 5000 cycles to transfer a short message; buffering IPC

L3: similar to MACH; hardware interrupts delivered through messages; no ports

Design Philosophy

Focus on IPC: any feature that will increase cost must be closely evaluated. When in doubt, design in favor of IPC.

Design for Performance: a poorly performing technique is unacceptable; evaluate feature cost against a concrete baseline; aim for a concrete performance goal.

Comprehensive Design: consider synergistic effects of all methods and techniques; cover all levels of implementation, from design to code.

Making IPC Faster

Fewer: Call / Reply & Receive Next; combining messages

Faster: 15 other optimizations

Architectural level: use the redesign of L3 as an opportunity to change the kernel design

Methodology

Theoretical minimum: a null message between address spaces, where the receiver is ready to receive it. 107 cycles to enter and leave the kernel, 45 cycles for TLB misses, 172 cycles in total.

Goal: 350 cycles. Achieved: 250 cycles = T
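The cycle budget on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the constant and function names are illustrative, not taken from the talk. The two itemized costs account for 152 of the 172 cycles, so 20 cycles of the stated minimum are not broken out on the slide.

```python
# Cycle counts as stated on the slide (Intel 486).
KERNEL_ENTER_LEAVE = 107   # entering and leaving the kernel
TLB_MISSES = 45            # TLB miss cost
THEORETICAL_MIN = 172      # stated minimum for a null message
GOAL = 350
ACHIEVED = 250             # this is T


def unitemized_cycles() -> int:
    """Cycles of the stated minimum not covered by the two itemized costs."""
    return THEORETICAL_MIN - KERNEL_ENTER_LEAVE - TLB_MISSES


def factor_over_minimum(cycles: int) -> float:
    """How many times the theoretical minimum a given cycle count is."""
    return round(cycles / THEORETICAL_MIN, 2)


if __name__ == "__main__":
    print(unitemized_cycles())
    print(factor_over_minimum(GOAL))
    print(factor_over_minimum(ACHIEVED))
```

Under these figures the goal allowed roughly twice the theoretical minimum, and the achieved T is about 1.45 times it.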

Minimize System Calls

Why minimize system calls? 60% of T

Traditional IPC: 4 system calls

Solution: Call; Reply & Receive Next

[Figure: client and server timelines, each alternating between blocked and unblocked. Separate operations: Send, Receive (reply), Send (reply), Receive (next), Receive. Combined operations: Call, Reply and receive next.]
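The saving from the combined primitives can be illustrated by counting kernel entries for one client/server round trip. This is a minimal sketch, not L3 code: `CountingKernel` and its method names are hypothetical, and each primitive does nothing except record one kernel entry.

```python
class CountingKernel:
    """Toy kernel that only counts how many times it is entered."""

    def __init__(self):
        self.entries = 0

    def _enter(self):
        self.entries += 1

    # Separate primitives: one kernel entry each.
    def send(self):
        self._enter()

    def receive(self):
        self._enter()

    # Combined primitives: still one kernel entry each.
    def call(self):
        """Client: send the request and wait for the reply."""
        self._enter()

    def reply_and_receive_next(self):
        """Server: send the reply and wait for the next request."""
        self._enter()


def traditional_round_trip(kernel):
    kernel.send()      # client: Send
    kernel.receive()   # client: Receive (reply)
    kernel.send()      # server: Send (reply)
    kernel.receive()   # server: Receive (next)
    return kernel.entries


def combined_round_trip(kernel):
    kernel.call()                    # client: Call
    kernel.reply_and_receive_next()  # server: Reply and receive next
    return kernel.entries


if __name__ == "__main__":
    print(traditional_round_trip(CountingKernel()))
    print(combined_round_trip(CountingKernel()))
```

In this model the round trip drops from four kernel entries to two, which matches the "4 system calls" figure given for traditional IPC.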

Complex Message

Direct String: data to be transferred directly from the send buffer to the receive buffer

Indirect String: location and size of data to be transferred by reference

Memory Object: description of a region of memory to be mapped into the receiver's address space (shared memory)

[Figure: A Complex Message]
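The three parts of a complex message can be pictured as a small data structure. The sketch below is an assumption-laden illustration: all class and field names are hypothetical, and it assumes that direct and indirect strings are copied by the kernel while memory objects are mapped rather than copied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndirectString:
    address: int   # location of the data in the sender's address space
    size: int      # number of bytes to transfer


@dataclass
class MemoryObject:
    base: int      # start of the region to map into the receiver
    size: int      # length of the region in bytes


@dataclass
class ComplexMessage:
    direct_string: bytes = b""
    indirect_strings: List[IndirectString] = field(default_factory=list)
    memory_objects: List[MemoryObject] = field(default_factory=list)

    def bytes_to_copy(self) -> int:
        """Bytes the kernel would have to copy (assumption: strings only)."""
        return len(self.direct_string) + sum(s.size for s in self.indirect_strings)

    def bytes_to_map(self) -> int:
        """Bytes shared by mapping instead of copying."""
        return sum(m.size for m in self.memory_objects)


if __name__ == "__main__":
    msg = ComplexMessage(
        direct_string=b"abcd",
        indirect_strings=[IndirectString(address=0x1000, size=16)],
        memory_objects=[MemoryObject(base=0x400000, size=4096)],
    )
    print(msg.bytes_to_copy(), msg.bytes_to_map())
```

Keeping all three parts in one message lets a single IPC carry what would otherwise need several separate transfers.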

Ways of Message Transfer

Twofold message copy: user space A -> kernel space -> user space B

LRPC mechanism: shares user-level memory; secure? Does not support variable-to-variable transfer

Temporary Mapping…

A two-copy message transfer costs 20 + 0.75n cycles

L3 copies data once to a special communication window in kernel space

The window is mapped to the receiver for the duration of the call (page directory entry)

[Figure: two address spaces, each with a kernel region. Labels: copy; mapped with kernel-only permission; add mapping to space B.]
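The temporary-mapping idea can be sketched with a toy model of top-level page-table entries. Everything here is an illustrative assumption: `AddressSpace`, `transfer`, `WINDOW_SLOT`, and the tiny `REGION_SIZE` are hypothetical, and Python dictionaries stand in for page directories. The point is only that the sender's window entry aliases the receiver's memory for the duration of the transfer, so the data lands in the receiver with a single copy.

```python
REGION_SIZE = 64      # toy size of the memory behind one top-level entry
WINDOW_SLOT = 1000    # hypothetical top-level index reserved for the window


class AddressSpace:
    def __init__(self):
        self.page_directory = {}   # top-level index -> backing memory

    def map_region(self, index):
        self.page_directory[index] = bytearray(REGION_SIZE)

    def write(self, index, offset, data):
        region = self.page_directory[index]
        if offset + len(data) > len(region):
            raise ValueError("write crosses the region boundary")
        region[offset:offset + len(data)] = data

    def read(self, index, offset, size):
        return bytes(self.page_directory[index][offset:offset + size])


def transfer(sender, src_index, src_off, receiver, dst_index, dst_off, size):
    """Copy size bytes from sender to receiver through a temporary window."""
    # Alias the receiver's region in the sender's window slot: both entries
    # now refer to the same backing memory.
    sender.page_directory[WINDOW_SLOT] = receiver.page_directory[dst_index]
    try:
        data = sender.read(src_index, src_off, size)
        sender.write(WINDOW_SLOT, dst_off, data)   # the single logical copy
    finally:
        # The mapping only lasts for the duration of the call.
        del sender.page_directory[WINDOW_SLOT]


if __name__ == "__main__":
    a, b = AddressSpace(), AddressSpace()
    a.map_region(0)
    b.map_region(3)
    a.write(0, 0, b"hello")
    transfer(a, 0, 0, b, 3, 8, 5)
    print(b.read(3, 8, 5))
```

Sharing one top-level entry, rather than remapping individual pages, is what keeps setting up the window cheap in this scheme.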

Temporary Mapping…

[Figure: two-level page table. Labels: top-level page table; 2nd-level tables; frames in memory.]

Temporary Mapping

Lazy Scheduling

Scheduler overhead is a significant component of IPC cost.

Threads doing IPC are often moved to a wait queue, only to be inserted back onto the ready queue.

Lazy scheduling: avoids locking of queues; queue manipulation is avoided (instruction execution, TLB misses)
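One way to picture lazy scheduling is a ready queue that is allowed to hold stale entries. In the sketch below, which is an illustration under assumptions rather than L3's implementation, blocking and unblocking during IPC only flip a flag in the thread control block; the queue is cleaned up later, when the scheduler walks it. The class names and the `queue_ops` counter are hypothetical.

```python
from collections import deque


class Thread:
    def __init__(self, name):
        self.name = name
        self.ready = True      # authoritative state, kept in the control block
        self.queued = False    # whether the thread has a ready-queue entry


class LazyScheduler:
    def __init__(self):
        self.ready_queue = deque()
        self.queue_ops = 0     # counts queue insertions and removals

    def add(self, thread):
        thread.ready = True
        if not thread.queued:
            self.ready_queue.append(thread)
            thread.queued = True
            self.queue_ops += 1

    def block(self, thread):
        # Waiting for IPC: update the state only; the queue entry goes stale.
        thread.ready = False

    def unblock(self, thread):
        # If the stale entry is still queued, no queue operation is needed.
        self.add(thread)

    def pick_next(self):
        # The scheduler removes stale entries only when it meets them.
        while self.ready_queue:
            head = self.ready_queue[0]
            if head.ready:
                self.ready_queue.rotate(-1)   # round-robin: move head to back
                return head
            self.ready_queue.popleft()
            head.queued = False
            self.queue_ops += 1
        return None


if __name__ == "__main__":
    sched = LazyScheduler()
    a, b = Thread("a"), Thread("b")
    sched.add(a)
    sched.add(b)
    for _ in range(100):       # 100 IPC round trips by thread a
        sched.block(a)
        sched.unblock(a)
    print(sched.queue_ops)
```

In this model the hundred block/unblock pairs cause no queue operations at all; the only cost is paid when the scheduler later discards an entry that is still stale.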

Use Registers for Short Messages

Messages are usually short: ack/error replies from drivers, hardware interrupt messages

Intel 486 processor: 7 general-purpose registers; sender info, data

May not work for CPUs with fewer registers

Summary of Optimizations

Architectural: System Calls, Messages, Direct Transfer, Strict Process Orientation, Thread Control Blocks

Algorithmic: Thread Identifier, Virtual Queues, Timeouts/Wakeups, Lazy Scheduling, Direct Process Switch, Short Messages

Interface: Unnecessary Copies, Parameter Passing

Coding: Cache Misses, TLB Misses, Segment Registers, General Registers, Jumps and Checks, Process Switch

Results…

Results

Conclusions

L3’s message passing was 22 times faster than that of MACH

Kernel redesign focused mainly on IPC

Caveats: ports and buffering; specific to the architecture

Thank You !