Make Hosts Ready for Gigabit Networks


Hardware Requirement

• To allow a host to fully utilize Gbps bandwidth, its hardware system must be ready for Gbps. For example:
  – CPU speed
    • Is a 100 MHz Pentium PC fast enough to process a large number of packets per second? (10 bits/Hz?)
  – Memory throughput
    • Is SDRAM’s sustained throughput large enough to move data in and out of it at Gbps?
  – I/O bus bandwidth
    • Is a 32-bit, 33 MHz PCI bus fast enough to move data at Gbps?
  – Network interface
    • Is the firmware on the NIC fast enough to process packets at Gbps?
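
A rough back-of-the-envelope check of the questions above can be sketched in Python. The numbers assume standard Ethernet framing (64-byte minimum and 1518-byte maximum frames, plus 20 bytes of preamble and inter-frame gap per frame) and the theoretical PCI peak rate; the function names are illustrative only and do not come from the slides.

```python
def pci_peak_bps(width_bits, clock_hz):
    """Theoretical peak bus bandwidth: one transfer of width_bits per clock."""
    return width_bits * clock_hz


def packets_per_second(link_bps, frame_bytes, overhead_bytes=20):
    """Frames per second on a saturated link.

    overhead_bytes = 8-byte preamble + 12-byte inter-frame gap
    (an assumption based on standard Ethernet framing).
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def cycles_per_packet(cpu_hz, pps):
    """CPU clock cycles available to handle each packet."""
    return cpu_hz / pps


if __name__ == "__main__":
    gbps = 1e9
    print(f"PCI 32-bit/33 MHz peak: {pci_peak_bps(32, 33e6) / 1e9:.3f} Gbps")
    for size in (64, 1518):
        pps = packets_per_second(gbps, size)
        print(f"{size:>4}-byte frames: {pps:,.0f} pkt/s, "
              f"{cycles_per_packet(100e6, pps):,.0f} cycles/pkt at 100 MHz")
```

Under these assumptions the PCI bus peaks at roughly 1.06 Gbps (shared by every device on the bus), and a 100 MHz CPU has only about 67 clock cycles per minimum-size frame.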


Software Design

• If a host’s hardware system can barely support Gbps bandwidth, its software system must be carefully designed so that an application can still achieve Gbps. For example:
  – NIC device driver in the OS
  – TCP/IP protocol stack in the OS
  – Routing table lookup in the OS
  – Buffer system in the OS
  – API between the OS and application programs
  – Networking services (e.g., NAT, firewall)
  – (Improving the design and implementation of software systems is the focus of our course.)


The Path of Moving Data

• What networking does is basically to move data from one networking application residing on one machine to a networking application residing on a different machine.

• The path of moving data is:
  – Application -> operating system -> network interface -> network -> network interface -> operating system -> application program.

• Therefore, to achieve Gbps, moving data between application and operating system, and between operating system and network interface, must be performed at least at Gbps.


The Cost of Moving Data

• The cost of moving data is very high.

– CPU speed has improved continuously and has now reached 2 GHz. However, the throughput and access speed of memory (e.g., SDRAM) remain about the same as they were a few years ago.

– Therefore, the CPU now needs to wait and waste more clock cycles to access a word in memory.

– The cost of moving data now becomes increasingly high and the memory becomes the performance bottleneck.

• Therefore, the goal is to minimize the need for moving data or hide the cost of moving data.


Hide Memory Access Cost (1)

• Scoreboarding processor
  – Instructions that load data into a register do not need to wait for the data to come back from memory, but rather mark the registers as awaiting data. (single stream)
  – The processor can then continue execution.
  – Only if an instruction accesses the register before the memory access has completed does the processor need to stop execution.


Hide Memory Access Cost (2)

• Super-scalar processor
  – Permits independent instructions to be executed in the same clock cycle (multiple instruction streams).
  – Therefore, an instruction that is loading data from memory can be executed in parallel with an instruction that does not need this data.

Both the scoreboarding and super-scalar methods help when reading a small amount of data. They are not very useful for reading a large amount of data. Therefore, the operating system should be designed to minimize the number of times that a large amount of data has to be copied.


Host Memory Hierarchy

Good cache performance depends on good locality.

However, networking code often violates the locality assumptions.

Example: when a packet arrives, it interrupts the execution of the processor. This forces the processor to load new instructions. Furthermore, because the data of the packet is not in the data cache, it needs to be fetched from memory.


The Problem with Layered Code

• Layering is a useful concept that enables network researchers to work in parallel on different aspects of a networking problem.

• However, an implementation of protocol stacks based on strict layering often results in bad performance.
  – Because the upper layer does not know which format the lower layer wants, the packets copied into the lower layer often need to be reformatted and recopied.
  – Nowadays, more and more implementations violate the layering concept for higher performance.
    • E.g., content-aware (URL) Web switching at an IP router.


Reduce Memory Copy Operations

• Currently, on a UNIX host, two data copy operations are needed to move data from an application to the network interface.
  – Application -> OS
  – OS -> network interface


One-Copy Techniques

• Virtual page remapping:
  – The first copy can be eliminated by using the virtual memory mechanism to map the pages used by the application to the pages used by the buffer in the OS.
  – The buffers in the application must start and end at a page boundary for this mechanism to work.

• Copy-on-write:
  – The first copy can also be saved by copy-on-write (COW).
  – If a packet needs to be copied from one domain to another domain, copy-on-write can be used to reduce or eliminate the copy operation.
  – The pages of the packet are copied only when the packet is modified. Otherwise, the same pages are used in different domains.
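
The copy-on-write idea can be illustrated with a toy Python class. This is only a sketch under simplifying assumptions: a real OS implements COW per page through the MMU, whereas this hypothetical `CowBuffer` treats the whole buffer as one unit and uses a shared counter to decide when the deferred copy is needed.

```python
class CowBuffer:
    """Toy copy-on-write buffer: domains share one bytearray until one writes."""

    def __init__(self, data):
        self._pages = bytearray(data)   # the shared "pages"
        self._refs = [1]                # shared reference count (boxed in a list)

    def share(self):
        """Hand the buffer to another domain without copying any bytes."""
        other = CowBuffer.__new__(CowBuffer)
        other._pages = self._pages
        other._refs = self._refs
        self._refs[0] += 1
        return other

    def read(self):
        return bytes(self._pages)

    def write(self, offset, data):
        """Copy the pages only if another domain still shares them."""
        if self._refs[0] > 1:
            self._refs[0] -= 1
            self._pages = bytearray(self._pages)  # the deferred copy happens here
            self._refs = [1]
        self._pages[offset:offset + len(data)] = data

    def shares_storage_with(self, other):
        return self._pages is other._pages


if __name__ == "__main__":
    app = CowBuffer(b"payload")
    kernel = app.share()                      # no copy yet
    print(app.shares_storage_with(kernel))    # storage is shared
    kernel.write(0, b"P")                     # first write triggers the copy
    print(app.read(), kernel.read())
```

As long as neither side writes, the "send" costs no copy at all; the copy is paid only on the first modification.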


One-Copy Techniques

• Memory-mapped buffer:
  – The second copy can be eliminated by mapping the memory on the NIC to a part of the system memory.
  – The OS can then use the mapped system memory for its buffer area.
  – Therefore, when the application’s data is copied to the buffer in the OS, it is effectively copied into the memory on the NIC.
  – (The PCI specification shows that if there is memory on a PCI card, we can map that memory to a part of the system memory.)


Zero-Copy Technique

• Memory-mapped buffer + virtual page remapping:
  – We can first map the NIC’s buffer to the buffer in the OS (PCI hardware map operation).
  – We can then map the buffer in the application to the buffer in the OS (OS software map operation).
  – The buffer in the application is then mapped to the NIC’s buffer.
  – This results in a zero-copy operation.

• Although zero-copy is good from the viewpoint of network performance, it is very difficult for the application to use.
  – The application now needs to know hardware details that should be hidden by the OS.
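
The chain of mappings described above can be mimicked in user space with Python's `mmap` and `memoryview`. This is only an analogy, not a driver: an anonymous memory mapping stands in for the NIC's on-board memory, and the two views stand in for the OS buffer and the application buffer. The function name is hypothetical.

```python
import mmap


def zero_copy_demo():
    # Stand-in for NIC memory mapped into the system address space
    # (assumption: an anonymous mapping plays the role of the device buffer).
    nic_buffer = mmap.mmap(-1, mmap.PAGESIZE)

    # "OS buffer": a view onto the mapped region, not a copy of it.
    os_buffer = memoryview(nic_buffer)

    # "Application buffer": a further view onto part of the OS buffer.
    app_buffer = os_buffer[0:16]

    # The application fills its buffer; no bytes are copied between layers.
    app_buffer[:5] = b"hello"

    # The data is already visible in the "NIC" memory.
    seen_by_nic = nic_buffer[:5]

    # Views must be released before the mapping can be closed.
    app_buffer.release()
    os_buffer.release()
    nic_buffer.close()
    return seen_by_nic


if __name__ == "__main__":
    print(zero_copy_demo())
```

All three names refer to the same bytes, so a write through the application's view is a write to the device memory in this model.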


DMA Technique

• To avoid the data copy operation between the OS and the NIC, we can use DMA instead of the normal programmed I/O (PIO).

• Using DMA, a NIC can transfer data directly from/to memory without involving the CPU. This enables the CPU to execute in parallel with the data transfer. (However, the CPU may still be stalled.)

• Generally, DMA performs better than PIO. However, there are some situations where PIO is preferred (e.g., when computing checksums).

• Scatter-gather capability in a DMA-based NIC is important because it can avoid data copy operations.
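
The benefit of gathering can be shown with a user-space analogy. The sketch below uses the POSIX-only `os.writev` call to hand a header and a payload to the kernel as two separate buffers, instead of first copying them into one contiguous packet. A scatter-gather NIC does the equivalent with a list of DMA descriptors; the helper names here are hypothetical.

```python
import os


def send_gathered(fd, header, payload):
    """Pass header and payload as separate buffers in a single call."""
    return os.writev(fd, [header, payload])


def send_with_copy(fd, header, payload):
    """The alternative: build one contiguous packet first (an extra copy)."""
    packet = header + payload
    return os.write(fd, packet)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    sent = send_gathered(write_fd, b"HDR", b"payload")
    print(sent, os.read(read_fd, 64))
    os.close(read_fd)
    os.close(write_fd)
```

The receiver sees one contiguous byte stream in both cases; only the sender avoids assembling the packet in a temporary buffer.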


Buffer Editing

• To support Gbps, the design of a buffer system should allow buffers to be created, clipped, shared, split, concatenated, and destroyed with little overhead.

• Otherwise, a packet may need to be copied to a new buffer again and again while traversing the layers of a protocol stack.
  – E.g., as a packet goes down/up a protocol stack when it is sent/received, more and more headers need to be prepended/stripped to/from it.

• Generally, lists or tree structures are used as the data structure to easily support the above operations.
  – E.g., the mbuf used in BSD UNIX.
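
A list-based buffer that supports these editing operations without moving payload bytes might look like the following sketch. It is loosely inspired by the mbuf idea but is not the BSD data structure; the class and method names are assumptions made for illustration.

```python
class PacketChain:
    """A packet stored as a chain of buffer segments (views, not copies)."""

    def __init__(self, payload=None):
        self.segments = [] if payload is None else [memoryview(payload)]

    def prepend(self, header):
        """Add a header by linking a new segment in front; payload untouched."""
        self.segments.insert(0, memoryview(header))

    def strip(self, nbytes):
        """Remove nbytes from the front by adjusting views, not moving data."""
        while nbytes > 0 and self.segments:
            first = self.segments[0]
            if nbytes >= len(first):
                nbytes -= len(first)
                self.segments.pop(0)
            else:
                self.segments[0] = first[nbytes:]
                nbytes = 0

    def split(self, offset):
        """Split into two chains at a byte offset; both share the same storage."""
        head, tail = PacketChain(), PacketChain()
        for seg in self.segments:
            if offset >= len(seg):
                head.segments.append(seg)
                offset -= len(seg)
            elif offset > 0:
                head.segments.append(seg[:offset])
                tail.segments.append(seg[offset:])
                offset = 0
            else:
                tail.segments.append(seg)
        return head, tail

    def __len__(self):
        return sum(len(seg) for seg in self.segments)

    def tobytes(self):
        """Flatten the chain (this does copy; used only for inspection)."""
        return b"".join(seg.tobytes() for seg in self.segments)


if __name__ == "__main__":
    packet = PacketChain(bytearray(b"DATA"))
    packet.prepend(b"TCP.")   # going down the stack
    packet.prepend(b"IP..")
    print(packet.tobytes())
    packet.strip(4)           # going up the stack
    print(packet.tobytes())
```

Prepending and stripping headers only touch the segment list, so their cost is independent of the payload size.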


API Design

• The design of the application program interface (API) can significantly affect the performance of passing data between the user application and the OS.

• Currently, the read() and write() system calls provided on UNIX allow the user to choose a buffer with arbitrary address, size, and alignment, and give unconstrained access to that buffer.
  – This makes it difficult for the OS to avoid the data copy operation between the application and the OS.

• Suppose that, instead, UNIX required that the buffer start and end at a page boundary and that its length be a multiple of the page size. Then the copy-on-write technique could be used to avoid one copy operation.
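
The alignment constraint described above amounts to a simple check. The sketch below is an illustration only: `can_remap` and `round_up_to_pages` are hypothetical helpers, and the page size is taken from Python's `mmap.PAGESIZE` unless passed explicitly.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE  # page size of the machine running this sketch


def can_remap(address, length, page_size=PAGE_SIZE):
    """True if the buffer starts on a page boundary and spans whole pages."""
    return length > 0 and address % page_size == 0 and length % page_size == 0


def round_up_to_pages(length, page_size=PAGE_SIZE):
    """Smallest multiple of page_size that can hold `length` bytes."""
    return -(-length // page_size) * page_size


if __name__ == "__main__":
    print(can_remap(0x10000, 8192, 4096))   # aligned start, two whole pages
    print(can_remap(0x10010, 4096, 4096))   # misaligned start
    print(round_up_to_pages(1500, 4096))    # one page holds a 1500-byte packet
```

An API with this constraint trades some wasted memory (a 1500-byte packet occupies a whole page) for the chance to avoid a copy.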


Data Manipulation

• Data manipulations are computations that inspect and possibly modify every word of data in a network packet.
  – E.g., encryption, compression, checksumming, presentation conversion, etc.

• Typically, different network layers manipulate data independently from each other.

• Each data manipulation requires the CPU to load potentially un-cached data from memory and store the inspected/modified data to memory.

• Therefore, the same data must cross the CPU/memory data path multiple times, which lowers the maximum achievable throughput.


Integrated Layer Processing

• The integrated layer processing (ILP) technique can be used to minimize the number of data transfers.

• The data manipulation steps from different protocol layers are combined into a pipeline.

• A word of data is loaded into a register, then manipulated by multiple manipulation layers while it remains in a register, then finally stored, all before the next word of data is processed.

• In this way, a combined series of data manipulations transfers data from memory to the CPU and back only once, instead of once per distinct layer.

The difficulty is that different manipulations cannot be easily integrated.
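
The contrast between layered and integrated processing can be sketched with two toy manipulations: an XOR with a fixed key (a stand-in for encryption, chosen here only for brevity) and a 16-bit ones'-complement-style sum. Both the manipulations and the function names are assumptions for illustration.

```python
def checksum16(words):
    """16-bit sum with end-around carry over a sequence of 16-bit words."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)
    return total


def layered(words, key):
    """Two separate passes: each layer walks the whole packet."""
    encrypted = [w ^ key for w in words]       # pass 1: toy "encryption" layer
    return encrypted, checksum16(encrypted)    # pass 2: checksum layer


def integrated(words, key):
    """One pass: each word is loaded once, transformed, summed, then stored."""
    out = []
    total = 0
    for w in words:
        w ^= key                                   # manipulation 1
        total += w                                 # manipulation 2
        total = (total & 0xFFFF) + (total >> 16)
        out.append(w)                              # single store
    return out, total


if __name__ == "__main__":
    packet = [0x1234, 0xFFFF, 0x0001, 0xABCD]
    print(layered(packet, 0x5A5A) == integrated(packet, 0x5A5A))
```

Both functions produce the same result; the integrated version touches each word once. Python does not expose registers or caches, so this shows only the loop structure, not the performance gain.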


Copy-Avoiding Techniques Relationship


NIC to NIC Transfers

• So far, we have discussed how to reduce the number of copy operations required for sending data from the user application, through the OS, to the NIC.

• Here, we discuss methods that can reduce the number of copy operations required for forwarding data from one NIC to another NIC (i.e., when the system functions as a routing or switching device).


Techniques for NIC-to-NIC (1)

• Hardware streaming (peer-to-peer)
  – The maximum achievable forwarding throughput is the I/O bus bandwidth.
  – The problems are that special hardware is required and the OS has no chance to inspect/modify packets.
    • As a result, some processing (e.g., routing table lookup) needs to be performed by the CPU on the NICs.
    • However, for economic reasons, the CPUs on NICs are often much slower than the CPU on the system.

• DMA-DMA streaming
  – The maximum achievable throughput is only ½ of the I/O bandwidth.
  – However, packets can be inspected/modified by the OS.
    • E.g., routing table lookup


Techniques for NIC-to-NIC (2)

• OS kernel streaming
  – Packets are first DMA’ed into memory.
  – Packets are then read from memory to the CPU for inspection or modification.
  – Depending on the number of inspections/modifications and the memory system read/write throughput, the achievable maximum forwarding throughput is further limited by (memory throughput / number of reads or writes).

• User-level streaming
  – In some applications, packets may need to go up to the user level for inspection/modification.
    • E.g., a Web proxy system, an email relay system, NATD
  – The throughput will be further limited.
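
The throughput bounds for the forwarding techniques can be written down as small functions. This is a sketch of the limits stated on the slides; the parameter `accesses` (how many times each packet is read or written in memory) and the example figures are assumptions, not measurements.

```python
def hardware_streaming(io_bus_bps):
    """Peer-to-peer: each packet crosses the I/O bus once."""
    return io_bus_bps


def dma_dma_streaming(io_bus_bps):
    """Each packet crosses the I/O bus twice (NIC -> memory -> NIC)."""
    return io_bus_bps / 2


def kernel_streaming(io_bus_bps, memory_bps, accesses):
    """DMA-DMA bound, further limited by memory throughput / accesses."""
    return min(dma_dma_streaming(io_bus_bps), memory_bps / accesses)


if __name__ == "__main__":
    io_bus = 1.056e9   # 32-bit, 33 MHz PCI peak
    memory = 3.2e9     # assumed sustained memory throughput
    for accesses in (4, 8, 16):
        bound = kernel_streaming(io_bus, memory, accesses)
        print(f"{accesses:>2} memory accesses per packet: {bound / 1e9:.3f} Gbps")
```

With these example figures, the I/O bus is the bottleneck while packets are touched only a few times, and memory becomes the bottleneck as the number of inspections or modifications grows.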


Latency of Small Packets

• For large packets, we care about the cost of copying them (i.e., transfer throughput).

• For small packets, however, what we care about is the latency of their transmission.

• The following three interactions between the processor and memory can affect latency:
  – Branch misses
  – Context switching
  – Interrupts


Branch Misses

– To keep the instruction pipeline full (some processors have up to 13 stages), most processors today fetch instructions continuously.

– Conditional branches, however, present a problem because the target instruction cannot be determined until the condition result has been computed.

– If the CPU waits for the completion of the condition test before fetching the next instruction, the pipeline cannot be kept full most of the time. This results in low CPU utilization.

– To solve this problem, most processors today try to predict the next instruction to execute.

– If the guess is wrong, the instructions that have already been fetched need to be abandoned. This also results in low CPU utilization.

Do not put too many if-then-else statements in your networking code.
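
One common way to reduce chains of conditionals is table-driven dispatch, sketched below for demultiplexing on the IP protocol number (1 = ICMP, 6 = TCP, 17 = UDP). The handler names are hypothetical. Python cannot show the pipeline effect itself, and on real hardware the indirect call through the table must also be predicted, so treat this as a structural illustration only.

```python
def handle_icmp(packet):
    return "icmp"


def handle_tcp(packet):
    return "tcp"


def handle_udp(packet):
    return "udp"


def handle_unknown(packet):
    return "drop"


def dispatch_branchy(proto, packet):
    """A chain of conditionals: one more test for every protocol added."""
    if proto == 1:
        return handle_icmp(packet)
    elif proto == 6:
        return handle_tcp(packet)
    elif proto == 17:
        return handle_udp(packet)
    else:
        return handle_unknown(packet)


# One entry per possible 8-bit protocol number.
HANDLERS = [handle_unknown] * 256
HANDLERS[1] = handle_icmp
HANDLERS[6] = handle_tcp
HANDLERS[17] = handle_udp


def dispatch_table(proto, packet):
    """A single indexed lookup, regardless of how many protocols exist."""
    return HANDLERS[proto & 0xFF](packet)


if __name__ == "__main__":
    print(dispatch_table(6, b""), dispatch_table(99, b""))
```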


Context Switches

– Context switches are very expensive because they require both new code and new data to be fetched from the slow memory and loaded into the processor cache.

– In a perfect system, no more than one context switch should be needed to send a packet, and one context switch plus an interrupt to receive a packet.

– In a micro-kernel OS, sending and receiving a packet need more context switches than in a traditional UNIX kernel (because the packet needs to traverse the application program, network server, and micro-kernel domains).

For a high-speed system, macros are preferred over function calls. Function calls are preferred over threads (which need to save their PC and stack) to process an incoming packet.


Interrupts

• Interrupts are very expensive.
  – They cause context switches, which in turn cause a lot of code and data cache misses.
  – Sometimes the processor must switch from user mode to privileged mode when an interrupt occurs. Changing mode, however, is a very slow operation.

• One solution is to minimize the number of interrupts.
  – Do not issue a receive interrupt for every incoming packet. Issue an interrupt only when a certain number of packets have been received or a timer has expired.
  – When a receive interrupt occurs, the device driver retrieves and processes as many packets from the NIC as possible.
  – Do not issue a transmit interrupt for every sent packet. Issue a transmit interrupt only after a certain number of packets have been sent or a timer has expired.
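
The "certain number of packets or a timer" policy can be explored with a small simulation. This sketch assumes sorted arrival timestamps in arbitrary time units and a timer that starts when the first pending packet arrives; the function and parameter names are illustrative, not taken from any driver.

```python
def count_interrupts(arrival_times, max_packets, max_delay):
    """Count receive interrupts under a packets-or-timer coalescing policy.

    An interrupt fires when max_packets packets are pending, or when
    max_delay time units have passed since the first pending packet.
    arrival_times must be sorted in increasing order.
    """
    interrupts = 0
    pending = 0
    first_pending = None
    for t in arrival_times:
        # The timer expired before this packet arrived: flush the batch.
        if pending and t - first_pending >= max_delay:
            interrupts += 1
            pending = 0
        if pending == 0:
            first_pending = t
        pending += 1
        # The batch is full: interrupt immediately.
        if pending >= max_packets:
            interrupts += 1
            pending = 0
    if pending:
        interrupts += 1  # the timer eventually fires for the leftovers
    return interrupts


if __name__ == "__main__":
    arrivals = list(range(100))  # one packet per time unit
    print(count_interrupts(arrivals, max_packets=1, max_delay=10))
    print(count_interrupts(arrivals, max_packets=8, max_delay=1000))
```

With one interrupt per packet the 100 arrivals cost 100 interrupts; batching by 8 reduces that to 13, at the price of added latency for the packets that wait in the batch.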


Receive Livelock Problem

• Can happen in an interrupt-driven kernel.

• This problem happens when packets arrive at the system at high rates.

• When this problem occurs, the system spends all of its time processing interrupts, to the exclusion of other necessary tasks.

• The result is that, under extreme conditions, no packets can be delivered to the user application or the output of the system.

• To avoid this problem, tasks and interrupts must be carefully scheduled.


Techniques to Avoid Livelock

• Limit the interrupt arrival rate
  – For example, when the ipintr queue is about to become full and packets are about to be dropped, we can temporarily disable interrupts. Interrupts can be re-enabled when the buffer occupancy of the ipintr queue drops below a certain threshold.

• Use of polling
  – Poll each NIC at a fixed rate. This can limit the packet processing rate and also provide fair resource allocation between multiple interfaces.

• Avoiding preemption
  – Let higher-level protocol processing (e.g., TCP/IP) be executed at the same level as that used by an interrupt service routine.
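
The first technique, disabling interrupts when the input queue is nearly full and re-enabling them once it has drained, can be modelled as a queue with high and low watermarks. This is a simplified simulation, not kernel code: the class name, the thresholds, and the choice to re-enable at or below the low watermark are all assumptions.

```python
from collections import deque


class InputQueue:
    """Toy model of an input queue that gates receive interrupts."""

    def __init__(self, high, low):
        assert 0 <= low < high
        self.queue = deque()
        self.high = high                  # disable interrupts at this occupancy
        self.low = low                    # re-enable at or below this occupancy
        self.interrupts_enabled = True

    def on_receive_interrupt(self, packet):
        """Called by the simulated driver for each arriving packet."""
        if not self.interrupts_enabled:
            return False                  # ignored: costs no CPU time in this model
        self.queue.append(packet)
        if len(self.queue) >= self.high:
            self.interrupts_enabled = False
        return True

    def process_one(self):
        """Higher-level protocol processing drains one packet from the queue."""
        if not self.queue:
            return None
        packet = self.queue.popleft()
        if not self.interrupts_enabled and len(self.queue) <= self.low:
            self.interrupts_enabled = True
        return packet


if __name__ == "__main__":
    q = InputQueue(high=4, low=1)
    print([q.on_receive_interrupt(i) for i in range(6)])
    for _ in range(3):
        q.process_one()
    print(q.interrupts_enabled)
```

Once the queue reaches the high watermark, further arrivals no longer steal CPU time from protocol processing, which is what allows the queue to drain and packets to be delivered.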