Flexible Packet Filtering

8/6/2019 Flexible Packet Filtering

1/12

Flexible Packet Filtering: Providing a Rich Toolbox

Kurt J. Lidl Zero Millimeter LLC

Potomac, [email protected]

Deborah G. LidlWind River Systems

Potomac, [email protected]

Paul R. BormanWind River Systems

Mendota Heights, MN [email protected]

Abstract

The BSD/OS IPFW packet ltering system is a well engineered, exible kernel framework for ltering (accepting,rejecting, logging, or modifying) IP packets. IPFW uses the well understood, widely available Berkeley Packet Filter(BPF) system as the basis of its packet matching abilities, and extends BPF in several straightforward areas. Since therst implementation of IPFW, the system has been enhanced several times to support additional functions, such as rateltering, network address translation (NAT), and trafc ow monitoring. This paper examines the motivation behindIPFW and the design of the system. Comparisons with some contemporary packet ltering systems are provided.

Potential future enhancements for the IPFW system are discussed.

1 Packet Filtering: An Overview

Packet ltering and packet capture have a long historyon computers running UNIX and UNIX-like operatingsystems. Some of the earliest work on packet captureon UNIX was the CMU/Stanford Packet Filter [CSPF].Other early work in this area is the SunNIT [NIT] deviceinterface. A more modern, completely programmableinterface for packet capture, the Berkeley Packet Filter(BPF), was described by Steve McCanne and Van Ja-cobson [BPF]. BPF allows network trafc to be cap-

tured at a network interface, and the packets classiedand matched via a machine independent assembly pro-gram that is interpreted inside the kernel.

1.1 BPF: An Overview

BPF is extremely exible, machine independent, rea-sonably high speed, well understood, and widely avail-able on UNIX operating systems. BPF is an interpreted,portable machine language designed around a RISC-likeLOAD/STORE instruction set architecture that can beefciently implemented on modern computers.

BPF only taps network trafc in the network interfacedriver. One important feature of BPF is that only pack-ets that are matched by the BPF program are copied intoa new buffer for copying into user space. No copy of the packet data needs to be made just to run the BPFprogram. BPF also allows the program to only copyenough of a packet to satisfy its needs without wastingtime copying unneeded data. For example, 134 bytes issufcient to capture the complete Ethernet, IP, and TCPheaders, so a program interested only in TCP statistics

might choose to copy only this data.

A packet must be parsed to determine if it matches agiven set of criteria. There are multiple ways of doingthis parsing, but a great deal of it amounts to lookingat a combination of bits at each network layer, beforethe examination of the next layer of the packet. Thereare multiple data structures designed for efcient repre-sentation of the parsing rules needed to classify packets.BPF uses a controlow graph (CFG) to represent the cri-teria used to parse a packet. The CFG is translated into

a BPF machine language program that efciently prunespaths of the CFG that do not need to be examined duringthe parsing of a packet.

Ultimately, a standard BPF program decides whethera packet is matched by the program. If a packet ismatched by the program, the program copies the spec-ied amount of data into a buffer, for return to the userprogram. Whether or not the packet was matched, thepacket continues on its normal path once the BPF pro-gram nishes parsing the packet.

BPF also has a limited facility for sending packetsout network interfaces. BPF programs using this fa-

cility must bind directly to a particular network inter-face, which requires that the program know what inter-faces exist on the computer. This allows for sending anytype of network packets directly out an interface, with-out regard to the kernels routing table. This is how therarpd and dhcpd daemons work on many types of UNIX computers.

BPF, as originally described, does not have a facil-


2/12

ity for rejecting packets that have been received. BPF,although described as a lter , can match packets, copythem into other memory, and send packets, but it cannotdrop or reject them.

2 Motivation

The need for a powerful and exible packet matchingand ltering language had been evident for a long time.The basic ideas for the BSD/OS IPFW system were theresult of several years of thought about whatfeatures andfunctions a packet ltering systemmust provide. Havinghighly exible packet ltering for an end system wouldbe mandatory, and that same ltering system should beapplicable for ltering trafc that was being forwardedthrough a computer.

The immediate need for a exible packet lteringframework came from a desire to run an IRC client ina rigidly controlled environment. This environment con-sisted of a daemon that could be run in a chrootd direc-tory structure, as well as a highly restrictive set of packetlters. These lters could not just prevent unwantedinbound packets, but perhaps more importantly, couldalso discard unwanted outbound packets. The BSD/OSIPFW system was thus originally intended to lter boththe inbound and outbound trafc for a particular host.

Many of the most popular contemporary packet l-tering systems of the initial design era (circa 1995)were incapable of ltering packets destined for the lo-cal computer or originating from the local computer.The available ltering systems concentrated on lteringtrafc that was being forwarded through the computer.Other major problems with the existing packet lter-ing systems were the inability to do signicant statefulpacket forwarding and unacceptably low performance[screend]. screend does keep track of IP fragments,which is a limited form of stateful packet ltering.

Further motivation for a exible packet ltering sys-tem was the lack of any other standard packet lteringin the stock BSD/OS system of that era. There wascustomer demand for a bundled packet ltering system,which was not fully met by the other widely availablepacket ltering systems [iplter]. screend , the most

widely available contemporary packet ltering system,provided many good lessons in packet ltering technol-ogy. The BSD/OS IPFW system was designed with thelessons learned from screend in mind.

A consideration for the implementation of a new,exible packet ltering framework was the realizationthat as the Internet grew, the number of attacks fromother locations on the Internet would also grow. Hav-ing a powerful matching language tied to the ltering

capabilities would allow for a single BSD/OS computeracting as a router to protect any other computers behindthe lter.

3 Need for Flexibility

An early design decision for IPFW was that the sys-

tem should present as exible a matching and lteringframework as possible. As few ltering rules as possi-ble should be directly embedded in the kernel. As muchas possible, ltering conguration and policy should beinstalled into the kernel at runtime, rather than compiledinto the system. This decision has reaped many bene-ts during the lifetime of this system. Because IPFWis extremely exible, it has been applied to many prob-lems that were not in mind at the time it was designed.To borrow Robert Scheiers quote about the X WindowSystem Protocol, IPFW is intended to provide mecha-nism, not policy. [RFC1013]

4 Other Packet Filters

As was noted earlier, there were several other packetltering technologies when IPFW was rst envisioned.In the years since, other ltering technologies have beendeveloped, some specic to a particular operating sys-tem and others available on a variety of platforms. Acomparative analysis with these other packet lters al-lows one to more fully appreciate the exibility of IPFW.

One of the most important differences between IPFWand these other ltering systems is that IPFW actuallydownloads complete programs to be evaluated against

the packets. The other ltering systems are all rules-based. By evaluating an arbitrary program, an entirelynew methodology of packet ltering can be installedwithout rebooting the system. In a rules based system,any new type of rules requires code changes to the l-tering system, as well as a reboot to make it active. Dy-namic loadable kernel modules can approximate the pro-gram download facility as modules could be replacedwith new ltering rule capabilities without requiring asystem reboot.

4.1 Darren Reeds iplter

The iplter package is available on many versions of many UNIX-like operating systems, from BSD/OS toolder systems such as IRIX to small-footprint systemslike QNX to frequently updated systems like FreeBSD.It supports packet ltering, provides a Network Ad-dress Translation (NAT) implementation, and can per-form stateful packet ltering via an internal state table.Like the other examined packet lters, iplter is a rulesbased system. It can log packet contents to the pseudo-device ipl. [iplter] [iplterhowto]


3/12

4.2 FreeBSDs iprewall System

FreeBSD provides a packet ltering interface, knownas iprewall. This system is often referred to as ipfw ,which is the name of the management command.

Thisis a rules based packet ltering mechanism, which ismanipulated internally by socket options. There is anadditional kernel option ( IPDIVERT ) to add kernel di-vert sockets, which can intercept all trafc destined fora particular port, regardless of the destination IP addressinside the packet. The divert socket can intercept eitherincoming or outgoing packets. Incoming packets can bediverted after reception on an interface or before next-hop routing. [FreeBSD]

4.3 Linux 2.2: ipchains

The Linux ipchains implementation provides threedifferent services: packet ltering, NAT (called mas-querading), and transparent proxying. The packet l-

tering capabilities are based on having chains of rules,which are loaded at three different ltering locations: in-put, forward and output. Each chain location can havemultiple rules appended, inserted or deleted from thatlocation. The rules are relatively simple and allow forchaining to another named rule if a particular criteria ismatched. Arbitrary data inspection of packets is not per-mitted. [ipchains]

4.4 Linux 2.4: iptables

The Linux iptables implementation (sometimes re-ferred to as netlter) is a complete rewrite and exten-

sion of the ipchains ltering system. Substantial cleanupand xing of multiple idiosyncrasies in handling howpackets destined for the local computer are processedhave been made. Support for stateful packet ltering hasalso been added to the system. The command line syn-tax for specifying packet headers for each rule has beenchanged since the ipchains release. A QUEUE disposi-tion for a packet has been added, which species thatthe packet will be transferred to a user process for addi-tional processing, using an experimental kernel module,ip queue . [iptables]

4.5 OpenBSDs pf System

OpenBSD 3.0 includes pf, a packet lter pseudo-device. As a rules-based lter, users are restricted tothe available set of rules included with pf. Manipu-lation of the pf pseudo-device is managed through thepfctl command. Internally, the system is controlledby ioctl calls to the pf device. Rules can be appliedon an in or out basis, and can be tied to a specic in-terface as well. As a very new packet ltering mecha-nism (it was written from scratch, starting in June 2001)

it does not have an established track record, and is stillundergoing change. [OpenBSD]

4.6 TIS Firewall Toolkit

The TIS Firewall Toolkit (fwtk) and other proxy re-walls not only examine the source and destination of

packets, but also the protocol being sent. New applica-tion proxies that understand the protocol must be writtenfor each new type of service. While this approach doesallow for additional levels of security as the proxy canwatch for attack methods that exploit a particular proto-col, it requires a much deeper understanding of each newprotocol before ltering that type of trafc. [fwtk]

5 Design Elements

Several elements of the overall design and implemen-tation of the BSD/OS IPFW system are worth a de-tailed examination. Some of the more interesting design

choices are discussed below.

5.1 BPF Packet Matching Technology

Because of the many ne matching properties of BPFsystem, as noted in Section 1, it was selected as the coretechnology for packet matching and classication in theBSD/OS IPFW system.

5.2 Download Filter Programs into Kernel

The concept of downloading lters into the kernelwas not a novel idea. The IPFW author was familiar

with a few obscure packet lter technologies that hadthe lter coded directly into the network stack.

Whilehighly inexible in operation, this type of lter systemdid make an attacker work harder when attempting tosubvert or weaken an installed lter. The marginal se-curity benet of a lter compiled into the kernel wasdwarfed by the numerous advantages of a downloadablepacket lter. Early versions of IPFW had the abilityto both password protect lters as well as make down-loaded lters immutable. Both of these features wereeventually dropped as the additional security providedonly came into effect once the computer running thelter was compromised. Once the computer has been

compromised to that extent, the added security was notconsidered to be valuable enough to warrant the costs of maintaining the implementation.

5.3 IPFW Kernel Socket

Prudent reuse of kernel facilities is always a goalwhen designing a new subsystem for the UNIX kernel.The BSD/OS IPFW system needed a method for trans-mitting data about packets and lter programs from the


4/12

kernel to programs running in userspace. In some otherhistoric packet lters, this would have been done viathe ioctl system call, which requires some articialle to open. Adding a new system call for this purposemight be justied, but every new system call is generallyviewed with suspicion.

Instead of adding a new system call, a new instantia-tion of a kernel socket was made. A new pseudo-IP pro-tocol was dened, which is accessed via a raw internetdomain socket. Because sockets were dened to providean efcient mechanism of moving streams or packets of data to and from the kernel, they are appropriate for thetask of moving data about packet ltering to a user ap-plication. In the case of the IPFW socket, the data isalways generated by the kernel.

The raw IPFW socket provides important function-ality in a standard interface with which programmersare familiar. The socket interface also provides for zero

(or many) readers of the data. An IPFW lter can sendpackets or data about packets it has matched back to auserspace program, regardless of the nal disposition of the packet. The userspace program may then log thepacket, or it might further process a rejected packet, in-cluding re-insertion of a possibly modied packet back into the network via a raw IP socket.

High precision timestamps, in the form of a time-spec structure, are available on packets read from thekernel socket, if the user has requested them. This times-tamp is added during the logging operation, so the userapplication does not have to worryabout getting an accu-rate timestamp when it reads the packets from the socket.

IPFW uses the sysctl system call to pass informa-tion about lter programs back and forth between thekernel and userspace programs. sysctl is used to copythe lter programs into the kernel as they are installed.It is also used for gathering statistics about the IPFW l-ters installed on the computer. The sysctl interface isanother example of a exible programming paradigm. Itprovided a natural expression of hierarchy that was eas-ily expandable, did not require articial les to open andreused an existing kernel interface.

5.4 Multiple Filtering Points

One of the unique features of the BSD/OS IPFW sys-tem at the time it was designed was the inclusion of mul-tiple ltering points in the kernel. The original BPF sys-tem only allowed for tapping of packet trafc at eachphysical interface. The BSD/OS IPFW system providesve logical points where lters may be installed in thekernel.

Table 1 lists each ltering location in the kernel. Each

Location Modify? Default Actionpre-input yes acceptinput no rejectforward yes rejectpre-output yes acceptoutput no reject

Table 1: IPFW Filtering Locations

ltering location has an associated default action. Whena ltering location has at least one lter installed, if noexplicit disposition for a packet is provided by the lter,the default action will be applied to the packet. The pas-sage of packets through the various ltering locations isdescribed in detail in Section 6 of this paper.

5.5 Stackable Filters

Each ltering point in thekernel is actually the attach-ment point for a stack of lter programs. Filter programscan easily be pushed onto the stack, popped off the stack,or inserted into the middle of the stack for each lteringpoint. Individual lters each have a priority (a signed 32bit number) that determines where the in the stack thelter is actually placed. Multiple lters installed at thesame priority, at the same ltering location, operate as atraditional stack.

Filters may also have a symbolic tag to aid in theiridentication, replacement, or deletion.

5.6 Flexibility of ActionsAfter classifying a packet according to whatever rules

are in place, a packet ltering system has to perform anoperation on the packet. A simple packet ltering sys-tem has just two operations, accept and reject. TheBSD/OS IPFW system has three additional operations.The log action takes a specied amount of the packetand copies it to the IPFW kernel socket. The call ac-tion allows the current packet to be passed to a differentnamed lter for further processing. The next actioncalls the next lter in the stack of lters installed at thecurrent lter location. In addition, the BSD/OS IPFW

system allows packets to be modied explicitly by thelter program, or as the consequence of calling anotherlter program. The classic accept and reject actionshave been extended so they can also optionally log thepacket to the kernel socket.

5.7 Filter Pool

In addition to the explicit ltering points in the kernela pool of lter programs can be installed into the ker-


5/12

nel, not associated with a particular ltering point. Thisallows common lter programs to be installed into thelter pool and then be referenced from any of the otherlters installed in the running system. Currently onlyBPF based lters have the ability to call a lter from thepool. The lter called may delete the packet or return avalue associated with the packet. Typically this value isboolean. The called lter might also be used to recordsome state that can later be accessed.

Unlike BPF programs, it is possible to create an in-nite loop of called lters. There is no loop detectionin the lter software, which could be considered a aw.Users of the IPFW system are obligated to understandthe interactions between all their lter programs.

5.8 Circuit Cache

Although BPF lters themselves are stateless, by us-ing custom coded lters, such as the circuit cache, the

lters can access saved state about a connection. Thecircuit cache provides the system with two features. Therst is the ability of a BPF program to request the circuitdescribed by a packet be added to the cache. A circuit isdened as the combination of the source and destinationaddresses, along with the source and destination portsfor the upper level protocol, if relevant. The second isthe ability to pass a packet to the cache for it to determineif that session has been seen before. For example, TCPpackets can be divided into Initial SYN packets andEstablished packets. Initial SYN packets are subjectto potentially complicated rules to determine if the ses-sion should be allowed. If the packet is to be accepted,it is passed to the circuit cache asking for an entry tobe added for its circuit. Any Established packet is sim-ply passed to the circuit cache for query. If the packetdoes not match an existing session, it is rejected. Thecircuit cache understands the TCP protocol and whencaching TCP circuits it can optionally monitor FIN andRST packets and automatically terminate a circuit whenthe TCP session is shut down. Circuits may also auto-matically be timed out to reclaim kernel resources aftera conguration period of inactivity.

5.9 Custom Coded Filters

While BPF-based lters are the most exible andcommonly used lters within the BSD/OS IPFW sys-tem, they are not the only method of dening a lter.There are a variety of custom coded lters available.Custom coded lters are C modules that are compiledinto the kernel. These typically provide a very rigid setof ltering capabilities. Some non-BPF lters includedwith IPFW can be used to write traditional, rules basedlters. These non-BPF lters may not be able to make

IPFW

incoming packet outgoing packet

pre-input

forward

output

input

localpacket

forwarded packet

NetworkInterface

pre-output

Local Host

Figure 1: IPFW Filtering Locations

use of the advanced features of IPFW due to limitationsin their design. Several examples of custom coded ltersare described in Section 10 of this paper.

5.10 Transparent Proxying

IPFWs ability to force any packet to be delivered tothe local computer allows for the creation of transpar-ent proxies for multiple services. An additional smallchange to the TCP stack in BSD/OS complements thisability. The SO BINDANY socket option allows a pro-

gram to listen on a particular port, and bind to whateverIP address for which the connection request was origi-nally intended. This happens regardless of whether theIP address is bound to one of the computers interfaces.This support makes writing transparent proxies straight-forward.

6 How it Works

IPFW operates on Internet Protocol (IP) packets thatare received or sent by the computer running IPFW. Ingeneral there are three types of packets: packets thatwere sent to the computer, packets that were generated

by the computer, and packets for which the computer isacting as a forwarder.

When a packet arrives on the computer (the packet iseither sent to this computer or this computer is forward-ing the packet), the network driver copies that packetinto an mbuf. If the packet is an IP packet, it is placed ona queue of IP packets to be processed by the kernel. Theinterface on which the packet arrived is also recorded inthe mbuf and can be retrieved by any called IPFW lter.


6/12

The ip input() routine in the kernel then de-queues the packet, performs sanity checks on the packetand determines the destination for the packet. If thedestination is the local computer, the kernel will per-form packet reassembly. IP packets may be brokeninto smaller packets (fragmented) if a network elementin the path between the source and destination is notable to handle the entire packet as a single datagram.Finally, once the packet is complete, the kernel willqueue the packet on the correct IP protocol queue (suchas TCP or UDP). Packets that are to be forwardedare not re-assembled. These packets are sent on toip forward() and eventually on to ip output()for transmission to the destination.

When IPFW is used, ip input() will call the pre-input lter chain, if present, just after performing basicsanity checks. This ltering is performed prior to de-termining the destination of the packet. Because verylittle examination of the packet has been performed, andno extra state about the packet is stored in the kernel,it is safe for the IPFW lter to modify the packet con-tents. It may even force a packet to be delivered tothe local computer, even if the destination address doesnot match the address of any of the interfaces on thecomputer. The most basic modication is to delete thepacket, which causes ip input() to stop processingthe packet. Allowing modication of any type at thispoint allows for various specialty lters such as NAT andpacket reassembly. Packet reassembly can be performedexplicitly by calling the rewrite named lter. Packetreassembly is useful so that following lters will alwayssee complete IP packets and not IP fragments. The abil-ity to modify the packet is the reason that the pre-inputlter point was added to IPFW.

Once any lters on the pre-input lter point have beenexecuted, ip input() continues with normal process-ing, which is to determine the destination of the packet.If the packet is to be delivered locally, then processingcontinues normally up until the point where the packetwould be queued for an upper level protocol. The fullyformed packet is passed to the input lter chain. Nomodication of the packet contents is allowed at thispoint as signicant sanity checks have been performedon the packet. The packet may still be dropped, logged,

or both dropped and logged. Once the packet completesthe inputlter processing, it is either discarded (rejected)or queued for an upper level protocol as normal.

Received packets that are not to be delivered locallyare to be forwarded and are passed to ip forward() .The ip forward() routine determines if the packetcan be forwarded by the computer. This decision is madeby ensuring a route exists for the destination address and

the packets time to live has not expired. The packetis then passed to the forward lter chain. The forwardlters have access to the interface indexes for both theinput and probable output interfaces. It is possible forthe output interface to change between ip forward()and ip output() , though typically this is not thecase. Knowledge of the input and output interfaces pro-vides assistance in ltering packets with spoofed ad-dresses. The forward lter, like the pre-input lter, isallowed to modify the packet. The main restriction onmodications is that a forwarded packet should not bemodied into a local packet. The packet should eitherstill be destined for an external computer after modi-cation or it should be deleted. Once the forward l-ter chain has been called the rest of ip forward()is executed and eventually the packet is passed on toip output() .

Packets passed to ip output() are either locallygenerated or being forwarded through this computer. Inboth cases, ip output() veries that a route exists forthe destination address and (re)determines the destina-tion interface for the packet. The pre-output lter chainis then called. This lter, much like the pre-input lter,may modify the packet. In addition, it may specify a dif-ferent IP address to be used for the next-hop routing of this packet. This override of the next-hop routing desti-nation is done through an out-of-band mechanism. Thiscapability allows the pre-output lter to actually deter-mine which interface the packet should be sent out whenthere are multiple possible output interfaces. If an IPaddress is provided via the out-of-band method, or thedestination IP address inside the packet is changed, therouting lookup is repeated. The pre-output lter is notcalled a second time.

For forwarded packets, all ltering is now complete.For packets that were locally generated the output l-ter chain is called immediately after the pre-output l-ter. Like the input lter chain, the packet may not bemodied by the output lter chain. The ip output()routine will eventually call the network interfaces out-put routine. If IPFW rate ltering (as discussed in Sec-tion 10) is being used, the ip rateoutput() routineis actually called instead of the interfaces output rou-tine. The ip rateoutput() routine is responsible

for eventual delivery of the packet to the network inter-face or dropping of the packet.

7 BPF Language Overview

The most used and most exible lter type in IPFWis the BPF lter. As mentioned earlier, this type of l-ter uses the BPF pseudo-machine. The BPF pseudo-machine has been enhanced for use with IPFW. Only one


7/12

totally new BPF instruction was added for IPv4 packetprocessing. A new memory type was added, as well asthe ability to modify the packet being processed. IPv6enhancements have been added and are discussed at theend of this section.

The new BPF instruction, CCC , enables the calling of

a lter on the call lter chain. While it might seemthat the acronym stands for Call Call Chain, it wasactually derived from Call Circuit Cache. The circuitcache was the reason for the creation of the call chain.The CCC instruction returns the result of the call in theA register.

The new memory type is called ROM and is an addi-tional memory area to the original BPF memory spaces.The original memory spaces included the packet con-tents as well as the scratchmemory arena. While the rstimplementation did in fact store read only information,the term ROM is now a misnomer as the ROM locations

can be modied by the lter. This space, called promin the source code, is used to pass ancillary informationin and out of the BPF lter.

While the bpf filter() function does not haveany innate knowledge of the meaning of these memorylocations, IPFW assigns meanings to several locations:

0 IPFWM_AUX An auxiliary returnvalue (for errors)

1 IPFWM_SRCIF The index of the sourceinterface (if known)

2 IPFWM_DSTIF The index of the

destination interface(if known)3 IPFWM_SRCRT The index of the

interface for returnpackets

4 IPFWM_MFLAGS The mbuf flags5 IPFWM_EXTRA Bytes of wrapper that

preceeded this packet6 IPFWM_POINT What filter point was

used7 IPFWM_DSTADDR New address to use for

routing to destination

The BPF lter is intelligent about setting these val-ues. As some of these values, such as IPFWM SRCRT ,can be expensive to calculate, the lter is examined whenpassed into the kernel. A bitmap is built of all ROM lo-cations referenced by the program and only those loca-tions are initialized.

In order to support the ROM memory space, the call-ing convention of the bpf filter() function waschanged to pass three additional parameters:

int32_t *prom; /* ptr to ROM memory */int promlen; /* count of valid bytes */

/* in the memory space */int modify; /* boolean to indicate */

/* whether packet *//* can be modified */

/* by bpf_filter() */

All existing calls to bpf filter() were modied topass NULL , 0 , 0 for these three values.

IPFW has been adapted for use with IPv6. This work was implemented with the NRL version of IPv6. Morerecent releases of BSD/OS use the KAME IPv6 imple-mentation. The changes to support IPFW in the KAMEIPv6 stack have not yet been written.

In order to support IPv6, several other new enhance-ments were made to the BPF pseudo-machine. Triple

length instructions were added. A classic BPF instruc-tion is normally 64 bits in size: 16 bits of opcode, two8 bit jump elds, and a 32 bit immediate eld. A triplelength instruction has 128 bits of additional immediatedata (the length of an IPv6 address). A new register,A128 , was also added. The load, store, and jump in-structions now have 128 bit versions. The scratch mem-ory locations have been expanded to 128 bits, thoughtraditional programs only use the lower 32 bits of eachlocation. An instruction to zero out a scratch memorylocation ( ZMEM) was added. Because BPF was not ex-tended to handle 128 bit arithmetic, a new jump instruc-tion was created that allowed for the comparison of the

A register to a network address, subject to a netmask.The netmask must be specied as a CIDR style netmask,specically a count of the number of signicant bits inthe netmask.

ROM locations only have 32 bit values and it is in theROM that a new destination routing address is passed.Currently it is not possible to use the next-hop routingcapability with IPv6.

8 IPFW Filtering Language

Initially BPF lters were written in BPF assembly

with the aid of the C pre-processor ( cpp ). It was thoughtthat many assembly fragments would be written for var-ious needs and that the nal lter would include thesefragments. It was quickly determined this was not avery user friendly way of programming lters. It yieldedopaque lters such as:

// IP header length into Xldx 4 * ([0] & 0xf)// Protocol of packet


8/12

ld [9 : 1]// Is it UDP? Jump to L1 if notjeq #17 - L1// Move ip length into Atxa// Add 8 bytes to skip UDP headeradd #8

jmp L3L1:

// Is it TCP? Jump to L2 if notjeq #6 - L2// Load TCP flags into Ald [x + 13 : 1]// Jump to L11 if SYN bit is setjset #2 L11 -// If SYN is not set, just accept itret #IPBPF_ACCEPT

L11:// Move ip length into Atxa// Add 20 bytes to skip TCP header

add #20jmp L3

L2:// Just move ip header length into Atxa

L3:or #(IPBPF_ACCEPT | IPBPF_REPORT)// Accept the packet and report itret A

A new language was clearly needed.

Existing lter-ing languages were of little help as they were rules basedand not programmatic. The ability to use the program-

able features of BPF was a key design goal of IPFW.Since BPF does not allow reverse jumps, there is no fa-cility for loop constructs. This results in two possibleconstructs: a sequence of instructions, and if/then/elseclauses. The IPFW ltering language was designed withthis in mind. The general form of the language is:

condition {true action

} else {false action

}

The false action is optional and typically omitted innormal lter programs. Note that if is implied. Ini-tially else was also implied, however, this reducedreadability so it was added back into the language.

In addition to this generic construct, there is a block statement, which is essentially a series of if andelse if statements. There is also a case statementwhich is similar, but not identical, to a C switch state-ment.

Most actions are either another construct or a termi-nating condition, such as accept or reject.

9 End-Users Perspective

From the end-users perspective, creating a packet l-ter involves writing a text le that contains the lter,compiling the lter from the command line, and load-ing the compiled lter into the kernel from the commandline. A user could create the following sample lter in ale called forward :

#define SERVER 192.168.1.10#define MAILHOST 192.168.1.15

switch ipprotocol {case tcp:

// TCP packets should never come in// as fragmentsipfrag {

reject;}// TCP packets need at least 20 bytesiplen (


9/12

}accept;

}// UDP packets need at least 8 bytesiplen(


10/12

as a ltering gateway between the different client net-works.

The ltering host has a forward lter that allows in-bound access for a select number of protocols, to arigidly dened list of servers on the different client net-works. This lter is rather complicated, as it must treat

each of the client networks attached to the ltering com-puter with a different set of ltering options, speciedfor each of the clients.

The ltering host has a pre-input lter that terminatesoutbound http requests on the ltering computer, so thata transparent http cache can operate for all the differentclient networks behind this host. There is also an inputlter in use to prevent external users from connecting tothe http cache. The input location ltering is done toprevent any inbound TCP connections from ever beingestablished directly to the cache daemon.

The ltering host also has a set of rate lters, to limithow much bandwidth certain network protocols (for ex-ample, nntp) are allowed to use at any given time. Therate ltering capability has proven very useful for simu-lating low-bandwidth links for testing during debuggingof network tools that must transfer large quantities of streaming information over long-haul networks. Someof the server computers behind the ltering host also userate lters to limit the amount of outbound http trafcthat may be sent. This is used to enforce bandwidth lim-its to which the clients have agreed.

The ltering host also has a set of lters in place to l-ter connections from the nimda worm rst noted in au-

tumn 2001. This worm attacks web servers and attempts14 different accesses, probing for known security holes.While none of the servers at this location are vulnera-ble to this worm, the attacks still cause a great numberof extraneous log entries to be made on each web server.There are so many log entries from the attack probes thatthe normal log entries are drowned out in the noise of theattack entries.

Three lters are used to combat this worm. The rstis the system provided, custom lter called rewrite,which allows sending RST packets to one end of a TCPconnection. The second lter is a custom, hand-coded

BPF assembler lter that is installed into the lter poolwith the name src dst swap. This lter swaps thesource and destination IP addresses in a packet, as wellas the source and destination port addresses. The nallter is an IPFW lter that examines the interior of pack-ets destined for the protected clients web servers. If the lter detects the nimda attack signature after theTCP session has gone through the three-way handshake,the lter calls the src dst swap lter and then calls the

rewrite lter. The rewrite lter sends a TCP RSTpacket to the internal web server, which will cause it totear down the just established TCP session. By reset-ting the TCP connection ononly the protectedclient webserver, but not that of the attacking computer, the clientweb servers kernel resources are immediately freed forreuse. The attackers kernel resources are intentionallyleft occupied. By resetting the TCP connection beforeany data is sent to the web server, no error message islogged by web server. This neatly solves the problemof the extraneous log messages caused by the nimdaworm. It could be trivially enhanced to deal with othersuch attacks in the future.

11.3 Corporate Firewall

The last sample installation of the IPFW system isa special purpose corporate rewall. A very successfulrewall has been built around the IPFW system to dy-namically protect a group of servers from malicious di-alup users. It uses the IPFW kernel socket and a simplelter at the forward location to read all TCP SYN packetscoming from a list of CIDR blocks that represent the di-alup modem pools. A continually updated database of IPaddresses assigned to legitimate dialup users is queriedto determine if the SYN packet should be allowed. If thepacket is permitted, a copy of the packet is sent out a dif-ferent interface, to the servers to allow establishment of the clients TCP session. Other non-TCP packet lter-ing is also done on these rewalls using regular packetmatching and ltering.

12 Future Enhancements to IPFW

There are always ample possibilities for expandinguseful software systems, and IPFW is no exception.Under consideration are structural additions as well aschanges to make IPFW more accessible to a larger num-ber of people.

Allow loadable kernel modules to implement Cus-tom Coded Filters. This would allow for the speed of a native implementation of a complex lter, while pre-serving the ability to recongure the ltering system onthe y.

Implement an in-kernel lter compiler that takesBPF programs as input and generates a native (non-interpreted) version of the downloaded BPF lter for ex-ecution.

On-demand logging would make it easier to debuglters. This would also allow the assessment of the costand benets of making certain system changes.

IPFW support needs to be added back into the KAMEIPv6 protocol stack. The IPv6 support needs to be com-


11/12

pleted, so that next-hop routing can be used with IPv6addresses.

Expansion of the protocol throttling support. Thiswould allow limiting the number of responses for a par-ticular protocol and provide another method for prevent-ing denial of service attacks.

While the programmable nature of IPFW makes it ex-tremely exible, programming is not a strength of allsystem administrators. This lack of programming expe-rience means that some lters are inefcient and othersdo not take full advantage of IPFW features. A graphicaluser interface could address these issues.

13 Conclusions

The IPFW framework has proven to be extremelyexible. It has been used for purposes never dreamedof in the original design. This is often the hallmark of a

good basic design.IPFW has a complete framework for packet ltering

services. It has been extended several times since theoriginal implementation but has never needed to be com-pletely redesigned.

IPFW is very reliable. It has been deployed in ap-plications that have passed terabytes of information be-tween reboots. Machines executing IPFW lters haveuptimes in excess of one year, even after multiple lterchanges and updates during that time period.

Appropriate documentation is crucial to wide accep-

tance of a new technology. While the IPFW system hasbeen available for several years and has many power-ful and useful features, its acceptance has been slow be-cause of incomplete and opaque documentation.

14 Availability

Additional documentation, as well as some of the lters described in this paper, is available athttp://www.pix.net/software/ipfw/ .

15 Acknowledgments

The contributions of several people are hereby ac-knowledged in their efforts to make this a better paper.Jack Flory and Mike Karels for numerous conversationsabout packet ltering prior to the rst line of the IPFWcode being written. Bill Cheswick for prompting theneed for the circuit cache, which led to the ability tochain lters and the ability to call other lters from alter. Dave MacKenzie, Josh Osborne, and Chris Ross,for reading draft copies and making useful suggestions.

Donn Seely, for guiding this paper through the submis-sion and review process. The staff at the USENIX Asso-ciation, for they make this conference possible.

Notes

Even though FreeBSDs iprewall is often referredto as ipfw it is unrelated to the BSD/OS IPFW system,except in name.

In order to change the lter, one would have to re-compile the kernel and reboot the computer with the newkernel.

When work on coding IPFW started, the authorsearched for the BPF assembler that was described inthe original BPF paper. After learning that no actual as-sembler was ever written, a standalone BPF assemblerwas written for IPFW.

A language similar to the IPFW ltering lan-guage was independently developed for the Ascend GRFrouter. While the GRF used BSD/OS as the basis of itsembedded operating system, the GRF ltering systemwas independently developed.

References

[BPF] McCanne, Steve, and Van Jacobson, The BSDPacket Filter: A New Architecture for User-levelPacket Capture, Proceedings of Winter 1993USENIX Annual Technical Conference. (January,1993).

[CSPF] Mogul, J.C., and R.F. Rashid, and M.J. Ac-cetta. The packet lter: An efcient mechanism for user-level network code. 11th ACM Sympo-sium on Operating System Principles, November,1987.

[FreeBSD] The FreeBSD Project. ipre-wall(4); FreeBSD 4.4-stable Manual Page.http://www.freebsd.org, June, 1997.

[fwtk] Trusted Information Systems, Inc.TIS Internet Firewall Toolkit ftp://ftp.tislabs.com/pub/rewalls/toolkit/, March,

1997.[ipchains] Russell, Rusty.

http://netlter.samba.org/ipchains/, October,2000.

[iplter] Reed, Darren. http:///www.iplter.org/,November, 2001.

[iplterhowto] Conoboy, Brendan and Erik Ficht-ner. IP Filter Based Firewalls HOWTO


12/12

http://www.obfuscation.org/ipf/ipf-howto.pdf,July, 2001.

[iptables] Russell, Rusty. http://netlter.samba.org/,August, 2001.

[NIT] Sun Microsystems, Inc. NIT(4P); SunOS 4.1.1

Reference Manual. Mountain View, California,October, 1990. Part Number 800-5480-10.

[OpenBSD] The OpenBSD Project. pf(4); OpenBSD3.0 Manual Page. http://www.openbsd.org, July,2001.

[RFC1013] Scheier, Robert W. X Window SystemProtocol, Version 11. Cambridge, Massachusetts,June, 1987.

[screend] Mogul, Jeffrey C. Using screend to imple-ment IP/TCP security policies. Technical NoteTN-2, Digital Equipment Corporation Network

Systems Lab, 1991.

Documents

Flexible Packet Filtering