Computer Networks 50 (2006) 688-706

NetPDL: An extensible XML-based language for packet header description q

Fulvio Risso *, Mario Baldi

Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi, 24-10129 Torino, Italy

Received 2 January 2004; received in revised form 7 February 2005; accepted 10 May 2005
Available online 2 August 2005
Responsible Editor: C.B. Westphall

Although several applications need to know the format of network packets to perform their tasks, until now each application has used its own packet description database. This paper addresses this problem by proposing NetPDL, an XML-based language for describing packet headers, which has the potential to enable a common, application-independent protocol description database that can be shared among several applications. Further, common functionalities related to the protocol database can be implemented in a library, which can be a basic building block for implementing networking applications.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Protocol description language; NetPDL; XML; Protocol header description

q The work was partly funded by Microsoft Research, Cambridge, UK, and the System On Chip Division of Telecom Italia Lab S.p.A., Torino, Italy.
* Corresponding author. Tel.: +39 011 564 7008; fax: +39 011 564 7099.
E-mail addresses: email@example.com (F. Risso), mario.firstname.lastname@example.org (M. Baldi).
1389-1286/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.comnet.2005.05.029

1. Introduction

Several network applications, such as packet routing, traffic classification, network address translation, packet sniffing, traffic analysis, traffic generation, firewalling, and intrusion detection, deal with packet headers and hence need to know packet formats. Even though packet processing is common to a large number of applications, at present no solution exists to delegate it to a single optimized component: packet processing is still implemented within applications by custom code.

One problem with a general packet-processing component stems from the different flavors of packet processing required by applications. For example, while an application might need to filter packets to further process only a subset of those received, another one might need to modify the value of selected fields in each packet. A general packet processing component that fulfills the needs of any application would be required to have a large set of functionalities, hence a high complexity. Another problem stems from the high degree of portability required. In particular, the general packet processing component should be executable on a large number of platforms, ranging from hardware boxes (e.g. network switches) and embedded devices (e.g. firewalls) to workstations, running a wide variety of operating systems.
These problems might have so far hampered the development of such a component. Irrespective of such problems, though, a first step towards moving packet processing functions out of applications consists in having a universal protocol header database, which is shared among all applications and contains packet descriptions for all network protocols. Current applications use a proprietary syntax to describe packet headers, and packet descriptions are often hardwired in their code. Consequently, supporting a new protocol requires the intervention of the developers of the specific application. Ethereal and tcpdump, two well-known and widely deployed applications, have even two different protocol descriptions hardwired in their code: one used when filtering network packets in real-time, the other one when displaying packets in a user-friendly fashion. The first description is simple (and limited) because it is designed for high-speed operation (filtering). The second one is very comprehensive and the corresponding packet-processing engine is much slower than the one using the first description.
This paper presents the Network Protocol Description Language (NetPDL), an application-independent packet format description language that enables the creation of a universal protocol description database (the NetPDL database). One of the main design objectives of NetPDL, unlike alternative protocol description solutions (see Section 3), is simplicity. For this reason, NetPDL is not intended as a protocol specification tool; for example, it does not support the description of a protocol's temporal behavior, e.g., a protocol state machine. Instead, NetPDL is targeted at an effective description of packet header formats and protocol encapsulations.
NetPDL is based on the eXtensible Markup Language (XML), which is becoming the preferred way of exchanging structured data between different organizations. For this reason several tools, both stand-alone programs and libraries, exist for dealing with XML documents and can be leveraged for NetPDL handling. Moreover, XML documents are usually parsed by applications at run-time; by following the same approach with NetPDL, the protocol header database can be dynamically changed to include new protocols or protocol features, without even restarting applications.
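The run-time parsing approach can be illustrated with a minimal Python sketch that loads a protocol description from XML and later reloads an extended version while the program keeps running. The XML vocabulary used here (`database`, `protocol`, `field`, `name`, `size`) and the `ProtocolDatabase` class are assumptions made for this example only; they are not taken from the NetPDL specification.

```python
import xml.etree.ElementTree as ET


class ProtocolDatabase:
    """Hypothetical protocol table that can be rebuilt while the program runs."""

    def __init__(self):
        self.protocols = {}

    def load(self, xml_text):
        # Build the new table completely before swapping it in, so a
        # malformed update raises an error without touching the old table.
        root = ET.fromstring(xml_text)
        table = {}
        for proto in root.iter("protocol"):
            table[proto.get("name")] = [
                (field.get("name"), int(field.get("size")))
                for field in proto.iter("field")
            ]
        self.protocols = table


# Two versions of an illustrative database; the second adds a protocol.
DATABASE_V1 = """
<database>
  <protocol name="udp">
    <field name="sport" size="2"/>
    <field name="dport" size="2"/>
    <field name="len" size="2"/>
    <field name="crc" size="2"/>
  </protocol>
</database>
"""

DATABASE_V2 = DATABASE_V1.replace(
    "</database>",
    '  <protocol name="icmp">\n'
    '    <field name="type" size="1"/>\n'
    '    <field name="code" size="1"/>\n'
    '    <field name="checksum" size="2"/>\n'
    "  </protocol>\n"
    "</database>",
)

db = ProtocolDatabase()
db.load(DATABASE_V1)  # initial load at start-up
db.load(DATABASE_V2)  # later update, applied without restarting
print(sorted(db.protocols))
```

In this sketch the application code never changes between the two loads; only the XML text does, which is the property the paragraph above attributes to an external protocol database.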
Notice that NetPDL is also beneficial to applications for which a generic packet processing engine and a shared database are not cost effective. An example is provided by applications that perform simple operations on a small variety of packet headers. In these cases implementing packet processing within the application might be simpler than creating or interfacing a generic packet processing engine, and leveraging existing header descriptions might seem to bring negligible advantages. However, basing header processing code on NetPDL descriptions enables transparent (i.e., not requiring modifications to the application code) support of newer versions of the protocols.
Section 2 elaborates on the concept of various applications sharing a generic packet processing engine that operates according to protocol descriptions stored in a NetPDL database, where the latter can be dynamically updated. Section 3 surveys existing languages for describing network protocol headers and highlights their limitations for the application context addressed in this work. Section 4 presents an overview of the NetPDL language and the architectural choices behind it, while Section 5 provides the details of most primitives of the NetPDL language. Section 6 gives an overview of NetPDL extensions, i.e. a set of additional primitives that can be used to enrich NetPDL for specific purposes. In particular, Section 6 presents an extension aiming at the description of how packet information should be displayed. Finally, Section 7 provides performance figures of a NetPDL-based engine implementation. Concluding remarks are presented in Section 8.
2. Toward NetPDL-based packet processing
Fig. 1 depicts possible scenarios for the deployment of a common protocol description database shared by various applications. A packet processing engine, i.e., a NetPDL-based engine in this context, can be either embedded into each application (left-hand side of Fig. 1) or shared among several applications.

Fig. 1. Relationships between applications, NetPDL protocol database, and NetPDL-based engines.

A NetPDL-based engine uses a NetPDL protocol database (NetPDL database in short), i.e. a set of XML files that contain a description of protocol headers, to learn the structure of the packets it is supposed to process. Since the NetPDL database is external to both applications and NetPDL-based engines, it can be updated without requiring modifications to the code implementing them. The NetPDL-based engine parses these XML files and creates an internal, engine-specific representation of protocol headers. For example, based on the protocol description obtained from the NetPDL database, a (NetPDL-based) packet filtering engine can pre-calculate the offset of each field from the packet beginning. This will result in faster location of the requested fields within each incoming packet. Hence, a filtering engine based on the NetPDL language can have the same performance as one based on custom protocol descriptions, i.e., hardwired in the code of the filtering engine itself. The performance of NetPDL-based engines therefore does not depend on the characteristics of the language itself, which (being XML-based) may seem rather inefficient. From this point of view, a NetPDL description can be compared to a Java program, which can either be compiled into native code or interpreted at run-time. The execution time strongly depends on the tool used (compiler/interpreter), not on the language itself.
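The offset pre-calculation mentioned above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the XML element and attribute names (`protocol`, `field`, `name`, `size`) are hypothetical rather than actual NetPDL syntax, and only fixed-size fields are handled. The XML is parsed once, and each later field lookup is a plain slice at a known offset.

```python
import xml.etree.ElementTree as ET

# Illustrative description of the Ethernet II header (sizes in bytes).
ETHERNET_XML = """
<protocol name="ethernet">
  <field name="dst" size="6"/>
  <field name="src" size="6"/>
  <field name="type" size="2"/>
</protocol>
"""


def compile_offsets(xml_text):
    """Parse the description once and return {field name: (offset, size)}."""
    offsets = {}
    position = 0
    for field in ET.fromstring(xml_text).iter("field"):
        size = int(field.get("size"))
        offsets[field.get("name")] = (position, size)
        position += size
    return offsets


def read_field(packet, offsets, name):
    """Per-packet path: no XML involved, only a lookup and a slice."""
    offset, size = offsets[name]
    return int.from_bytes(packet[offset:offset + size], "big")


OFFSETS = compile_offsets(ETHERNET_XML)  # done once, at start-up

# A dummy frame: broadcast destination, arbitrary source, EtherType 0x0800.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"\x00" * 46
ethertype = read_field(frame, OFFSETS, "type")
```

The per-packet cost in `read_field` is independent of how the description was written, which is the point made in the text: the cost of XML parsing is paid when the internal representation is built, not during packet processing.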
A NetPDL database can even be remotely stored on a centralized server accessible through the Internet. Geographically dispersed NetPDL-based engines can (periodically) download the most recent version of the NetPDL database, use the contained information to build their internal structures, and perform their processing. In this scenario NetPDL-based engines operate according to up-to-date and complete protocol descriptions contained in some external, remotely located XML files, while not suffering performance impairments during packet level processing.
This paper does not focus on any specific NetPDL-based engine, but rather on the definition of the NetPDL database. Nevertheless, Section 7 provides measurements obtained with an existing NetPDL-based engine implemented in the NetBee library with the objective of substantiating the above statements on performance.
Implementing protocol-processing engines based on an external description database has limitations when it comes to second order optimizations. For example, since a NetPDL-based engine works on a per-protocol basis, optimizations that rely on the combined presence of two or more protocol headers cannot be implemented within a NetPDL-based engine. Finally, more investigation is required to assess the applicability and benefits of NetPDL-based packet processing in scenarios where performance requirements lead to the deployment of custom hardware optimized for a specific set of protocols. However, such an application field is outside the scope of this work, which focuses on software solutions.
3. Related work
This section provides an overview of known protocol description languages with specific emphasis on (1) support for packet header description, (2) support for protocol encapsulation description, (3) extensibility.
Libpcap, one of the most widely used packet processing libraries, provides a set of functions for selectively capturing packets by means of a filter. The filter, specified in a high-level language (e.g. "tcp" means "capture only TCP traffic"), is translated into special assembly code that is executed by a filtering engine, the BPF (Berkeley Packet Filter) virtual processor. Since the filter operates by checking the value of selected packet header fields, the protocol format must be known. Libpcap embeds protocol definitions within its source code, so they can be modified only with difficulty and only by recompiling the library. In essence, libpcap does not have a language to describe protocol headers; its simple language can be used to define a packet filter (operating on the most common protocol fields) and it cannot be used for other purposes.
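To make the notion of hard-wired protocol knowledge concrete, the following Python sketch checks whether an Ethernet frame carries TCP over IPv4 by testing two header fields at fixed offsets. It is not the code generated by libpcap and it does not run on a BPF virtual processor; the function and constant names are chosen for this example, and IPv6 and VLAN tags are deliberately ignored. The offsets themselves follow the standard Ethernet II and IPv4 header layouts.

```python
# Protocol knowledge embedded directly in the source code:
ETHERTYPE_OFFSET = 12          # after 6-byte destination and 6-byte source MAC
ETHERTYPE_IPV4 = 0x0800
IPV4_PROTOCOL_OFFSET = 14 + 9  # Ethernet header length + Protocol field offset
IPPROTO_TCP = 6


def is_ipv4_tcp(frame: bytes) -> bool:
    """Return True if the frame is Ethernet II / IPv4 carrying TCP."""
    if len(frame) <= IPV4_PROTOCOL_OFFSET:
        return False  # too short to contain the IPv4 Protocol field
    ethertype = int.from_bytes(
        frame[ETHERTYPE_OFFSET:ETHERTYPE_OFFSET + 2], "big"
    )
    return (
        ethertype == ETHERTYPE_IPV4
        and frame[IPV4_PROTOCOL_OFFSET] == IPPROTO_TCP
    )


def make_frame(ip_protocol: int) -> bytes:
    """Build a minimal dummy frame with the given IPv4 Protocol value."""
    ip_header = bytearray(20)
    ip_header[0] = 0x45  # version 4, header length 5 words
    ip_header[9] = ip_protocol
    return bytes(12) + ETHERTYPE_IPV4.to_bytes(2, "big") + bytes(ip_header)


tcp_frame = make_frame(IPPROTO_TCP)
udp_frame = make_frame(17)
```

Supporting a new protocol in this style means editing the constants and the function and then rebuilding the program, which is the limitation the paragraph above describes.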
One of the best-known protocol description languages is the one deployed by Analyzer 2.0, a protocol analyzer developed by one of the authors. An easily extensible C-like structure is used to describe packet header fields and protocol encapsulation. A most notable feature of the resulting packet processing architecture is the ability to both decode packets and customize their summary and detailed views based on external files: Description File Format (DFF) and Index File Format (IFF) configuration files. However, the Analyzer 2.0 protocol description language does not provide adequate support for variable-length fields and optional fields.
FALCON, an evolution of the Analyzer 2.0 protocol description language, notwithstanding a different syntax, enables more complex computations, variable-length fields, and optional fields to be specified. However, FALCON provides only primitives for packet decoding (e.g. packet displaying is not taken into consideration) and its protocol description files have poor readability from the standpoint of a user working on them without any specialized (e.g. GUI) tool.

The GASP (Generator and Analyzer System for Protocol) Language is similar to Analyzer's, but has a major emphasis on packet generation, rather than decoding. Like FALCON, it supports only header format description.
The protocol description language used by SPY provides checking primitives to validate the correctness of selected fields and its protocol description files have excellent readability. However, it does not provide proper support for optional fields. Like Analyzer 2.0, it supports some visualization primitives (albeit quite poor ones: only a detailed view of the packet is supported, with limited customizability); however, this feature is natively provided by the language, while Analyzer does the same through an extension mechanism, which allows arbitrary future enhancements.
The protocol description language recently proposed in the JnetStream project is probably the most flexible among the listed languages. It does support field format descriptions, optional fields (through conditional primitives such as if-then-else and more), field value validation and visualization directives (although the difference between summary and detailed view of the packet is not clear). However, it does not foresee extensions to the language, which means that new features not included in the base language cannot be added. In addition, the complete NPL (Network Protocol Language) description is not easy to read (despite its C-like syntax) since all directives are kept together without a clear separation between header format descriptions, field value validation and visualization directives.
The Solidum PAX Pattern Description Language is targeted at pattern description, a pattern being either a set of fields or a set of protocols. For instance, the IP/Ethernet stack is considered the most common pattern to check when looking for ICMP packets. The language is designed with the objective of speeding up pattern matching operations on Solidum network processors. However, protocol encapsulation description with the PAX language is cumbersome since all the possible combinations of the formats for a given protocol have to be explicitly listed when describing the encapsulation of a higher layer protocol. Finally, PAX does not provide displaying and checking primitives and does not properly support optional fields.

... also called tags, delimited by the < and > characters. Each element contains a name and an ...