An Extensible Framework for Constructing SIP User Agents
Robert M. Arlein and Vijay K. Gurbani
A Session Initiation Protocol (SIP) user agent is an endpoint in a signaling network that can send or receive SIP messages. One can build a functional user agent in a few hundred lines of Java* code that sets up a call between two SIP phones. However, such a user agent will not fully comply with the protocol. Writing a compliant user agent is a complex undertaking involving thousands of lines of code. The iSURF framework greatly reduces the effort of this undertaking. iSURF uses a SIP transaction library called siptrans as its transaction processing layer. However, iSURF can use a different transaction library, and siptrans can be used in a different framework or even in a SIP proxy. In this paper, we describe the protocol requirements for a SIP user agent and how our framework facilitates building such an agent. We also describe the design and architecture of both iSURF and siptrans.
© 2004 Lucent Technologies Inc.
Introduction
Session Initiation Protocol (SIP) [4] is an application
layer signaling protocol that can be used for many
purposes. For example, it may be used to set up a
multimedia session among several participants, it may
be used as a bridge to legacy networks, or it may be
used to configure or implement various telephony
services. There are two major SIP entities: a proxy
and a user agent. User agents are signaling endpoints,
while proxies aid in the rendezvous of two SIP user
agents. Most SIP-based applications require special-
ized user agents. A user agent may be built from
scratch with relative ease. In fact, one of the authors
built the third-party call control application discussed
in the section “An iSURF API Example” with just 500
lines of Java* code. However, this application makes
many simplifying assumptions that make it
noncompliant with SIP. For example, the application
assumes that all the requests it sends are received.
Building a SIP-compliant application requires much
more code.
The solution is to write user agents with a SIP
library and an application programming interface
(API) that encapsulate many of the details of the SIP
protocol. There are two such standard APIs for Java—
JAIN* SIP API [2] and JAIN SIP lite [6]. Both of these
APIs are relatively low level and message based.
There are no such standard C/C++ frameworks
for building user agents. Some of the available non-
standard frameworks [5, 7] do not have a user-friendly
API (in our opinion) and, furthermore, none of these
frameworks could be integrated easily with our lower-
level, transaction processing framework, siptrans. The
siptrans framework is both stable and fully standards
compliant. We are not sure whether the other frame-
works are also stable and standards compliant. For
these reasons, we built our own framework, iSURF
Bell Labs Technical Journal 9(3), 87–100 (2004) © 2004 Lucent Technologies Inc. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com). • DOI: 10.1002/bltj.20044
Panel 1. Abbreviations, Acronyms, and Terms
API: Application programming interface
ASCII: American Standard Code for Information Interchange
ABNF: Augmented Backus-Naur form
DNS: Domain Name System
HTTP: Hypertext Transfer Protocol
iSIP: Lucent’s SIP proxy
iSURF: iSIP user agent resource framework
MIME: Multipurpose Internet mail extensions
SIP: Session Initiation Protocol
SRV: Service, a type of DNS record
TCP: Transmission Control Protocol
TLS: Transport layer security
TU: Transaction user
UA: User agent
UCS: Universal Character Set
UDP: User Datagram Protocol
URI: Uniform resource identifier
UTF-8: UCS Transformation Format, an 8-bit lossless encoding of Unicode characters
(iSIP user agent resource framework), over the siptrans
framework. (iSIP is a proxy from which the siptrans
layer was extracted.) iSURF provides a low-level API
with configurable, higher-level routines (plug-ins) to
handle common processing, such as authentication.
Although iSURF runs over siptrans, it can be modified
to run over similar frameworks. Likewise, other higher-
level frameworks can run over siptrans.
The rest of this paper describes the iSURF archi-
tecture and API, illustrates the API with a sample ap-
plication, and describes our implementation including
siptrans.
iSURF Architecture
Figure 1 depicts the reference SIP stack as outlined
in RFC 3261 [4]. It also shows which layers of
the reference stack are realized by the two frame-
works we discuss in this paper. Note that, as is the
case with reference models, the cleanly separated lay-
ers often have interdependencies when realized in
software. For example, even though the reference
model depicts the lowest layer as a syntax/encoding
layer, it is obvious that in order for a SIP request to get
to it, it must have visited the transport layer. That is,
the transport layer must have allocated appropriate
resources (e.g., socket descriptors, internal buffers
associated with these sockets) and recognized the
boundaries of the SIP message before sending it to
the syntax/encoding layer.
Working from the bottom, the first layer is the
syntax and encoding layer. SIP is an American
Standard Code for Information Interchange (ASCII)
protocol; its syntax is defined by the augmented
Backus-Naur form (ABNF) in [4]. Unlike traditional
telephony signaling protocols that are encoded and
transported in abstract syntax notation, no special
encoding/decoding software is required for SIP. A SIP
stack normally provides routines for parsing the
UTF-8 encoded ASCII stream into internal data
structures for consumption by the layers above.
[Figure 1. A SIP stack hierarchy. Layer labels: transaction user, transaction, transport handling, syntax/encoding. Entity labels: stateless proxy, UAS, UAC, redirect, registrar, transaction-/call-stateful proxy, B2BUA. Framework labels: iSURF, siptrans. B2BUA: back-to-back user agent; iSIP: Lucent's SIP proxy; iSURF: iSIP user agent resource framework; UAC: user agent client; UAS: user agent server.]
The transport handling layer defines how clients
and servers behave with respect to the multiple trans-
ports supported by SIP. Unlike the Hypertext Transfer
Protocol (HTTP), which runs over Transmission
Control Protocol (TCP) and transport layer security
(TLS)-over-TCP, SIP has been defined to work over
TCP, TLS-over-TCP, User Datagram Protocol (UDP),
and Stream Control Transmission Protocol (SCTP). Transport
plurality is thus an additional facet of a SIP stack. The
transport layer isolates its idiosyncrasies from the
upper layers. For instance, the transport layer will
choose to transmit a SIP message over a congestion-
controlled transport, such as TCP, if it determines that
the message is so large that sending it over UDP will
fragment it. The transport layer will place only a sin-
gle SIP message into a UDP datagram, as specified by
the standard.
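The size-based transport choice described above can be sketched as follows. This is a minimal illustration and not siptrans code: the names `Transport` and `chooseTransport` are hypothetical, and the thresholds are our reading of RFC 3261 [4], Section 18.1.1 (a request within 200 bytes of the path MTU, or larger than 1300 bytes when the path MTU is unknown, goes over a congestion-controlled transport).

```cpp
#include <cstddef>

// Hypothetical names for illustration; not part of siptrans.
enum class Transport { UDP, TCP };

// Thresholds taken from RFC 3261, Section 18.1.1.
constexpr std::size_t kUnknownMtuLimit = 1300;  // bytes, used when the path MTU is unknown
constexpr std::size_t kMtuMargin = 200;         // required headroom below a known path MTU

// Chooses a transport for a serialized SIP request of messageSize bytes.
// A pathMtu of 0 means that the path MTU is unknown.
Transport chooseTransport(std::size_t messageSize, std::size_t pathMtu = 0) {
    if (pathMtu == 0) {
        return messageSize > kUnknownMtuLimit ? Transport::TCP : Transport::UDP;
    }
    // A message within kMtuMargin bytes of the MTU risks fragmentation over UDP.
    bool nearMtu = messageSize + kMtuMargin > pathMtu;
    return nearMtu ? Transport::TCP : Transport::UDP;
}
```

For example, `chooseTransport(500)` selects UDP, while `chooseTransport(1400, 1500)` selects TCP because the message leaves less than 200 bytes of headroom.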
The next layer, the transaction layer, is present in
all SIP entities except stateless proxies. Transactions
are a fundamental concept in SIP; a transaction consists of
a request sent by a client, zero or more provisional
responses, and one or more final responses.
The ACK request is considered
part of the INVITE transaction if the final response is
a non-2xx response; otherwise, it is considered a sep-
arate transaction. The transaction layer is also re-
sponsible for matching responses to appropriate
requests and for scheduling the retransmission of
requests and responses over unreliable transports.
Siptrans implements the transaction layer and
transport layer. iSURF is built over siptrans and implements
the transaction user (TU) layer. As Figure 2
depicts, iSURF partitions this layer into two pieces:
the iSURF core and the application. The application
builds and sends outgoing messages. The iSURF core
notifies the application’s event handlers of incoming
messages as well as other events. The application
can delegate the processing of some of these incoming
messages to plug-ins. The iSURF core, which maintains
state for calls and dialogs in structures called contexts,
is described later. The iSURF API helps the application
build messages in call or dialog contexts. The rest of
this section describes this architecture in more detail.
Transaction/Transport Layers
The transport layer works as described above and
is responsible for sending and receiving messages over
the network. The transaction layer manages SIP trans-
actions, which consist of a SIP request and the result-
ing responses. To this end, it maintains a state
machine whose inputs are incoming SIP messages,
outgoing SIP messages, and timeout indications. In
addition to maintaining state, the transaction layer
does the following:
• Retransmits outgoing requests until the receipt of
a response or until a timeout.
• Handles duplicate incoming requests by sending
cached responses.
• Passes messages from the TU to the transport
layer.
• Passes messages from the transport layer to the
TU.
• Generates ACKs for incoming non-2xx class
responses to INVITE transactions.
• Rejects badly formed incoming messages.
• Passes validated incoming messages to the TU.
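As an illustration of the second item in the list above, the following sketch shows one way a transaction layer could answer a retransmitted request from a cache instead of involving the TU again. It is a simplification under stated assumptions: `ResponseCache` is a hypothetical class, siptrans's actual data structures are not described in this paper, and the key (the branch parameter of the top Via header plus the method) omits parts of the full matching rules in RFC 3261 [4].

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch; not the siptrans implementation.
class ResponseCache {
public:
    // Simplified transaction key: top Via branch parameter plus request method.
    typedef std::pair<std::string, std::string> Key;

    // Remembers the most recent response sent for a server transaction.
    void store(const std::string& branch, const std::string& method,
               const std::string& response) {
        cache_[Key(branch, method)] = response;
    }

    // Returns the cached response if the request is a retransmission,
    // or nullptr if the request starts a new transaction.
    const std::string* lookup(const std::string& branch,
                              const std::string& method) const {
        std::map<Key, std::string>::const_iterator it =
            cache_.find(Key(branch, method));
        return it == cache_.end() ? nullptr : &it->second;
    }

private:
    std::map<Key, std::string> cache_;
};
```

A real implementation would also expire entries when the corresponding transaction terminates; that bookkeeping is left out here.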
The TU Layer
The TU layer is responsible for responding to valid
incoming messages, rejecting invalid messages, and
generating fresh requests. To do this properly, this
layer must maintain dialog state and other states
relating to message payloads. As noted above, these
responsibilities are shared between the application
and the user agent (UA) core. The core’s responsibil-
ities are:
• Maintaining dialog and call state.
• Notifying the application of various timeouts and
possibly responding to a timeout autonomously.
For example, if directed by the application, iSURF
will retransmit successful responses to INVITEs
until the arrival of an ACK.
• Processing incoming messages delegated to iSURF
through registered plug-ins.
The application’s responsibilities are:
• Handling incoming messages within event handlers
that the application registers with the iSURF core.
• Delegating all or some of the processing of in-
coming messages to plug-ins. The application pro-
visions these plug-ins as well. For example, the
application can register a plug-in to handle in-
coming OPTIONS requests.
• Originating outgoing requests and responses.
• Maintaining state, other than the state main-
tained by the framework. For example, the ap-
plication maintains media state.
iSURF API
While there is a place for a higher-level interface
that abstracts away the complications of SIP,
such interfaces cannot be used to build an arbitrary
SIP user agent. These interfaces are useful as a
supplement to a lower-level interface. The iSURF
API, like the SIP standard, is basically message
based. iSURF’s plug-in facility may be used to ele-
vate the level of the iSURF API by providing stan-
dard routines that the application registers and
provisions.
The iSURF API provides the following facilities
(the important facilities are discussed in more detail
later in this section):
• An object-oriented message representation including
constructors for building messages based
on other messages or contexts, e.g., building a
response from a request.
• Registration of event handlers to handle notifications
of incoming messages, timeouts, and other
events.
• Methods to send a message or repeatedly send
a message at intervals, e.g., retransmitting an
INVITE response or refreshing a REGISTER.
• An interface to configure the iSURF runtime
environment.
• Application controlled timers.
• A mechanism for the application to store and retrieve
data from messages or contexts. We show
how this facility is used in the section “An iSURF
API Example.”
• Registration of plug-ins and support for registered
plug-ins.
[Figure 2. iSURF architecture. Layers: application layer and iSURF core layer, which together form the TU layer, above the transaction/transport layer. Components: application main, event handlers, dispatcher, plug-ins (standard event handlers), contexts. Labeled flows: initialize framework; outgoing messages; events including incoming messages; handle message if plug-in is active; update context with message; incoming messages and other events. iSIP: Lucent's SIP proxy; iSURF: iSIP user agent resource framework; TU: transaction user.]
Events and Event Handlers
The iSURF runtime environment is basically
event driven. The application initializes the frame-
work and then optionally sends one or more SIP re-
quests. Subsequent incoming messages and other
events are then delivered to the application’s event
handlers. There are several categories of events: in-
coming messages, various timeouts, notifications of
plug-in activity, notifications of context creation and
destruction, and notifications of subscription creation
and destruction. Every SIP message that shares the
same Call-ID header is part of the same call context.
Roughly, messages in the same call context that also
share From and To headers are part of the same dialog
context. (Contexts are discussed in a later section.) The
application can infer context creation and context de-
struction from the incoming and outgoing messages.
However, the rules for context creation and destruc-
tion are quite complicated. Notifying the application
of these events can simplify the application.
Every iSURF application registers a single global
event handler. In addition, iSURF allows context-
scoped event handlers, i.e., handlers for messages in
the same context. We have not seen mention of
context-scoped event handlers in the JAIN [2] or
VOCAL [7] application frameworks. This facility can
better organize an application when message treatment
is a function of message context, as is often the case. It
is the case with third-party call control as illustrated in
the “An iSURF API Example” section.
If a context-scoped event handler is active for an
incoming message, that handler is invoked for the
message rather than the global event handler. If a
message can be handled by both a dialog-scoped and
a call-scoped event handler, only the dialog-scoped
event handler is invoked.
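The precedence rule above (dialog scope over call scope over global scope) can be sketched as a simple lookup. The types below are hypothetical stand-ins that we introduce for illustration; the iSURF classes themselves appear only in Panel 2, and contexts are identified here by plain strings.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an application event handler.
struct EventHandler {
    std::string name;
};

// Hypothetical registry that resolves the handler for an incoming message.
class HandlerRegistry {
public:
    explicit HandlerRegistry(EventHandler* global) : global_(global) {}

    void setCallHandler(const std::string& callId, EventHandler* h) {
        byCall_[callId] = h;
    }
    void setDialogHandler(const std::string& dialogId, EventHandler* h) {
        byDialog_[dialogId] = h;
    }

    // The most specific scope wins: dialog, then call, then global.
    EventHandler* resolve(const std::string& callId,
                          const std::string& dialogId) const {
        std::map<std::string, EventHandler*>::const_iterator d =
            byDialog_.find(dialogId);
        if (d != byDialog_.end()) return d->second;
        std::map<std::string, EventHandler*>::const_iterator c =
            byCall_.find(callId);
        if (c != byCall_.end()) return c->second;
        return global_;
    }

private:
    EventHandler* global_;
    std::map<std::string, EventHandler*> byCall_;
    std::map<std::string, EventHandler*> byDialog_;
};
```

Exactly one handler is returned for each message, which mirrors the rule that only the dialog-scoped handler is invoked when both scoped handlers apply.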
Message Representation
iSURF presents an object-oriented representation
of SIP messages and message components that models
the RFC 3261 grammar. The application never directly
accesses object data but uses member functions to
manipulate the data. An application never needs to
free these objects explicitly; the memory management
scheme is described below. The
iSURF message constructors simplify building mes-
sages. When a message is built from scratch, con-
structor arguments contain some of the mandatory
message headers for the message. The other manda-
tory headers are derived. For example, a constructor
for an OPTIONS request takes a source and destination
uniform resource identifier (URI) as arguments and
uses these arguments to build the Request-URI, From,
and To headers. The constructor also fills in the Via,
Max-Forwards, Call-ID, and CSeq header fields. When
the message is not built from scratch, the message
headers are inherited from another message or con-
text. For example, a response may be constructed
from a request, in which case the From header, To
header, CSeq header, Via headers, and Call-ID header
are copied from the request to the response.
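The header inheritance just described can be sketched with simplified types. `SimpleRequest`, `SimpleResponse`, and `makeResponse` are hypothetical names used only for this illustration; they are not the iSURF `Request` and `Response` classes, whose constructors take richer arguments.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for the iSURF message classes.
struct SimpleRequest {
    std::string method;
    std::map<std::string, std::string> headers;  // single-valued headers
    std::vector<std::string> vias;               // Via headers, topmost first
};

struct SimpleResponse {
    int code;
    std::map<std::string, std::string> headers;
    std::vector<std::string> vias;
};

// Builds a response by copying the headers that it shares with the request.
SimpleResponse makeResponse(const SimpleRequest& req, int code) {
    SimpleResponse resp;
    resp.code = code;
    const char* copied[] = {"From", "To", "CSeq", "Call-ID"};
    for (const char* name : copied) {
        std::map<std::string, std::string>::const_iterator it =
            req.headers.find(name);
        if (it != req.headers.end()) resp.headers[name] = it->second;
    }
    resp.vias = req.vias;  // all Via headers, in their original order
    return resp;
}
```

Headers that belong only to the request, such as Max-Forwards, are deliberately not copied.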
When an application sends or receives a message,
iSURF only checks those elements of the message that
relate to maintaining context information. For performance
reasons, iSURF does not run other checks. For
example, iSURF checks the number in the CSeq
header of messages in dialogs, but does not check to
ensure that the Event header is not present in an
INVITE request.
Every SIP message is represented as an instance of
the Message class. This class has two subclasses:
Request and Response. Furthermore, for each re-
quest method there is a subclass of Request such
as Invite, Ack, or Options. A request whose
method is unknown to the implementation is simply
represented as an instance of the base Request class.
The Message class is a subclass of the MemoryWrapper
class, which handles the memory management
for instances of Message. This mechanism is
described later. The Message class contains methods
for manipulating the headers and other components
of a message.
For each known header type there is an iSURF
class such as FromHeader and ToHeader. iSURF
provides methods to manipulate the various header
components. Header types unknown to the imple-
mentation are represented by the Header class,
which represents such headers simply as a header
name and header value.
iSURF represents message content and the asso-
ciated content-related headers as the Payload class
and so deviates from the standard SIP representation
in which the content-related headers are not grouped
separately. With this scheme, the Content-Length
header is never specified but is derived from the
length of the string representation of the payload and,
therefore, there can never be a discrepancy between
the actual payload length and length stated in
Content-Length. The Payload class represents the
actual payload as a string. Subclasses of Payload
provide richer representation than string for Session
Description Protocol (SDP) and for multi-part MIME
attachments. A multi-part MIME attachment is simply
represented as a list of Payloads.
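The idea of deriving Content-Length rather than storing it can be sketched as follows. `SimplePayload` is a hypothetical class written for this illustration and is not the iSURF `Payload` class; in particular, the decision to omit Content-Type for an empty body is our assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical payload whose Content-Length is always computed from the body.
class SimplePayload {
public:
    SimplePayload() {}
    SimplePayload(const std::string& contentType, const std::string& body)
        : contentType_(contentType), body_(body) {}

    void setBody(const std::string& body) { body_ = body; }

    // There is no setter for the length, so it cannot disagree with the body.
    std::size_t contentLength() const { return body_.size(); }

    // Serializes the content-related headers, a blank line, and the body.
    std::string serialize() const {
        std::string out;
        if (!body_.empty()) out += "Content-Type: " + contentType_ + "\r\n";
        out += "Content-Length: " + std::to_string(contentLength()) + "\r\n\r\n";
        out += body_;
        return out;
    }

private:
    std::string contentType_;
    std::string body_;
};
```

Because the length is recomputed on every call, changing the body with `setBody` is automatically reflected in the serialized header.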
Message Resending Mechanisms
iSURF provides a mechanism to periodically resend
the same message. For example, iSURF can
repeatedly resend a 2xx INVITE response at appropriate
(exponentially increasing) intervals until the
response has been acknowledged. Similarly, iSURF
can periodically send REGISTER requests or
SUBSCRIBE renewals.
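The retransmission schedule for a 2xx response to an INVITE can be sketched as below. This is not iSURF code: `retransmitOffsets` is a hypothetical function, and the timing follows our reading of RFC 3261 [4], Section 13.3.1.4, in which the interval starts at T1, doubles until it reaches T2, and retransmission stops once 64*T1 has elapsed without an ACK. The defaults of 500 ms and 4 s are the RFC's default timer values.

```cpp
#include <algorithm>
#include <vector>

// Default timer values from RFC 3261, in milliseconds.
constexpr int kT1 = 500;
constexpr int kT2 = 4000;

// Returns the offsets (in ms after the first transmission) at which a 2xx
// response would be retransmitted if no ACK ever arrives.
std::vector<int> retransmitOffsets(int t1 = kT1, int t2 = kT2) {
    std::vector<int> offsets;
    int interval = t1;       // first retransmission after T1
    int elapsed = interval;
    while (elapsed < 64 * t1) {
        offsets.push_back(elapsed);
        interval = std::min(interval * 2, t2);  // double, capped at T2
        elapsed += interval;
    }
    return offsets;
}
```

With the default timers, the sketch yields retransmissions 500, 1500, 3500, and 7500 ms after the first transmission and then every 4000 ms, with the last one at 31500 ms.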
State/Contexts
iSURF maintains dialog and call context state. RFC
3261 describes dialog state in detail. Dialogs are estab-
lished after a provisional response or a successful final
response to an INVITE request. Extensions to the
standard describe other means of establishing a dialog.
For example, a successful response to a SUBSCRIBE
request establishes a dialog. Once a dialog has been
established, iSURF updates dialog state as additional
messages in the dialog are encountered. Certain
messages terminate a dialog. For example, a dialog that
has been established by an INVITE may be terminated
by a BYE, and certain responses to re-INVITEs or to
re-SUBSCRIBEs in a dialog will terminate the dialog.
Dialogs that have not been confirmed by a final
response are terminated by a subsequent failure
response.
The SIP standard does not mention call context
explicitly. However, in iSURF, a call context is estab-
lished on processing a request with a new Call-ID
value. Every message with this Call-ID is in the same
call context. If a response to that request establishes a
dialog, the call context terminates when every dialog
associated with the request terminates; otherwise, the
call context terminates when the transaction associ-
ated with the request terminates.
Applications can use contexts in several ways.
iSURF simplifies building messages in dialog contexts
and, as explained in the “Events and Event Handlers”
section, the application can use context-scoped event
handlers to improve its organization. As another con-
venience, the application can store application data
in a context and retrieve the data when processing a
message in the context. Finally, the application can
access the context state itself. For example, the appli-
cation can access the list of active dialogs established
with a call.
As mentioned previously, for performance rea-
sons, iSURF only checks that incoming or outgoing
messages do not violate state constraints such as:
• A response must match a request.
• Requests in a dialog must be in order.
• It is invalid to send a re-INVITE before receiving
confirmation of the previous INVITE.
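As an illustration of the ordering constraint in the list above, the following sketch checks the CSeq number of an incoming in-dialog request. `DialogState` and `acceptRequest` are hypothetical names, and the rule (reject a request whose CSeq number is lower than the last one seen from the peer) is our reading of RFC 3261 [4], Section 12.2.2; the special handling of ACK and CANCEL is omitted.

```cpp
#include <cstdint>

// Hypothetical per-dialog state; only the field needed for ordering is shown.
struct DialogState {
    bool hasRemoteCSeq;
    std::uint32_t remoteCSeq;
    DialogState() : hasRemoteCSeq(false), remoteCSeq(0) {}
};

// Returns true and records the number if the request is in order.
// Returns false if it is out of order, in which case RFC 3261 calls for
// a 500 response.
bool acceptRequest(DialogState& dialog, std::uint32_t cseq) {
    if (dialog.hasRemoteCSeq && cseq < dialog.remoteCSeq) {
        return false;
    }
    dialog.hasRemoteCSeq = true;
    dialog.remoteCSeq = cseq;
    return true;
}
```

Gaps in the sequence are accepted, since the RFC requires only that numbers not decrease.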
The application is responsible for all other mes-
sage validation including the validation described in
Section 8 of RFC 3261. The application can delegate
some of this responsibility by configuring and regis-
tering plug-ins to validate messages.
Plug-ins
Plug-ins are iSURF procedures that are configured
and registered by the application to partially or
completely handle incoming messages. As with inco-
ming event handlers, plug-ins may be registered
globally or in the scope of a dialog or call. We now
describe the types of plug-ins and their order of
execution. Multiple plug-ins may be executed for a
single message. The first plug-in that can be executed
for an incoming message checks the message syntax
using the procedures described in Section 8 of RFC
3261. The application configures this plug-in by in-
dicating the methods it recognizes, indicating the
headers it recognizes, and so forth. If this plug-in re-
jects a message, iSURF invokes the application’s
“plug-in rejects” event handler and none of the ap-
plication’s incoming message handlers are invoked
for this message.
Otherwise, if the message is a request and if an
authentication verification plug-in is active, it is in-
voked. The application configures this plug-in with
a policy that indicates which requests to challenge. If
this plug-in issues an authentication challenge for the
request, iSURF invokes the application’s “plug-in challenges
request” event handler and none of the application’s
incoming message handlers are invoked.
Otherwise, at most one plug-in is now invoked
for the message. These final plug-ins are specified for
requests by the request method or specified for re-
sponses by the response code. For example, there is a
plug-in to handle OPTIONS requests and a plug-in to
handle 301-Moved Permanently responses. If a plug-
in is invoked at this time, then none of the applica-
tion’s incoming event handlers are executed for the
message. After the plug-in executes, the application is
notified by either the “plug-in rejects message” or the
“plug-in handles message” event.
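The order of plug-in execution described in this section can be summarized in a short sketch. All of the types below (`Incoming`, `Outcome`, `Plugins`) and the function `dispatch` are hypothetical and exist only to show the control flow; they are not iSURF interfaces.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical view of an incoming message.
struct Incoming {
    bool isRequest;
    std::string methodOrCode;  // e.g., "OPTIONS" for a request, "301" for a response
};

// The event that the application would be notified of.
enum class Outcome { PluginRejects, PluginChallenges, PluginHandles, ApplicationHandler };

// An empty std::function means that the plug-in is not registered.
struct Plugins {
    std::function<bool(const Incoming&)> syntaxOk;       // false: reject the message
    std::function<bool(const Incoming&)> mustChallenge;  // true: challenge the request
    // Final plug-ins, keyed by request method or response code.
    // A plug-in returns true if it handled the message, false if it rejected it.
    std::map<std::string, std::function<bool(const Incoming&)> > byMethodOrCode;
};

Outcome dispatch(const Plugins& p, const Incoming& msg) {
    // 1. Syntax checking runs first.
    if (p.syntaxOk && !p.syntaxOk(msg)) return Outcome::PluginRejects;
    // 2. Authentication verification applies to requests only.
    if (msg.isRequest && p.mustChallenge && p.mustChallenge(msg)) {
        return Outcome::PluginChallenges;
    }
    // 3. At most one method-specific or code-specific plug-in runs.
    auto it = p.byMethodOrCode.find(msg.methodOrCode);
    if (it != p.byMethodOrCode.end()) {
        return it->second(msg) ? Outcome::PluginHandles : Outcome::PluginRejects;
    }
    // 4. No plug-in took the message; the application's handlers run.
    return Outcome::ApplicationHandler;
}
```

Only the last outcome leads to the application's incoming message handlers; the other three correspond to plug-in notifications.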
Memory Management
iSURF seeks to reduce the onerous task of memory
management by automatically managing the
memory for the following objects: messages, message
components, and contexts. All of these objects are
represented as subclasses of the MemoryWrapper
class, which contains a pointer to the data for these
objects including a reference count. This single pointer
is the only data member in the various subclasses.
Copy constructors, regular constructors, assignment
operators, and destructors for MemoryWrapper are
designed to implement a reference counting scheme
as well as copy-by-reference. Constructing an instance
of one of these subclasses results in allocating data for
the instance, storing a pointer to the allocated data in
MemoryWrapper, and setting the reference count in the
allocated data to one. Copying a MemoryWrapper
instance is implemented by a pointer copy and an
increment of the reference count. Destroying an
instance decreases the reference count for the in-
stance. Memory is freed when the reference count
goes to zero. Strings or lists are represented using the
C++ standard template library, which automatically
manages the memory for these types.
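A minimal sketch of the reference counting scheme described above follows. Only the class name `MemoryWrapper` comes from the text; `SharedData`, `useCount`, `value`, and `setValue` are hypothetical names added for illustration, a string stands in for the wrapped message or context data, and the sketch ignores thread safety.

```cpp
#include <string>

// Hypothetical shared block: the reference count lives next to the data.
struct SharedData {
    int refCount;
    std::string value;  // stands in for message, header, or context data
};

class MemoryWrapper {
public:
    // Regular constructor: allocate the data and set the count to one.
    explicit MemoryWrapper(const std::string& value)
        : data_(new SharedData{1, value}) {}

    // Copy constructor: copy the pointer and increment the count.
    MemoryWrapper(const MemoryWrapper& other) : data_(other.data_) {
        ++data_->refCount;
    }

    // Assignment: release the old data, then share the new data.
    MemoryWrapper& operator=(const MemoryWrapper& other) {
        if (data_ != other.data_) {
            release();
            data_ = other.data_;
            ++data_->refCount;
        }
        return *this;
    }

    ~MemoryWrapper() { release(); }

    int useCount() const { return data_->refCount; }
    const std::string& value() const { return data_->value; }
    void setValue(const std::string& v) { data_->value = v; }

private:
    // Decrement the count and free the data when it reaches zero.
    void release() {
        if (--data_->refCount == 0) delete data_;
    }

    SharedData* data_;  // the only data member, as described in the text
};
```

Because copies share one data block, a change made through one copy is visible through every other copy, which is the copy-by-reference behavior mentioned above.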
An iSURF API Example
We describe a sample application that demonstrates
context-scoped event handlers and building
messages with the iSURF API. The example application
is a server that receives requests to set up SIP sessions
between two parties and, thus, implements third-party
call control. For each request, the application sends
an INVITE without a session descriptor to user A. User
A’s response contains a session descriptor, offering one
or more choices for the media protocol for the session.
The application does not ACK user A’s response at this
time but sends an INVITE to user B with a copy of
user A’s session descriptor. User B’s subsequent re-
sponse contains a session descriptor that answers the
offer from the INVITE. The application then ACKs user
B’s response and ACKs user A’s earlier response. The
application attaches the session descriptor from user
B’s response to the ACK sent to user A. Media now
flows between the two parties. When either party
wants to end the call, it sends a BYE to the application.
After responding to the BYE, the application sends a
BYE to the other endpoint. Alternatively, the applica-
tion can terminate the call. We illustrate the call flow
for our application in Figure 3. The application
terminates the call in this illustration.
The application models a call with the Call class.
The Call constructor sends out an INVITE to
party A and registers an instance of ALegHandler
for this call leg. Upon receipt of a 200-OK response,
ALegHandler::incomingInviteResponse
sends an INVITE to party B and registers an instance
of BLegHandler for the call leg between
the application and party B. On receipt of a 200-OK
response from party B,
BLegHandler::incomingInviteResponse acknowledges this response
and then acknowledges party A’s previous response.
Both event handlers contain pointers to a Call
instance and contain functions to handle BYE requests
and responses. Panel 2 contains most of the application
code. It does not show the BYE handling code
or the code for handling failure responses.
Implementation
The iSURF implementation is based on siptrans,
which we discuss next. The current iSURF imple-
mentation has the standard methods (INVITE, ACK,
BYE, REGISTER, OPTIONS, and CANCEL) as well as
SUBSCRIBE and NOTIFY. iSURF runs on the Solaris,*
Linux,* Windows,* and WinCE platforms. After the
siptrans discussion, we discuss how iSURF is built over
siptrans, iSURF’s performance overhead, and how to
extend iSURF with additional methods.
Siptrans
Siptrans is the transaction processing engine
for iSURF. Every SIP entity, with the exception of a
stateless proxy, requires a transaction manager. The
siptrans framework provides such a transaction man-
ager in the form of a multithreaded C library suitable
for being linked in with higher-layer applications (like
TUs). With reference to Figure 1, siptrans provides
the capabilities of the bottom three layers.
Siptrans provides an API consisting of eight methods
and three interfaces through which TUs use
the services it offers. These interfaces and methods
are reproduced in Table I.
In SIP, proxies are pure transaction processing
entities; thus, it appears reasonable that a transaction
processing library be culled from the internals of a SIP
proxy. Siptrans is refactored from a commercial grade,
[Figure 3. Call flow for iSURF API example. Participants: User A, Application, User B. Messages: INVITE (payload = 0), 200-OK response (payload = A), INVITE (payload = A), 200-OK response (payload = B), ACK (payload = 0), ACK (payload = B), media, BYE, 200-OK response, BYE, 200-OK response.]
Panel 2. iSURF example
// The iSURF headers that declare ContextEventHandler, Response, Bye,
// Payload, and the other SIP classes are assumed to be included here.
#include <string>
using std::string;

struct Call;

class ALegHandler : public ContextEventHandler {
    Call *call;
public:
    ALegHandler(Call *call) { this->call = call; }
    void incomingInviteResponse(const Response resp);
    void incomingByeRequest(const Bye);
};

class BLegHandler : public ContextEventHandler {
    Call *call;
public:
    BLegHandler(Call *call) { this->call = call; }
    void incomingInviteResponse(const Response resp);
    void incomingByeRequest(const Bye);
};

struct Call {
    Response aLegResponse;
    string aUser; string bUser;
    string aHost; string bHost;
    ALegHandler a;
    BLegHandler b;

    // Sends an Invite and registers "handler" for the resulting call context
    void sendInvite(string fromUser, string toUser, string host, Payload p,
                    ContextEventHandler *handler) {
        SipAddress sender(SipStack::getMyURI(fromUser));
        SipAddress receiver(SipURI(toUser, host));
        FromHeader from(sender);
        ToHeader to(receiver);
        StandardContactHeader contact(sender);
        Invite invite(to, from, contact, p);
        SipStack::sendMessage(invite);
        CallContext call(invite);
        call.setEventHandler(handler);
    }

    Call(string auser, string ahost, string buser, string bhost)
        : a(this), b(this) {
        aUser = auser; bUser = buser; aHost = ahost; bHost = bhost;
        // Sends an Invite to party A, sets "a" as event handler for call
        sendInvite(bUser, aUser, aHost, Payload(), &a);
    }
};

void ALegHandler::incomingInviteResponse(const Response resp) {
    // Ignore everything except 2xx responses
    if (resp.getResponseCode() < 200 || resp.getResponseCode() >= 300) return;
    call->aLegResponse = resp;
    // Send an Invite to party B, sets "b" as event handler for this call
    call->sendInvite(call->aUser, call->bUser, call->bHost, resp.getPayload(),
                     &call->b);
}

// Ack the response from B, then Ack the previous response from A.
// Attach B's payload to the Ack sent to A.
void BLegHandler::incomingInviteResponse(const Response resp) {
    // Ignore everything except 2xx responses
    if (resp.getResponseCode() < 200 || resp.getResponseCode() >= 300) return;
    Ack ack(resp);
    SipStack::sendMessage(ack);
    ack = Ack(call->aLegResponse);
    ack.setPayload(resp.getPayload());
    SipStack::sendMessage(ack);
}
Table I. Siptrans interfaces.
Name                            Interface or method   Description
siptrans_init_core              Method                Initializes the siptrans complex.
siptrans_version_get            Method                Returns the version of the siptrans library.
siptrans_log_set_level          Method                Sets the logging level for the siptrans library.
siptrans_log                    Interface             Implements the logging capabilities specific to the TU.
siptrans_sip_message_to_TU      Interface             Invoked by siptrans library to send a SIP message to the TU core.
siptrans_dispatch_sip_message   Method                Called by the TU to send a SIP message (request or response) to siptrans for further delivery.
siptrans_schedule_alarm         Method                Called by TU to schedule an alarm.
siptrans_alarm_to_TU            Interface             Invoked by the siptrans library when an alarm set by the TU needs to be fired.
siptrans_start                  Method                Called by TU to start the siptrans library.
siptrans_wait                   Method                Called by the TU to keep the calling thread active.
siptrans_stop                   Method                Called by TU to stop the siptrans library.
SIP: Session Initiation Protocol; TU: Transaction user
RFC 3261-compliant [4] proxy developed by Lucent
Technologies. The proxy is resident on the PacketIN®
application hosting environment [1]. During its development
cycle and as part of an ongoing effort to
comply with the SIP specification, the proxy has
participated in six SIP interoperability events. Thus, it
was a design goal of siptrans to inherit, as much as
possible, the tested functionality of the proxy.
Siptrans alleviates much of the drudgery associ-
ated with parsing, transport management, and trans-
action handling. It also provides a general purpose
alarming facility for the TU. Once initialized, it ac-
cepts requests or responses either from its controlling
TU or from the network. Further handling of the SIP
request (or response) depends on who initiated it, the
controlling TU or a peer across the network. This is
described in the next section. Siptrans fully imple-
ments the client state machines corresponding to the
INVITE and non-INVITE requests (Figures 5 and 6,
respectively, of RFC 3261 [4]) as well as the server
state machines corresponding to the INVITE and non-
INVITE requests (Figures 7 and 8, respectively, of RFC
3261 [4]).
Siptrans and request handling. A request can arrive
at siptrans from one of two sources: a peer across
the network or the TU. For example, siptrans relays a
request from the network to the TU or siptrans relays
a request from the TU to its intended destination on
the network.
• Requests arriving from the network. If a request ar-
rives at siptrans from the network, siptrans parses
it and creates a transaction if none exists. The
parsed request is subsequently passed to the TU
through the siptrans_sip_message_to_TU
interface. This is an interface that is provided by
the TU programmer and invoked by siptrans on
receiving a SIP request (or response). Siptrans does
not make any assumption about the details of this
interface; it only requires that the TU eventually
invokes the siptrans_dispatch_sip_message
method to pass a final response corresponding to
the request. The response will be added to the SIP
transaction maintained by siptrans and transmitted
out to the network.
• Requests arriving from the TU. Another source of
incoming requests to siptrans is the TU. When
the TU desires to send out a request, it creates
one and presents it to siptrans by invoking the
Siptrans_dispatch_sip_message method.
Siptrans creates a transaction and analyzes the
Request-URI or Route header (if present) to de-
rive the network addresses that the request
should be sent to. Siptrans supports, in part, the
SIP procedures required to locate SIP servers [3]:
it can issue Domain Name System (DNS) service
(SRV) resource record (RR) queries for the TCP
and UDP transports, as well as DNS A RR queries
(it does not yet support DNS naming authority
pointer [NAPTR] RR queries). If the DNS SRV RR
lookup succeeds, siptrans builds a list of preferred
servers and tries them in order until one
responds. If no server responds, siptrans creates a
local failure response and sends it to the TU, so
that the TU always receives a response to its request.
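The server selection and failover behavior just described can be sketched as follows. This is a simplified illustration rather than the siptrans implementation: the function names and the `try_send` hook are hypothetical, records of equal priority are ordered by descending weight instead of the weighted random selection that RFC 2782 specifies, and the use of status 408 for the local failure response is an assumption.

```python
def order_srv_records(records):
    """Order (priority, weight, host, port) tuples by ascending priority.

    Simplification: among records of equal priority, the higher weight is
    tried first instead of using weighted random selection.
    """
    return sorted(records, key=lambda r: (r[0], -r[1]))


def send_with_failover(request, srv_records, try_send):
    """Try each server in preference order until one responds.

    try_send(request, host, port) is a hypothetical transport hook that
    returns a response, or None when the server did not answer.
    """
    for _priority, _weight, host, port in order_srv_records(srv_records):
        response = try_send(request, host, port)
        if response is not None:
            return response
    # No server answered: synthesize a local failure response so the
    # caller still sees a response for its request (408 is assumed here).
    return {"status": 408, "reason": "Request Timeout", "local": True}


records = [
    (20, 0, "backup.example.com", 5060),
    (10, 5, "a.example.com", 5060),
    (10, 9, "b.example.com", 5060),
]
attempts = []


def fake_send(request, host, port):
    """Stand-in transport: only the backup server is reachable."""
    attempts.append(host)
    return {"status": 200} if host == "backup.example.com" else None


result = send_with_failover("OPTIONS", records, fake_send)
```

In this run the two priority-10 servers are tried first and fail, and the request succeeds at the priority-20 backup.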
Siptrans and response handling. Reliability and
retransmission characteristics differ in SIP for INVITE
and non-INVITE transactions. Non-2xx responses to
INVITE are retransmitted by the transaction layer
until an ACK is received (and absorbed by the trans-
action layer); 2xx responses for the INVITE are re-
transmitted by the TU and ACKs for such responses
are passed on to the TU. Furthermore, responses to
non-INVITE requests are never retransmitted on a timer;
they are cached on the server side and resent only
upon receipt of a duplicate request. For siptrans,
a response can arrive from one of two sources: the
TU or a peer across the network. A response origi-
nated by the TU is sent out on the network by siptrans
to its intended destination, whereas a response from
the network is presented to the TU.
• Responses arriving from the network. Assuming a
request from the TU was delivered to the appro-
priate peer, any responses to the request are
matched to the appropriate transaction by
siptrans and forwarded to the TU by invoking the
Siptrans_sip_message_to_TU interface.
Handling of responses in SIP differs for INVITE
and non-INVITE requests. First, retransmitted 2xx
responses for the INVITE are passed to the TU,
even if a transaction does not exist (as mandated
by RFC 3261 [4]). Second, INVITE requests
require an additional method, ACK, to complete,
whereas non-INVITE requests terminate when a
final response is received. Following the appropriate
SIP state machines, if siptrans receives a non-2xx
final response for an INVITE, it generates and sends
an ACK to the peer across the network and presents
the response to the TU.
Siptrans will not generate an ACK for a 2xx-class
response to the INVITE. For all non-INVITE
requests, siptrans will cache the response and
present it to the TU. Retransmissions of the
non-INVITE request will elicit the cached
response.
• Responses arriving from the TU. The TU invokes the
Siptrans_dispatch_sip_message method
to pass a response to siptrans. Siptrans simply
follows the Via list in the response to send it on-
ward (in SIP, responses are routed based on the
Via list).
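The rules for responses arriving from the network can be condensed into a small decision function. The sketch below is only an illustration of those rules: the function name and the action labels are hypothetical, and the text does not say what happens to unmatched responses other than a 2xx to an INVITE, so the sketch takes no action for them.

```python
def handle_network_response(method, status, has_transaction):
    """Return the actions a transaction layer takes for a response.

    method is the method of the original request; status is the response
    code. The action names are illustrative labels only.
    """
    if not has_transaction:
        # A retransmitted 2xx to an INVITE still reaches the TU.
        if method == "INVITE" and 200 <= status < 300:
            return ["pass_to_tu"]
        return []   # other unmatched responses: not covered by the text

    actions = []
    if method == "INVITE" and status >= 300:
        # Non-2xx final responses are acknowledged by the transaction layer.
        actions.append("send_ack")
    # Provisional, 2xx, and non-INVITE responses go to the TU without an ACK.
    actions.append("pass_to_tu")
    return actions


print(handle_network_response("INVITE", 486, True))   # ['send_ack', 'pass_to_tu']
print(handle_network_response("INVITE", 200, True))   # ['pass_to_tu']
```

The ACK for a 2xx response is deliberately absent: generating it is left to the TU.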
Siptrans handles error conditions such as when
the peer UA has closed a TCP connection and cannot
receive a response on that connection. In such cases,
siptrans attempts to open a new TCP connection to
the peer UA in order to send the response. If a new
connection is successfully established, siptrans will
send the response; otherwise, there is precious little
the transaction layer can do besides clean up the
transaction.
Siptrans and alarm. In addition to the transaction
and transport handling features, siptrans also pro-
vides a general purpose alarming facility to the TU.
The TU can use this facility to schedule alarms for
retransmitting 200-OK responses to an INVITE re-
quest, for instance. However, the alarming facility is
designed to be as generic as possible so that arbitrary
alarms can be scheduled and executed. The APIs pro-
vide for the TU to store an opaque data item in the
alarm, which is returned to it when the alarm fires.
This allows the TU to save state in the alarm for
reuse when the alarm fires. To schedule an alarm,
the TU invokes the Siptrans_schedule_alarm
method; siptrans, in turn, invokes a TU programmer
supplied interface, Siptrans_alarm_to_TU, when
the alarm fires.
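A generic alarm facility of this kind can be sketched with a priority queue. The class below is a hypothetical illustration, not the siptrans API: `AlarmScheduler`, `schedule`, and `run_until` are invented names, and time is passed in explicitly so that the example stays deterministic.

```python
import heapq
import itertools


class AlarmScheduler:
    """Illustrative alarm facility that hands opaque data back to its owner."""

    def __init__(self, alarm_to_tu):
        self._alarm_to_tu = alarm_to_tu   # callback supplied by the TU
        self._heap = []
        self._seq = itertools.count()     # tie-breaker keeps heap entries comparable

    def schedule(self, fire_at, opaque):
        """Register an alarm; `opaque` is stored untouched until it fires."""
        heapq.heappush(self._heap, (fire_at, next(self._seq), opaque))

    def run_until(self, now):
        """Fire every alarm due at or before `now`, earliest first."""
        while self._heap and self._heap[0][0] <= now:
            _fire_at, _seq, opaque = heapq.heappop(self._heap)
            self._alarm_to_tu(opaque)


fired = []
alarms = AlarmScheduler(fired.append)
alarms.schedule(1.0, {"resend": "200 OK", "call": "a"})
alarms.schedule(0.5, {"resend": "200 OK", "call": "b"})
alarms.run_until(0.75)   # only the alarm for call "b" is due
```

Because the scheduler never inspects the opaque item, the TU can store whatever state it needs, such as the response to retransmit.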
Building iSURF Over Siptrans or Other Frameworks
There are a number of issues when building
iSURF over a transaction framework:
• The transaction framework’s memory management
scheme. iSURF must know when memory refer-
ences from the transaction framework become in-
valid. If iSURF needs to retain data obtained from
the transaction framework after the data is freed,
it must copy the data before it is freed. In the case
of siptrans, iSURF was able to integrate the sip-
trans reference counting scheme with its own.
• Transaction layer timeouts. iSURF delivers transaction
layer timeouts to the application as 408-transac-
tion timeout response messages. Siptrans cre-
ates these response messages and passes these
messages to the iSURF core, which in turn passes
them to the application. If some other transaction
framework notified iSURF of the timeout through
another mechanism, then iSURF would have to
build the timeout response messages itself.
• Layering issues. Although the standard states the
responsibilities of the various SIP layers, imple-
mentations are free to reassign these responsibil-
ities. For example, siptrans is responsible for
placing Vias in outgoing requests and siptrans
determines the destination Internet Protocol
address for outgoing requests. If some other
transaction framework did not assume these
responsibilities, then the iSURF core would have
to assume them.
• Threading. iSURF needs to know the threading
structure of the transaction framework to avoid
deadlocks and to efficiently make its code thread
safe.
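To illustrate the timeout issue above, the sketch below shows the kind of adapter a core would need if its transaction framework reported timeouts through a callback rather than as response messages. All names here are hypothetical, and messages are modeled as plain dictionaries; the header list is the set that RFC 3261 requires a response to copy from its request.

```python
COPIED_HEADERS = ("Via", "From", "To", "Call-ID", "CSeq")


class TimeoutAdapter:
    """Turns a framework's timeout notification into a 408 response."""

    def __init__(self, deliver_to_core):
        self._deliver = deliver_to_core   # hypothetical entry point into the core

    def on_timeout(self, request):
        """Build a local 408 for the timed-out request and deliver it."""
        headers = {
            name: request["headers"][name]
            for name in COPIED_HEADERS
            if name in request["headers"]
        }
        self._deliver({
            "status": 408,
            "reason": "Request Timeout",
            "headers": headers,
            "locally_generated": True,    # marks the response as synthetic
        })


delivered = []
adapter = TimeoutAdapter(delivered.append)
adapter.on_timeout({
    "method": "OPTIONS",
    "headers": {"Call-ID": "abc123", "CSeq": "1 OPTIONS", "Subject": "ping"},
})
```

With such an adapter in place, the application sees the same 408 message whether the transaction layer or the core produced it.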
Performance
We investigated iSURF’s performance by writing a
client application that repeatedly initiates and termi-
nates sessions at a server application. We implemented
the client/server application pairs two ways: by using
the iSURF API and by using the siptrans API. We then
compared the performance of the two implementa-
tions. The iSURF client application underperformed
the siptrans client application by 10%. However, the
iSURF server application outperformed the siptrans
server application by 10%. We believe the iSURF
server application outperformed the siptrans server
application because the siptrans server copied head-
ers from requests to responses. Because of iSURF’s
sophisticated memory management, it merely copied
header references from requests to responses. There
was less opportunity to exploit this advantage in
the client application. Regardless, iSURF does not im-
pose a large overhead over its siptrans transaction
layer.
Extending iSURF
Adding another request type to iSURF involves
the following:
• Adding a new class representing the request
method. If a header must appear in a new request,
then every constructor for the class should include
this header as an argument. A constructor that
allows the new request to be built in a dialog
should also be added to the class.
• Adding a corresponding function to its event han-
dler class if there are any timed events associated
with the request. For example, when extending
iSURF to handle SUBSCRIBE-NOTIFY, we de-
clared an event handling function that is called
when a SUBSCRIBE expires.
• Implementing a periodic refresh for the request
if necessary. This was done for the SUBSCRIBE
request.
• Adding classes for any new headers necessary for
the request.
• Adjusting iSURF’s dialog state maintenance rou-
tines if a transaction involving the new request
affects dialog state.
• Providing access routines to dialog state related
to the new request. For example, iSURF
provides the application access to details on
active subscriptions created by the SUBSCRIBE
request.
• Determining whether to add timers that prevent a
transaction or dialog from being stuck in the same
state indefinitely. We added such a timer, so that
an INVITE transaction cannot be stuck in the
proceeding state indefinitely.
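The first two steps of this list can be illustrated with a short sketch. It does not reproduce the iSURF class hierarchy: `Request`, `SubscribeRequest`, `in_dialog`, and `EventHandler` are hypothetical names, and a dialog is modeled as a plain dictionary. The sketch relies on the fact that the Event header is mandatory in a SUBSCRIBE request.

```python
class Request:
    """Minimal stand-in for a framework's request base class."""

    def __init__(self, method, headers, dialog=None):
        self.method = method
        self.headers = dict(headers)
        self.dialog = dialog


class SubscribeRequest(Request):
    """New request type; every constructor takes the mandatory Event header."""

    def __init__(self, event, expires=3600):
        super().__init__("SUBSCRIBE", {"Event": event, "Expires": str(expires)})

    @classmethod
    def in_dialog(cls, dialog, event, expires=3600):
        """Alternate constructor that builds the request inside a dialog."""
        request = cls(event, expires)
        request.dialog = dialog
        request.headers["Call-ID"] = dialog["call_id"]
        return request


class EventHandler:
    """Handler class extended with a timed event for the new request."""

    def on_subscribe_expired(self, subscription):
        """Called when a subscription lapses; the default does nothing."""


refresh = SubscribeRequest.in_dialog({"call_id": "abc123"}, "presence", expires=600)
```

Making the header a required constructor argument means that a request lacking it cannot be built at all, which is the point of the first step.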
Conclusions/Next Steps
iSURF is a framework for building arbitrary user
agents. The goal of the framework is to ease the effort
of writing a user agent without surrendering general-
ity. We believe we have succeeded in this effort.
Moreover, we believe the performance penalty of
achieving this ease of effort is manageable.
At the basic message passing level, the message
representation is easy to learn, as it largely follows
the SIP grammar. The message constructors guide
the application writer in building correct messages.
Moreover, iSURF relieves the application of memory
management duty. iSURF provides optional higher-
level constructs that facilitate application building:
• iSURF helps to dispatch incoming messages to
context-based event handlers.
• iSURF allows applications to store application data
in contexts or messages.
• iSURF provides message resending/refresh
operations.
• iSURF has configurable plug-ins that handle many
phases of processing incoming messages.
iSURF is constructed using siptrans, a SIP transac-
tion library. The library has been designed to be com-
patible with any TU that implements the interfaces
detailed in Table I. iSURF is one such TU; we have also
been successful in using siptrans in another unrelated
project within Lucent. Siptrans is portable across Solaris
and Linux platforms. It has also been successfully ported
to the pSOS* real-time operating system and WinCE.
Work continues on both iSURF and siptrans. We
are investigating increasing iSURF’s performance by
speeding up siptrans itself and by better integrating the
iSURF and siptrans memory management. We are also
investigating storing iSURF’s core state persistently to
help application writers build reliable user agents.
*Trademarks
JAIN and Java are trademarks and Solaris is a registered
trademark of Sun Microsystems, Inc.
Linux is a registered trademark of Linus Torvalds.
pSOS is a trademark of Wind River Systems, Inc.
Windows is a registered trademark of Microsoft Corporation.
References
[1] Y. Chen, O. B. Clarisse, P. Collet, M. A. Hartman, L. Rodriguez, L. Velazquez, and B. A. Westergren, “Web Communication Services and the PacketIN Application Hosting Environment,” Bell Labs Tech. J., 7:1 (2002), 25–40.
[2] P. O’Doherty and M. Ranganathan, “JSR 32: JAIN SIP API Specification,” Aug. 2003, <http://jcp.org/en/jsr/detail?id=32>.
[3] J. Rosenberg and H. Schulzrinne, “Session Initiation Protocol (SIP): Locating SIP Servers,” IETF RFC 3263, June 2002, <http://www.ietf.org/rfc/rfc3263.txt?number=3263>.
[4] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, “SIP: Session Initiation Protocol,” IETF RFC 3261, June 2002, <http://www.ietf.org/rfc/rfc3261.txt?number=3261>.
[5] K. Singh, J. Lennox, S. Narayanan, and H. Schulzrinne, CINEMA: Columbia InterNet Extensible Multimedia Architecture, Columbia University, NY, Nov. 2002, <http://www1.cs.columbia.edu/~library/TR-repository/reports/reports-2002/cucs-011-02.pdf>.
[6] D. Tweedie, “JSR 125: JAIN SIP Lite,” July 2002, <http://jcp.org/en/jsr/detail?id=125>.
[7] Vovida, Vovida Open Communications Application Library (VOCAL), Apr. 2003, <http://www.vovida.org/applications/downloads/vocal/>.
(Manuscript approved May 2004)
ROBERT M. ARLEIN is a member of technical staff in the Services Infrastructure Research Department at Lucent Technologies in Murray Hill, New Jersey. He holds B.S. and A.M. degrees in mathematics from the University of Wisconsin in Madison and an M.S. degree in computer science from New York University in New York City. He is currently investigating the uses of SIP-based components in a converged service network. Mr. Arlein holds one patent and has one application pending.
VIJAY K. GURBANI is a distinguished member of technical staff in the Wireless Next Generation Architecture and Evolution Department at Lucent Technologies in Naperville, Illinois. He holds B.Sc. and M.Sc. degrees in computer science from Bradley University in Peoria, Illinois. He is a Ph.D. candidate in computer science at the Illinois Institute of Technology in Chicago. Mr. Gurbani is currently involved in the specification, prototyping, and implementation of services based on SIP. His research interests are Internet telephony services, Internet signaling protocols, pervasive computing in the telecommunications domain, distributed systems programming, and programming languages. He is a member of the ACM and the IEEE Computer Society. He holds one patent and has four applications pending.