An extensible framework for constructing SIP user agents


  • An Extensible Framework for Constructing SIP User Agents

    Robert M. Arlein and Vijay K. Gurbani

    A Session Initiation Protocol (SIP) user agent is an endpoint in a signaling network that can send or receive SIP messages. One can build a functional user agent in a few hundred lines of Java* code that sets up a call between two SIP phones. However, such a user agent will not fully comply with the protocol. Writing a compliant user agent is a complex undertaking involving thousands of lines of code. The iSURF framework greatly reduces the effort of this undertaking. iSURF uses a SIP transaction library called siptrans as its transaction processing layer. However, iSURF can use a different transaction library, and siptrans can be used in a different framework or even in a SIP proxy. In this paper, we describe the protocol requirements for a SIP user agent and how our framework facilitates building such an agent. We also describe the design and architecture of both iSURF and siptrans. © 2004 Lucent Technologies Inc.

    Introduction

    Session Initiation Protocol (SIP) [4] is an application layer signaling protocol that can be used for many purposes. For example, it may be used to set up a multimedia session among several participants, it may be used as a bridge to legacy networks, or it may be used to configure or implement various telephony services. There are two major SIP entities: a proxy and a user agent. User agents are signaling endpoints, while proxies aid in the rendezvous of two SIP user agents. Most SIP-based applications require specialized user agents. A user agent may be built from scratch with relative ease. In fact, one of the authors built the third-party call control application discussed in the section "An iSURF API Example" with just 500 lines of Java* code. However, this application makes many simplifying assumptions that make the application non-SIP compliant. For example, the application assumes that all the requests it sends are received. Building a SIP-compliant application requires much more code.

    The solution is to write user agents with a SIP library and an application programming interface (API) that encapsulate many of the details of the SIP protocol. There are two such standard APIs for Java: the JAIN* SIP API [2] and JAIN SIP Lite [6]. Both of these APIs are relatively low level and message based.

    There are no such standard C/C++ frameworks for building user agents. Some of the available nonstandard frameworks [5, 7] do not have a user-friendly API (in our opinion) and, furthermore, none of these frameworks could be integrated easily with our lower-level transaction processing framework, siptrans. The siptrans framework is both stable and fully standards compliant. We are not sure whether the other frameworks are also stable and standards compliant. For these reasons, we built our own framework, iSURF (iSIP user agent resource framework), over the siptrans framework. (iSIP is a proxy from which the siptrans layer was extracted.) iSURF provides a low-level API with configurable, higher-level routines (plug-ins) to handle common processing, such as authentication. Although iSURF runs over siptrans, it can be modified to run over similar frameworks. Likewise, other higher-level frameworks can run over siptrans.

    Bell Labs Technical Journal 9(3), 87-100 (2004). © 2004 Lucent Technologies Inc. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience. DOI: 10.1002/bltj.20044

  • Panel 1. Abbreviations, Acronyms, and Terms

    API: Application programming interface
    ASCII: American Standard Code for Information Interchange
    ABNF: Augmented Backus-Naur form
    DNS: Domain name server
    HTTP: Hypertext Transfer Protocol
    iSIP: Lucent's SIP proxy
    iSURF: iSIP user agent resource framework
    MIME: Multipurpose Internet mail extensions
    SIP: Session Initiation Protocol
    SRV: Service, a type of DNS record
    TCP: Transmission Control Protocol
    TLS: Transport layer security
    TU: Transaction user
    UA: User agent
    UCS: Universal Character Set
    UDP: User Datagram Protocol
    URI: Uniform resource identifier
    UTF-8: UCS Transformation Format, an 8-bit lossless encoding of Unicode characters

    The rest of this paper describes the iSURF architecture and API, illustrates the API with a sample application, and describes our implementation.


    iSURF Architecture

    Figure 1 depicts the reference SIP stack as outlined in RFC 3261 [4]. It also shows which layers of the reference stack are realized by the two frameworks we discuss in this paper. Note that, as is the case with reference models, the cleanly separated layers often have interdependencies when realized in software. For example, even though the reference model depicts the lowest layer as a syntax/encoding layer, it is obvious that in order for a SIP request to get to it, it must have visited the transport layer. That is, the transport layer must have allocated appropriate resources (e.g., socket descriptors, internal buffers associated with these sockets) and recognized the boundaries of the SIP message before sending it to the syntax/encoding layer.

    Working from the bottom, the first layer is the syntax and encoding layer. SIP is an American Standard Code for Information Interchange (ASCII) protocol; its syntax is defined by the augmented Backus-Naur form (ABNF) in [4]. Unlike traditional telephony signaling protocols, which are encoded and transported in abstract syntax notation, SIP requires no special encoding/decoding software. A SIP stack normally provides routines for parsing the UTF-8 encoded ASCII stream into internal data structures for consumption by the layers above.

    [Figure 1. A SIP stack hierarchy. Labels: transaction user; transport handling; stateless proxy, UAS, UAC, redirect, registrar; transaction-/call-stateful proxy, B2BUA. B2BUA: back-to-back user agent; iSIP: Lucent's SIP proxy; iSURF: iSIP user agent resource framework; UAC: user agent client; UAS: user agent server.]

    The transport handling layer defines how clients and servers behave with respect to the multiple transports supported by SIP. Unlike the Hypertext Transfer Protocol (HTTP), which runs over Transmission Control Protocol (TCP) and transport layer security (TLS)-over-TCP, SIP has been defined to work over TCP, TLS-over-TCP, User Datagram Protocol (UDP), and Stream Control Transmission Protocol (SCTP). Transport plurality is thus an additional facet of a SIP stack. The transport layer hides the idiosyncrasies of each transport from the upper layers. For instance, the transport layer will choose to transmit a SIP message over a congestion-controlled transport, such as TCP, if it determines that the message is so large that sending it over UDP will fragment it. The transport layer will place only a single SIP message into a UDP datagram, as specified by the standard.

    The next layer, the transaction layer, is present in all SIP entities except stateless proxies. Transactions are a fundamental concept in SIP; a transaction consists of a request sent by a client, zero or more provisional responses, and one or more final responses. The ACK request is considered part of the INVITE transaction if the final response is a non-2xx response; otherwise, it is considered a separate transaction. The transaction layer is also responsible for matching responses to appropriate requests and for scheduling the retransmission of requests and responses over unreliable transports.
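    The response-matching rule can be illustrated with a short C++ sketch. It follows the client-side rule of RFC 3261 (a response matches when the branch parameter of its top Via header and the method in its CSeq header equal those of the request). The types and names below (`RequestInfo`, `ResponseInfo`, `ClientTransactionTable`) are hypothetical and are not part of siptrans, whose internals the paper does not show.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified view of the fields needed for matching.
struct RequestInfo {
    std::string topViaBranch;  // branch parameter of the top Via header
    std::string method;        // e.g. "INVITE", "OPTIONS"
};

struct ResponseInfo {
    std::string topViaBranch;  // branch parameter of the top Via header
    std::string cseqMethod;    // method part of the CSeq header
    int statusCode;
};

class ClientTransactionTable {
public:
    // Records a client transaction for an outgoing request and returns its id.
    int add(const RequestInfo& req) {
        assert(!req.topViaBranch.empty());
        const int id = nextId_++;
        table_[std::make_pair(req.topViaBranch, req.method)] = id;
        return id;
    }

    // Returns the id of the matching transaction, or -1 if there is none.
    int match(const ResponseInfo& resp) const {
        auto it = table_.find(std::make_pair(resp.topViaBranch, resp.cseqMethod));
        return it == table_.end() ? -1 : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, int> table_;
    int nextId_ = 1;
};

// True if the ACK for this final INVITE response belongs to the INVITE
// transaction (non-2xx); a 2xx response is acknowledged in a separate one.
bool ackIsPartOfInviteTransaction(int finalStatusCode) {
    assert(finalStatusCode >= 200 && finalStatusCode <= 699);
    return finalStatusCode >= 300;
}
```

    Keying the table on the (branch, method) pair keeps a CANCEL, which reuses the branch of the request it cancels, from being matched to that request's transaction.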

    Siptrans implements the transaction layer and transport layer. iSURF is built over siptrans and implements the transaction user (TU) layer. As Figure 2 depicts, iSURF partitions this layer into two pieces: the iSURF core and the application. The application builds and sends outgoing messages. The application can delegate some incoming messages to plug-ins for processing. The iSURF core notifies the application's event handlers of incoming messages as well as other events. The iSURF core, which maintains state for calls and dialogs in structures called contexts, is described later. The iSURF API helps the application build messages in call or dialog contexts. The rest of this section describes this architecture in more detail.

    Transaction/Transport Layers

    The transport layer works as described above and is responsible for sending and receiving messages over the network. The transaction layer manages SIP transactions, which consist of a SIP request and the resulting responses. To this end, it maintains a state machine whose inputs are incoming SIP messages, outgoing SIP messages, and timeout indications. In addition to maintaining state, the transaction layer does the following:

      • Retransmits outgoing requests until the receipt of a response or until a timeout.
      • Handles duplicate incoming requests by sending cached responses.
      • Passes messages from the TU to the transport layer.
      • Passes messages from the transport layer to the TU.
      • Generates ACKs for incoming non-2xx class responses to INVITE transactions.
      • Rejects badly formed incoming messages.
      • Passes validated incoming messages to the TU.
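    To make the first item concrete, the following C++ sketch computes when a request would be retransmitted over an unreliable transport. It assumes the default RFC 3261 timer values (T1 = 500 ms, T2 = 4 s, transaction timeout at 64 × T1); the function name `retransmitTimes` is hypothetical, and the paper does not say how siptrans schedules its timers internally.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the times, in milliseconds after the first transmission, at which
// a request is retransmitted. The interval starts at t1Ms and doubles after
// each retransmission, up to capMs; retransmission stops once the
// transaction timeout (timeoutMs) is reached.
std::vector<int> retransmitTimes(int t1Ms, int capMs, int timeoutMs) {
    assert(t1Ms > 0 && capMs >= t1Ms && timeoutMs > 0);
    std::vector<int> times;
    int interval = t1Ms;
    int elapsed = interval;
    while (elapsed < timeoutMs) {
        times.push_back(elapsed);
        interval = std::min(interval * 2, capMs);
        elapsed += interval;
    }
    return times;
}
```

    With the defaults, a non-INVITE request is described by `retransmitTimes(500, 4000, 32000)`, where the interval is capped at T2. An INVITE request doubles its interval without that cap, which corresponds to `retransmitTimes(500, 32000, 32000)`.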

    The TU Layer

    The TU layer is responsible for responding to valid incoming messages, rejecting invalid messages, and generating fresh requests. To do this properly, this layer must maintain dialog state and other states relating to message payloads. As noted above, these responsibilities are shared between the application and the user agent (UA) core. The core's responsibilities are:

      • Maintaining dialog and call state.
      • Notifying the application of various timeouts and possibly responding to a timeout autonomously. For example, if directed by the application, iSURF will retransmit successful responses to INVITEs until the arrival of an ACK.
      • Processing incoming messages delegated to iSURF through registered plug-ins.


    The application's responsibilities are:

      • Handling incoming messages within event handlers that the application registers with the iSURF core.
      • Delegating all or some of the processing of incoming messages to plug-ins. The application provisions these plug-ins as well. For example, the application can register a plug-in to handle incoming OPTIONS requests.
      • Originating outgoing requests and responses.
      • Maintaining state, other than the state maintained by the framework. For example, the application maintains media state.

    iSURF API

    While there is a place for a higher-level interface that abstracts away the complications of SIP, such interfaces cannot be used to build an arbitrary SIP user agent. These interfaces are useful as a supplement to a lower-level interface. The iSURF API, like the SIP standard, is basically message based. iSURF's plug-in facility may be used to elevate the level of the iSURF API by providing standard routines that the application registers and configures.


    The iSURF API provides the following facilities (the important facilities are discussed in more detail later in this section):

      • An object-oriented message representation including constructors for building messages based on other messages or contexts, e.g., building a response from a request.
      • Registration of event handlers to handle notifications of incoming messages, timeouts, and other events.
      • Methods to send a message or repeatedly send a message at intervals, e.g., retransmitting an INVITE response or refreshing a REGISTER.
      • An interface to configure the iSURF runtime environment.
      • Application-controlled timers.
      • A mechanism for the application to store and retrieve data from messages or contexts. We show how this facility is used in the section "An iSURF API Example."
      • Registration of plug-ins and support for registered plug-ins.

    [Figure 2. iSURF architecture. Labels: application main; event handlers; plug-ins (standard event handlers); transaction/transport layer; outgoing messages; events including incoming messages; handle message if plug-in is active; update context with message; incoming messages and other events; application layer; iSURF core layer; TU layer. iSIP: Lucent's SIP proxy; iSURF: iSIP user agent resource framework; TU: transaction user.]

    Events and Event Handlers

    The iSURF runtime environment is basically event driven. The application initializes the framework and then optionally sends one or more SIP requests. Subsequent incoming messages and other events are then delivered to the application's event handlers. There are several categories of events: incoming messages, various timeouts, notifications of plug-in activity, notifications of context creation and destruction, and notifications of subscription creation and destruction. Every SIP message that shares the same Call-ID header is part of the same call context. Roughly, messages in the same call context that also share From and To headers are part of the same dialog context. (Contexts are discussed in a later section.) The application can infer context creation and context destruction from the incoming and outgoing messages. However, the rules for context creation and destruction are quite complicated. Notifying the application of these events can simplify the application.

    Every iSURF application registers a single global event handler. In addition, iSURF allows context-scoped event handlers, i.e., handlers for messages in the same context. We have not seen mention of context-scoped event handlers in the JAIN [2] or VOCAL [7] application frameworks. This facility can better organize an application when message treatment is a function of message context, as is often the case. It is the case with third-party call control, as illustrated in the "An iSURF API Example" section.

    If a context-scoped event handler is active for an incoming message, that handler is invoked for the message rather than the global event handler. If a message can be handled by both a dialog-scoped and a call-scoped event handler, only the dialog-scoped event handler is invoked.
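    The precedence rule above (dialog-scoped handler first, then call-scoped, then global) can be sketched in C++ as follows. The names `EventHandler`, `IncomingMessage`, and `HandlerRegistry` are hypothetical stand-ins, not iSURF classes, and the dialog key is assumed to be a plain string.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical handler type; iSURF's real handler interface is richer.
struct EventHandler {
    std::string name;
};

struct IncomingMessage {
    std::string callId;    // value of the Call-ID header
    std::string dialogId;  // assumed dialog key; empty if not in a dialog
};

class HandlerRegistry {
public:
    explicit HandlerRegistry(const EventHandler* global) : global_(global) {
        assert(global != nullptr);  // every application has a global handler
    }

    void setCallHandler(const std::string& callId, const EventHandler* h) {
        callHandlers_[callId] = h;
    }

    void setDialogHandler(const std::string& dialogId, const EventHandler* h) {
        dialogHandlers_[dialogId] = h;
    }

    // Picks exactly one handler: dialog-scoped, else call-scoped, else global.
    const EventHandler* resolve(const IncomingMessage& m) const {
        if (!m.dialogId.empty()) {
            auto d = dialogHandlers_.find(m.dialogId);
            if (d != dialogHandlers_.end()) return d->second;
        }
        auto c = callHandlers_.find(m.callId);
        if (c != callHandlers_.end()) return c->second;
        return global_;
    }

private:
    const EventHandler* global_;
    std::map<std::string, const EventHandler*> callHandlers_;
    std::map<std::string, const EventHandler*> dialogHandlers_;
};
```

    Resolving to a single handler, rather than invoking every handler that applies, mirrors the rule that only the most specific scope sees the message.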

    Message Representation

    iSURF presents an object-oriented representation of SIP messages and message components that models the RFC 3261 grammar. The application never directly accesses object data but uses member functions to manipulate the data. An application never needs to explicitly free these objects, as described below. The iSURF message constructors simplify building messages. When a message is built from scratch, constructor arguments contain some of the mandatory message headers for the message. The other mandatory headers are derived. For example, a constructor for an OPTIONS request takes a source and destination uniform resource identifier (URI) as arguments and uses these arguments to build the Request-URI, From, and To headers. The constructor also fills in the Via, Max-Forwards, Call-ID, and CSeq header fields. When the message is not built from scratch, the message headers are inherited from another message or context. For example, a response may be constructed from a request, in which case the From header, To header, CSeq header, Via headers, and Call-ID header are copied from the request to the response.
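    As an illustration of building a response from a request, the C++ sketch below copies the five header groups named above. `SimpleRequest`, `SimpleResponse`, and `makeResponse` are hypothetical and keep headers as plain strings; iSURF's own Request and Response classes use typed header objects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flat representation of the headers relevant to this example.
struct SimpleRequest {
    std::string method;
    std::string from;
    std::string to;
    std::string callId;
    std::string cseq;
    std::vector<std::string> vias;  // all Via headers, topmost first
};

struct SimpleResponse {
    int statusCode;
    std::string from;
    std::string to;
    std::string callId;
    std::string cseq;
    std::vector<std::string> vias;
};

// Builds a response whose From, To, Call-ID, CSeq, and Via headers are
// copied from the request, so the caller supplies only the status code.
SimpleResponse makeResponse(const SimpleRequest& req, int statusCode) {
    assert(statusCode >= 100 && statusCode <= 699);
    return SimpleResponse{statusCode, req.from, req.to,
                          req.callId, req.cseq, req.vias};
}
```

    A real user agent server would also add a tag to the To header of most responses; that step is omitted here to keep the sketch limited to the copying described above.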

    When an application sends or receives a message, iSURF only checks those elements of the message that relate to maintaining context information. For performance reasons, iSURF does not run other checks. For example, iSURF checks the number in the CSeq header of messages in dialogs, but does not check to ensure that the Event header is not present in an INVITE request.

    Every SIP message is represented as an instance of the Message class. This class has two subclasses: Request and Response. Furthermore, for each request method there is a subclass of Request such as Invite, Ack, or Options. A request whose method is unknown to the implementation is simply represented as an instance of the base Request class. The Message class is a subclass of the MemoryWrapper class, which handles the memory management for instances of Message. This mechanism is described later. The Message class contains methods for manipulating the headers and other components of a message.

    For each known header type there is an iSURF class such as FromHeader and ToHeader. iSURF provides methods to manipulate the various header components. Header types unknown to the implementation are represented by the Header class, which represents such headers simply as a header name and header value.

    iSURF represents message content and the associated content-related headers as the Payload class and so deviates from the standard SIP representation in which the content-related headers are not grouped separately. With this scheme, the Content-Length header is never specified but is derived from the length of the string representation of the payload and, therefore, there can never be a discrepancy between the actual payload length and length stated in Content-Length. The Payload class represents the actual payload as a string. Subclasses of Payload provide richer representation than string for Session Description Protocol (SDP) and for multi-part MIME attachments. A multi-part MIME attachment is simply represented as a list of Payloads.
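    The idea of deriving Content-Length instead of storing it can be sketched in C++ as follows. `SimplePayload` is a hypothetical stand-in for iSURF's Payload class, whose actual interface the paper does not list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical payload type: the body plus its content-related header.
// Content-Length is computed on demand, so it can never disagree with
// the body that is actually sent.
class SimplePayload {
public:
    SimplePayload(std::string contentType, std::string body)
        : contentType_(std::move(contentType)), body_(std::move(body)) {
        assert(!contentType_.empty() || body_.empty());
    }

    void setBody(std::string body) { body_ = std::move(body); }

    std::size_t contentLength() const { return body_.size(); }

    // Renders the content-related headers, a blank line, and the body.
    std::string toString() const {
        std::string out;
        if (!body_.empty()) out += "Content-Type: " + contentType_ + "\r\n";
        out += "Content-Length: " + std::to_string(contentLength()) + "\r\n\r\n";
        out += body_;
        return out;
    }

private:
    std::string contentType_;
    std::string body_;
};
```

    Because there is no setter for the length, replacing the body with `setBody` automatically changes the Content-Length value emitted by `toString`.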

    Message Resending Mechanisms

    iSURF provides a mechanism to periodically resend the same message. For example, iSURF can repeatedly resend a 2xx INVITE response at appropriate (exponentially increasing) intervals until the response has been acknowledged. Similarly, iSURF can periodically send REGISTER requests or SUBSCRIBE renewals.

    State/Contexts

    iSURF maintains dialog and call context state. RFC 3261 describes dialog state in detail. Dialogs are established after a provisional response or a successful final response to an INVITE request. Extensions to the standard describe other means of establishing a dialog. For example, a successful response to a SUBSCRIBE request establishes a dialog. Once a dialog has been established, iSURF updates dialog state as additional messages in the dialog are encountered. Certain messages terminate a dialog. For example, a dialog that has been established by an INVITE may be terminated by a BYE, or certain responses to re-INVITEs or to re-SUBSCRIBEs in a dialog will terminate the dialog. Dialogs that have not been confirmed by a final response are terminated by a subsequent failure response.


    The SIP standard does not mention call context explicitly. However, in iSURF, a call context is established on processing a request with a new Call-ID value. Every message with this Call-ID is in the same call context. If a response to that request establishes a dialog, the call context terminates when every dialog associated with the request terminates; otherwise, the call context terminates when the transaction associated with the request terminates.
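    The call context lifetime rule can be sketched in C++ as follows. `ContextTable` and its member names are hypothetical, dialogs are identified by an assumed string key within their call, and the sketch ignores details such as several concurrent transactions within one call.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical table of call contexts, keyed by Call-ID. Each call context
// tracks the set of dialogs currently established within it.
class ContextTable {
public:
    // Called for every request; returns true if a new call context was created.
    bool onRequest(const std::string& callId) {
        assert(!callId.empty());
        return calls_.insert(std::make_pair(callId, std::set<std::string>())).second;
    }

    void onDialogEstablished(const std::string& callId, const std::string& dialogKey) {
        calls_[callId].insert(dialogKey);
    }

    // Returns true if removing this dialog also ended the call context.
    bool onDialogTerminated(const std::string& callId, const std::string& dialogKey) {
        auto it = calls_.find(callId);
        if (it == calls_.end()) return false;
        it->second.erase(dialogKey);
        return eraseIfIdle(it);
    }

    // Returns true if the call context ended because no dialog was established.
    bool onTransactionTerminated(const std::string& callId) {
        auto it = calls_.find(callId);
        if (it == calls_.end()) return false;
        return eraseIfIdle(it);
    }

    bool hasCall(const std::string& callId) const { return calls_.count(callId) != 0; }

private:
    typedef std::map<std::string, std::set<std::string> > CallMap;

    // Ends the call context when it has no dialogs left.
    bool eraseIfIdle(CallMap::iterator it) {
        if (!it->second.empty()) return false;
        calls_.erase(it);
        return true;
    }

    CallMap calls_;
};
```

    In this sketch a call context with live dialogs survives the end of its initial transaction and ends only when its last dialog is removed.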

    Applications can use contexts in several ways. iSURF simplifies building messages in dialog contexts and, as explained in the "Events and Event Handlers" section, the application can use context-scoped event handlers to improve its organization. As another convenience, the application can store application data in a context and retrieve the data when processing a message in the context. Finally, the application can access the context state itself. For example, the application can access the list of active dialogs established with a call.

    As mentioned previously, for performance reasons, iSURF only checks that incoming or outgoing messages do not violate state constraints such as:

      • A response must match a request.
      • Requests in a dialog must be in order.
      • It is invalid to send a re-INVITE before receiving confirmation of the previous INVITE.

    The application is responsible for all other message validation including the validation described in Section 8 of RFC 3261. The application can delegate some of this responsibility by configuring and registering plug-ins to validate messages.

    Plug-ins

    Plug-ins are iSURF procedures that are configured and registered by the application to partially or completely handle incoming messages. As with incoming event handlers, plug-ins may be registered globally or in the scope of a dialog or call. We now describe the types of plug-ins and their order of execution. Multiple plug-ins may be executed for a single message. The first plug-in that can be executed for an incoming message checks the message syntax using the procedures described in Section 8 of RFC 3261. The application configures this plug-in by indicating the methods it recognizes, indicating the headers it recognizes, and so forth. If this plug-in rejects a message, iSURF invokes the application's "plug-in rejects" event handler and none of the application's incoming message handlers are invoked for this message.

    Otherwise, if the message is a request and if an authentication verification plug-in is active, it is invoked. The application configures this plug-in with a policy that indicates which requests to challenge. If this plug-in generates an authorization challenge for the request, iSURF invokes the application's "plug-in challenges request" event handler and none of the application's incoming message handlers are invoked.

    Otherwise, at most one plug-in is now invoked for the message. These final plug-ins are specified for requests by the request method or specified for responses by the response code. For example, there is a plug-in to handle OPTIONS requests and a plug-in to handle 301-Moved Permanently responses. If a plug-in is invoked at this time, then none of the application's incoming event handlers are executed for the message. After the plug-in executes, the application is notified by either the "plug-in rejects message" or the "plug-in handles message" event.
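    The order of execution described in this section can be summarized with a C++ sketch. Everything here is hypothetical (`PluginChain`, `Incoming`, `Outcome`, and the use of `std::function` callbacks); it only models which stage runs and which notification results, not how iSURF configures its plug-ins.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// The notification the application would receive for a message.
enum class Outcome { PluginRejected, PluginChallenged, PluginHandled, DeliveredToApplication };

struct Incoming {
    bool isRequest;
    std::string method;  // meaningful for requests
    int statusCode;      // meaningful for responses
};

struct PluginChain {
    typedef std::function<bool(const Incoming&)> Check;

    Check syntaxOk;                     // stage 1; empty if not registered
    Check mustChallenge;                // stage 2, requests only; empty if not registered
    std::map<std::string, Check> byMethod;  // stage 3 for requests; true = handled
    std::map<int, Check> byStatus;          // stage 3 for responses; true = handled

    Outcome dispatch(const Incoming& m) const {
        assert(m.isRequest || (m.statusCode >= 100 && m.statusCode <= 699));
        // Stage 1: the syntax-checking plug-in may reject the message.
        if (syntaxOk && !syntaxOk(m)) return Outcome::PluginRejected;
        // Stage 2: the authentication plug-in may challenge a request.
        if (m.isRequest && mustChallenge && mustChallenge(m)) return Outcome::PluginChallenged;
        // Stage 3: at most one plug-in, chosen by method or response code.
        const Check* last = nullptr;
        if (m.isRequest) {
            auto it = byMethod.find(m.method);
            if (it != byMethod.end()) last = &it->second;
        } else {
            auto it = byStatus.find(m.statusCode);
            if (it != byStatus.end()) last = &it->second;
        }
        if (last) return (*last)(m) ? Outcome::PluginHandled : Outcome::PluginRejected;
        // No plug-in took the message: the application's handlers see it.
        return Outcome::DeliveredToApplication;
    }
};
```

    Returning a single outcome per message reflects the rule that once any plug-in rejects, challenges, or handles a message, the application's incoming message handlers are not invoked for it.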

    Memory Management

    iSURF seeks to reduce the onerous task of memory management by automatically managing the memory for the following objects: messages, message components, and contexts. All of these objects are represented as subclasses of the MemoryWrapper class, which contains a pointer to the data for these objects; that data includes a reference count. This single pointer is the only data member in the various subclasses. Copy constructors, regular constructors, assignment operators, and destructors for MemoryWrapper are designed to implement a reference counting scheme as well as copy-by-reference. Constructing an instance of one of these subclasses results in allocating data for the instance, storing a pointer to the allocated data in MemoryWrapper, and setting the reference count in the allocated data to one. Copying a MemoryWrapper instance is implemented by a pointer copy and an increment of the reference count. Destroying an instance decreases the reference count for the instance. Memory is freed when the reference count goes to zero. Strings or lists are represented using the C++ standard template library, which automatically manages the memory for these types.
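    A minimal C++ sketch of such a reference-counting wrapper is shown below. `SharedData` and `Handle` are hypothetical names standing in for the allocated data and for MemoryWrapper; the sketch is not thread-safe and is not the iSURF implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>

// The heap-allocated data, with the reference count stored alongside it.
struct SharedData {
    int refCount;
    std::string value;
};

// A handle whose only data member is a pointer to the shared data.
class Handle {
public:
    // Construction allocates the data and sets the reference count to one.
    explicit Handle(std::string v) : data_(new SharedData{1, std::move(v)}) {}

    // Copying copies the pointer and increments the count (copy-by-reference).
    Handle(const Handle& other) : data_(other.data_) { ++data_->refCount; }

    Handle& operator=(const Handle& other) {
        if (data_ != other.data_) {
            release();
            data_ = other.data_;
            ++data_->refCount;
        }
        return *this;
    }

    // Destruction decrements the count and frees the data at zero.
    ~Handle() { release(); }

    int useCount() const { return data_->refCount; }
    const std::string& value() const { return data_->value; }
    void setValue(std::string v) { data_->value = std::move(v); }

private:
    void release() {
        assert(data_->refCount > 0);
        if (--data_->refCount == 0) delete data_;
    }

    SharedData* data_;
};
```

    Because copies share one data block, a change made through one handle is visible through every other handle to the same object, which is the copy-by-reference behavior described above.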

    An iSURF API Example

    We describe a sample application that demonstrates context-scoped event handlers and building messages with the iSURF API. The example application is a server that receives requests to set up SIP sessions between two parties and, thus, implements third-party call control. For each request, the application sends an INVITE without a session descriptor to user A. User A's response contains a session descriptor, offering one or more choices for the media protocol for the session. The application does not ACK user A's response at this time but sends an INVITE to user B with a copy of user A's session descriptor. User B's subsequent response contains a session descriptor that answers the offer from the INVITE. The application then ACKs user B's response and ACKs user A's earlier response. The application attaches the session descriptor from user B's response to the ACK sent to user A. Media now flows between the two parties. When either party wants to end the call, it sends a BYE to the application. After responding to the BYE, the application sends a BYE to the other endpoint. Alternatively, the application can terminate the call. We illustrate the call flow for our application in Figure 3. The application terminates the call in this illustration.

    The application models a call with the Call class. The Call constructor sends out an INVITE to party A and registers an instance of ALegHandler for this call leg. Upon receipt of a 200-OK response, ALegHandler::incomingInviteResponse sends an INVITE to party B and registers an instance of BLegHandler for the call leg between the application and party B. On receipt of a 200-OK response from party B, BLegHandler::incomingInviteResponse acknowledges this response and then acknowledges party A's previous response. Both event handlers contain pointers to a Call instance and contain functions to handle BYE requests and responses. Panel 2 contains most of the application code. It does not show the BYE handling code and the code for failure responses.

    Implementation

    The iSURF implementation is based on siptrans, which we discuss next. The current iSURF implementation supports the standard methods (INVITE, ACK, BYE, REGISTER, OPTIONS, and CANCEL) as well as SUBSCRIBE and NOTIFY. iSURF runs on the Solaris,* Linux,* Windows,* and WinCE platforms. After the siptrans discussion, we discuss how iSURF is built over siptrans, iSURF's performance overhead, and how to extend iSURF with additional methods.

    Siptrans

    Siptrans is the transaction processing engine for iSURF. Every SIP entity, with the exception of a stateless proxy, requires a transaction manager. The siptrans framework provides such a transaction manager in the form of a multithreaded C library suitable for being linked in with higher-layer applications (such as TUs). With reference to Figure 1, siptrans provides the capabilities of the bottom three layers.

    Siptrans provides an API consisting of eight methods and three interfaces that allows TUs to use the services it offers. These interfaces and methods are reproduced in Table I.

    In SIP, proxies are pure transaction processing entities; thus, it appears reasonable that a transaction processing library be culled from the internals of a SIP proxy. Siptrans is refactored from a commercial-grade, RFC 3261-compliant [4] proxy developed by Lucent Technologies. The proxy is resident on the PacketIN application hosting environment [1]. During its development cycle and as part of an ongoing effort to be compliant with the SIP specification, the proxy has participated in six SIP interoperability events. Thus, it was a design goal of siptrans to inherit, as much as possible, the tested functionality of the proxy.

    [Figure 3. Call flow for iSURF API example. Participants: User A, Application, User B. Messages shown: Invite (payload0), 200-OK Response (payloadA), Invite (payloadA), 200-OK Response (payloadB), ACK (payload0), ACK (payloadB), and two further 200-OK Responses.]

    Panel 2. iSURF example

        // The iSURF declarations used below (ContextEventHandler, Response, Bye,
        // Payload, SipStack, and so on) come from the framework's headers,
        // which the paper does not name.
        #include <string>
        using std::string;

        struct Call;

        // Handles events on the call leg between the application and party A.
        class ALegHandler : public ContextEventHandler {
            Call *call;
        public:
            ALegHandler(Call *call) { this->call = call; }
            void incomingInviteResponse(const Response resp);
            void incomingByeRequest(const Bye);
        };

        // Handles events on the call leg between the application and party B.
        class BLegHandler : public ContextEventHandler {
            Call *call;
        public:
            BLegHandler(Call *call) { this->call = call; }
            void incomingInviteResponse(const Response resp);
            void incomingByeRequest(const Bye);
        };

        struct Call {
            Response aLegResponse;
            string aUser; string bUser;
            string aHost; string bHost;
            ALegHandler a;
            BLegHandler b;

            void sendInvite(string fromUser, string toUser, string host, Payload p,
                            ContextEventHandler *handler) {
                SipAddress sender(SipStack::getMyURI(fromUser));
                SipAddress receiver(SipURI(toUser, host));
                FromHeader from(sender);
                ToHeader to(receiver);
                StandardContactHeader contact(sender);
                Invite invite(to, from, contact, p);
                SipStack::sendMessage(invite);
                CallContext call(invite);
                call.setEventHandler(handler);
            }

            Call(string auser, string ahost, string buser, string bhost)
                : a(this), b(this) {
                // Assign from the constructor parameters (lowercase names).
                aUser = auser; bUser = buser; aHost = ahost; bHost = bhost;
                // Send an Invite to party A and set "a" as event handler for the call.
                sendInvite(bUser, aUser, aHost, Payload(), &a);
            }
        };

        void ALegHandler::incomingInviteResponse(const Response resp) {
            // Only a 2xx response continues the call setup.
            if (resp.getResponseCode() < 200 || resp.getResponseCode() >= 300) return;
            call->aLegResponse = resp;
            // Send an Invite to party B and set "b" as event handler for this call.
            call->sendInvite(call->aUser, call->bUser, call->bHost, resp.getPayload(),
                             &call->b);
        }

        // Ack the response from B, then Ack the previous response from A,
        // attaching B's payload to the Ack sent to A.
        void BLegHandler::incomingInviteResponse(const Response resp) {
            if (resp.getResponseCode() < 200 || resp.getResponseCode() >= 300) return;
            Ack ack(resp);
            SipStack::sendMessage(ack);
            ack = Ack(call->aLegResponse);
            ack.setPayload(resp.getPayload());
            SipStack::sendMessage(ack);
        }

    Table I. Siptrans interfaces.

    Name                            Interface or method   Description
    siptrans_init_core              Method                Initializes the siptrans complex.
    siptrans_version_get            Method                Returns the version of the siptrans library.
    siptrans_log_set_level          Method                Sets the logging level for the siptrans library.
    siptrans_log                    Interface             Implements the logging capabilities specific to the TU.
    siptrans_sip_message_to_TU      Interface             Invoked by the siptrans library to send a SIP message to the TU core.
    siptrans_dispatch_sip_message   Method                Called by the TU to send a SIP message (request or response) to siptrans for further delivery.
    siptrans_schedule_alarm         Method                Called by the TU to schedule an alarm.
    siptrans_alarm_to_TU            Interface             Invoked by the siptrans library when an alarm set by the TU needs to be fired.
    siptrans_start                  Method                Called by the TU to start the siptrans library.
    siptrans_wait                   Method                Called by the TU to keep the calling thread active.
    siptrans_stop                   Method                Called by the TU to stop the siptrans library.

    SIP: Session Initiation Protocol; TU: Transaction user

    Siptrans alleviates much of the drudgery associated with parsing, transport management, and transaction handling. It also provides a general-purpose alarming facility for the TU. Once initialized, it accepts requests or responses either from its controlling TU or from the network. Further handling of the SIP request (or response) depends on who initiated it: the controlling TU or a peer across the network. This is described in the next section. Siptrans fully implements the client state machines corresponding to the INVITE and non-INVITE requests (Figures 5 and 6, respectively, of RFC 3261 [4]) as well as the server state machines corresponding to the INVITE and non-INVITE requests (Figures 7 and 8, respectively, of RFC 3261 [4]).
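
    To give a sense of what implementing these state machines involves, the following is a minimal sketch of the response-driven transitions of the INVITE client transaction (Figure 5 of RFC 3261). The names `InviteClientState`, `Transition`, and `onResponse` are hypothetical and are not part of the siptrans API; timers and transport errors are left out.

```cpp
// Hypothetical names for illustration; not part of the siptrans API.
enum class InviteClientState { Calling, Proceeding, Completed, Terminated };

struct Transition {
    InviteClientState next;  // state after the response is processed
    bool sendAck;            // transaction layer generates the ACK itself
    bool passToTU;           // response is handed to the transaction user
};

// Applies one incoming response (status code 100-699) to an INVITE client
// transaction, following Figure 5 of RFC 3261. Timers A, B, and D and
// transport errors are omitted from this sketch.
inline Transition onResponse(InviteClientState state, int code) {
    switch (state) {
    case InviteClientState::Calling:
    case InviteClientState::Proceeding:
        if (code < 200) return {InviteClientState::Proceeding, false, true};
        if (code < 300) return {InviteClientState::Terminated, false, true};
        return {InviteClientState::Completed, true, true};
    case InviteClientState::Completed:
        // A retransmitted non-2xx final response is ACKed again and absorbed.
        return {InviteClientState::Completed, code >= 300, false};
    case InviteClientState::Terminated:
        break;
    }
    // A terminated transaction no longer processes responses.
    return {InviteClientState::Terminated, false, false};
}
```

    In this sketch a 2xx response terminates the transaction without an ACK, which leaves the ACK to the TU, while any other final response is ACKed by the transaction layer itself.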

    Siptrans and request handling. A request can arrive at siptrans from one of two sources: a peer across the network or the TU. Accordingly, siptrans either relays a request from the network to the TU or relays a request from the TU to its intended destination on the network.

    • Requests arriving from the network. If a request arrives at siptrans from the network, siptrans parses it and creates a transaction if none exists. The parsed request is subsequently passed to the TU through the siptrans_sip_message_to_TU interface. This interface is provided by the TU programmer and invoked by siptrans on receiving a SIP request (or response). Siptrans does not make any assumption about the details of this interface; it only requires that the TU eventually invoke the siptrans_dispatch_sip_message method to pass a final response corresponding to the request. The response will be added to the SIP transaction maintained by siptrans and transmitted out to the network.

    • Requests arriving from the TU. Another source of incoming requests to siptrans is the TU. When the TU wants to send a request, it creates one and presents it to siptrans by invoking the siptrans_dispatch_sip_message method. Siptrans creates a transaction and analyzes the Request-URI or Route header (if present) to derive the network addresses to which the request should be sent. Siptrans supports, in part, the SIP procedures required to locate SIP servers [3]: it can query Domain Name System (DNS) service (SRV) resource records (RRs) for the TCP and UDP transports as well as DNS A RRs (it does not yet support DNS naming authority pointer [NAPTR] RR queries). If the DNS SRV RR lookup succeeds, it builds a list of preferred servers and sends the request to each server in turn until one responds. If no server responds, siptrans creates a local failure response and sends it to the TU, thus meeting the request-response expectations of the TU.
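
    The server selection described for requests arriving from the TU can be sketched as follows. This is an illustration under stated assumptions, not the siptrans implementation: `SrvRecord`, `preferredServers`, and `sendWithFailover` are hypothetical names, the ordering is a deterministic simplification (RFC 2782 prescribes weighted random selection among records of equal priority), and the use of 408 as the locally generated failure code is an assumption.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record type; fields mirror a DNS SRV resource record.
struct SrvRecord {
    std::string target;
    int port;
    int priority;  // lower values are tried first
    int weight;    // higher values are preferred within one priority
};

// Builds the list of preferred servers. Simplification: records of equal
// priority are ordered by descending weight instead of weighted random choice.
inline std::vector<SrvRecord> preferredServers(std::vector<SrvRecord> records) {
    std::stable_sort(records.begin(), records.end(),
                     [](const SrvRecord& a, const SrvRecord& b) {
                         if (a.priority != b.priority) return a.priority < b.priority;
                         return a.weight > b.weight;
                     });
    return records;
}

// Tries each server in order. trySend returns the status code of the
// response it received, or 0 if the server did not respond. If no server
// responds, a local failure code is returned (408 is assumed here).
inline int sendWithFailover(const std::vector<SrvRecord>& servers,
                            const std::function<int(const SrvRecord&)>& trySend) {
    for (const SrvRecord& server : servers) {
        const int code = trySend(server);
        if (code != 0) return code;
    }
    return 408;
}
```

    Returning a status code even when every server is silent is what lets the TU treat each request as having exactly one outcome.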

    Siptrans and response handling. Reliability and retransmission characteristics differ in SIP for INVITE and non-INVITE transactions. Non-2xx final responses to INVITE are retransmitted by the transaction layer until an ACK is received (and absorbed by the transaction layer); 2xx responses to INVITE are retransmitted by the TU, and ACKs for such responses are passed on to the TU. Furthermore, responses to non-INVITE requests are never retransmitted on a timer; they are cached on the server side and resent only upon receipt of a duplicate request. For siptrans, a response can arrive from one of two sources: the TU or a peer across the network. A response originated by the TU is sent out on the network by siptrans to its intended destination, whereas a response from the network is presented to the TU.
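
    The division of retransmission responsibility described above can be written down as a small decision function. The names `ResendPolicy` and `resendPolicy` are hypothetical. Provisional (1xx) responses to INVITE are placed in the last category, on the basis of the server transaction rules in RFC 3261, Section 17.2.1.

```cpp
#include <string>

// Hypothetical names for illustration; not part of the siptrans API.
enum class ResendPolicy {
    TransactionLayerUntilAck,  // non-2xx final response to INVITE
    TransactionUserUntilAck,   // 2xx response to INVITE
    OnDuplicateRequestOnly     // cached; resent only when the request is repeated
};

// Decides who retransmits a response on the server side, given the method
// of the request and the status code (100-699) of the response.
inline ResendPolicy resendPolicy(const std::string& method, int code) {
    if (method == "INVITE" && code >= 300) return ResendPolicy::TransactionLayerUntilAck;
    if (method == "INVITE" && code >= 200) return ResendPolicy::TransactionUserUntilAck;
    return ResendPolicy::OnDuplicateRequestOnly;
}
```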

    • Responses arriving from the network. Assuming a request from the TU was delivered to the appropriate peer, any responses to the request are matched to the appropriate transaction by siptrans and forwarded to the TU by invoking the siptrans_sip_message_to_TU interface. Handling of responses in SIP differs for INVITE and non-INVITE requests. First, retransmitted 2xx responses to the INVITE are passed to the TU even if a transaction does not exist (as mandated by RFC 3261 [4]). Second, INVITE requests require an additional method, ACK, to complete, whereas non-INVITE requests terminate when a final response is received. Following the appropriate SIP state machines, if siptrans receives a non-2xx final response for an INVITE, it will generate and send an ACK to the peer across the network and present the non-2xx response to the TU. Siptrans will not generate an ACK for a 2xx-class response to the INVITE. For all non-INVITE requests, siptrans will cache the response and present it to the TU. Retransmissions of the non-INVITE request will elicit the cached


    • Responses arriving from the TU. The TU invokes the siptrans_dispatch_sip_message method to pass a response to siptrans. Siptrans simply follows the Via list in the response to send it onward (in SIP, responses are routed based on the Via list).
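
    Via-based routing of a response can be illustrated with a minimal sketch that derives the next hop from the topmost Via header value. `Destination` and `responseDestination` are hypothetical names, not siptrans functions. The sketch assumes a well-formed header value, ignores IPv6 literals and the `maddr` and `rport` parameters, and uses 5060 as the default port.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: where the response should be sent.
struct Destination {
    std::string transport;
    std::string host;
    int port;
};

// Derives the destination from the topmost Via header value, for example
// "SIP/2.0/UDP pc33.example.com:5060;branch=z9hG4bK776;received=192.0.2.1".
// Precondition: vias is non-empty and the top value is well formed.
inline Destination responseDestination(const std::vector<std::string>& vias) {
    const std::string& top = vias.front();
    const std::size_t space = top.find(' ');
    const std::string protocol = top.substr(0, space);  // e.g. "SIP/2.0/UDP"
    const std::string rest = top.substr(space + 1);     // sent-by and parameters
    const std::size_t semi = rest.find(';');
    const std::string sentBy = rest.substr(0, semi);
    const std::string params =
        (semi == std::string::npos) ? std::string() : rest.substr(semi);

    Destination d{protocol.substr(protocol.rfind('/') + 1), sentBy, 5060};
    const std::size_t colon = sentBy.find(':');
    if (colon != std::string::npos) {
        d.host = sentBy.substr(0, colon);
        d.port = std::stoi(sentBy.substr(colon + 1));
    }

    // Prefer the "received" parameter when present (RFC 3261, Section 18.2.2).
    const std::string key = ";received=";
    const std::size_t pos = params.find(key);
    if (pos != std::string::npos) {
        const std::size_t start = pos + key.size();
        const std::size_t end = params.find(';', start);
        d.host = params.substr(start, end == std::string::npos ? end : end - start);
    }
    return d;
}
```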

    Siptrans handles error conditions such as when the peer UA has closed a TCP connection and cannot receive a response on that connection. In such cases, siptrans attempts to open a new TCP connection to the peer UA in order to send the response. If a new connection is successfully established, siptrans will send the response; otherwise, there is precious little the transaction layer can do besides clean up the


    Siptrans and alarm. In addition to the transaction and transport handling features, siptrans also provides a general-purpose alarming facility to the TU. The TU can use this facility to schedule alarms for retransmitting 200-OK responses to an INVITE request, for instance. However, the alarming facility is designed to be as generic as possible so that arbitrary alarms can be scheduled and executed. The APIs provide for the TU to store an opaque data item in the alarm, which is returned to it when the alarm fires. This allows the TU to save state in the alarm for reuse when the alarm fires. To schedule an alarm, the TU invokes the siptrans_schedule_alarm method; siptrans, in turn, invokes a TU programmer-supplied interface, siptrans_alarm_to_TU, when the alarm fires.
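
    A generic alarm facility that carries an opaque data item might look like the sketch below. The class `AlarmQueue` and its members are hypothetical and do not reproduce the signatures of siptrans_schedule_alarm or siptrans_alarm_to_TU, which this paper does not list. The clock is passed in explicitly to keep the example deterministic; a real facility would presumably be driven by a timer thread.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical alarm facility; the opaque pointer is never interpreted here.
class AlarmQueue {
public:
    using ToTU = std::function<void(void* opaque)>;

    explicit AlarmQueue(ToTU toTU) : toTU_(std::move(toTU)), nextSeq_(0) {}

    // Schedules an alarm for time fireAtMs, storing the TU's opaque item.
    void schedule(long fireAtMs, void* opaque) {
        Alarm alarm = {fireAtMs, nextSeq_++, opaque};
        pending_.push(alarm);
    }

    // Fires every alarm due at or before nowMs, earliest first, handing the
    // opaque item back to the TU. Returns the number of alarms fired.
    int runDue(long nowMs) {
        int fired = 0;
        while (!pending_.empty() && pending_.top().fireAtMs <= nowMs) {
            Alarm alarm = pending_.top();
            pending_.pop();
            toTU_(alarm.opaque);
            ++fired;
        }
        return fired;
    }

private:
    struct Alarm {
        long fireAtMs;
        unsigned long seq;  // preserves scheduling order among equal times
        void* opaque;
    };
    struct FiresLater {
        bool operator()(const Alarm& a, const Alarm& b) const {
            if (a.fireAtMs != b.fireAtMs) return a.fireAtMs > b.fireAtMs;
            return a.seq > b.seq;
        }
    };

    ToTU toTU_;
    unsigned long nextSeq_;
    std::priority_queue<Alarm, std::vector<Alarm>, FiresLater> pending_;
};
```

    A TU retransmitting a 200-OK response could, for instance, pass a pointer to its own per-call state as the opaque item and recover it in the callback.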

    Building iSURF Over Siptrans or Other Frameworks

    Several issues arise when building iSURF over a transaction framework:

    • The transaction framework's memory management scheme. iSURF must know when memory references from the transaction framework become invalid. If iSURF needs to retain data obtained from the transaction framework after the framework frees it, iSURF must copy the data before it is freed. In the case of siptrans, iSURF was able to integrate the siptrans reference counting scheme with its own.

    • Transaction layer timeouts. iSURF delivers transaction layer timeouts to the application as 408 (transaction timeout) response messages. Siptrans creates these response messages and passes them to the iSURF core, which in turn passes them to the application. If some other transaction framework notified iSURF of the timeout through another mechanism, then iSURF would have to build the timeout response messages itself.

    • Layering issues. Although the standard states the responsibilities of the various SIP layers, implementations are free to reassign these responsibilities. For example, siptrans is responsible for placing Vias in outgoing requests, and siptrans determines the destination Internet Protocol address for outgoing requests. If some other transaction framework did not assume these responsibilities, then the iSURF core would have to assume them.

    • Threading. iSURF needs to know the threading structure of the transaction framework to avoid deadlocks and to efficiently make its code thread
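
    The transaction layer timeout item above mentions building a timeout response locally. A minimal sketch of that step follows, assuming a simplified message type; `SimpleMessage` and `makeTimeoutResponse` are hypothetical names and are not iSURF or siptrans classes. The headers copied are those that RFC 3261 requires a response to echo from its request.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical, simplified message representation.
struct SimpleMessage {
    int statusCode;                              // 0 for a request
    std::string reasonPhrase;
    std::map<std::string, std::string> headers;  // one value per name, for brevity
};

// Builds a local 408 response for a request whose transaction timed out,
// copying the headers that tie the response to the original request.
inline SimpleMessage makeTimeoutResponse(const SimpleMessage& request) {
    SimpleMessage response{408, "Request Timeout", {}};
    for (const char* name : {"Via", "From", "To", "Call-ID", "CSeq"}) {
        auto it = request.headers.find(name);
        if (it != request.headers.end()) response.headers[name] = it->second;
    }
    return response;
}
```

    Delivering the timeout as an ordinary response means the application needs no separate error path for it.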


    Performance

    We investigated iSURF's performance by writing a client application that repeatedly initiates and terminates sessions at a server application. We implemented the client/server application pair two ways: by using the iSURF API and by using the siptrans API. We then compared the performance of the two implementations. The iSURF client application underperformed the siptrans client application by 10%. However, the iSURF server application outperformed the siptrans server application by 10%. We believe the iSURF server application outperformed the siptrans server application because the siptrans server copied headers from requests to responses. Because of iSURF's sophisticated memory management, it merely copied header references from requests to responses. There was less opportunity to exploit this advantage in the client application. Regardless, iSURF does not impose a large overhead over its siptrans transaction
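
    The difference between copying headers and copying header references can be illustrated as follows. This sketch uses `std::shared_ptr` as a stand-in for iSURF's reference counting, whose actual design is not detailed here; `Header`, `RefMessage`, and `makeResponseSharingHeaders` are hypothetical names.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Header {
    std::string name;
    std::string value;
};

// A reference-counted, immutable header that several messages can share.
using HeaderRef = std::shared_ptr<const Header>;

struct RefMessage {
    std::vector<HeaderRef> headers;
};

// Headers that a response echoes from its request.
inline bool echoedInResponse(const std::string& name) {
    return name == "Via" || name == "From" || name == "To" ||
           name == "Call-ID" || name == "CSeq";
}

// Builds a response that shares the echoed headers with the request:
// only reference counts change; no header text is copied.
inline RefMessage makeResponseSharingHeaders(const RefMessage& request) {
    RefMessage response;
    for (const HeaderRef& header : request.headers) {
        if (echoedInResponse(header->name)) response.headers.push_back(header);
    }
    return response;
}
```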


    Extending iSURF

    Adding another request type to iSURF involves the following:

    • Adding a new class representing the request method. If a header must appear in the new request, then every constructor for the class should include this header as an argument. A constructor that allows the new request to be built in a dialog should also be added to the class.

    • Adding a corresponding function to its event handler class if there are any timed events associated with the request. For example, when extending iSURF to handle SUBSCRIBE-NOTIFY, we declared an event handling function that is called when a SUBSCRIBE expires.

    • Implementing a periodic refresh for the request if necessary. This was done for the SUBSCRIBE

    • Adding classes for any new headers necessary for the request.

    • Adjusting iSURF's dialog state maintenance routines if a transaction involving the new request affects dialog state.

    • Providing access routines to dialog state related to the new request. For example, iSURF provides the application access to details on active subscriptions created by the SUBSCRIBE

    • Determining whether to add timers that prevent a transaction or dialog from being stuck in the same state indefinitely. We added such a timer, so that an INVITE transaction cannot be stuck in the proceeding state indefinitely.
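
    The first two steps of the list above can be sketched for SUBSCRIBE as shown below. Every class name here (`BasicRequest`, `DialogInfo`, `SubscribeRequest`, `SubscribeEventHandler`) is a hypothetical stand-in and not the actual iSURF class. The sketch shows a mandatory header (Event) appearing in every constructor, a constructor for building the request inside a dialog, and a handler function for the expiry of a subscription.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the dialog state a request can be built from.
struct DialogInfo {
    std::string callId;
    std::string remoteTarget;
};

// Hypothetical stand-in for a common request base class.
class BasicRequest {
public:
    const std::string& method() const { return method_; }
    const std::string& requestUri() const { return requestUri_; }
    std::string header(const std::string& name) const {
        auto it = headers_.find(name);
        return it == headers_.end() ? std::string() : it->second;
    }

protected:
    BasicRequest(std::string method, std::string requestUri)
        : method_(std::move(method)), requestUri_(std::move(requestUri)) {}
    void setHeader(const std::string& name, const std::string& value) {
        headers_[name] = value;
    }

private:
    std::string method_;
    std::string requestUri_;
    std::map<std::string, std::string> headers_;
};

class SubscribeRequest : public BasicRequest {
public:
    // Event must appear in every SUBSCRIBE, so every constructor takes it.
    SubscribeRequest(const std::string& requestUri, const std::string& event,
                     int expiresSeconds)
        : BasicRequest("SUBSCRIBE", requestUri) {
        init(event, expiresSeconds);
    }

    // Builds the request inside an existing dialog.
    SubscribeRequest(const DialogInfo& dialog, const std::string& event,
                     int expiresSeconds)
        : BasicRequest("SUBSCRIBE", dialog.remoteTarget) {
        setHeader("Call-ID", dialog.callId);
        init(event, expiresSeconds);
    }

private:
    void init(const std::string& event, int expiresSeconds) {
        setHeader("Event", event);
        setHeader("Expires", std::to_string(expiresSeconds));
    }
};

// Event handler extended with a function for the request's timed event.
class SubscribeEventHandler {
public:
    virtual ~SubscribeEventHandler() {}
    virtual void subscriptionExpired(const SubscribeRequest& subscribe) = 0;
};
```

    Because no constructor omits the Event argument, a SUBSCRIBE without that header cannot be constructed, which is one way constructors can guide the application writer toward correct messages.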

    Conclusions/Next Steps

    iSURF is a framework for building arbitrary user agents. The goal of the framework is to ease the effort of writing a user agent without surrendering generality. We believe we have succeeded in this effort. Moreover, we believe the performance penalty of achieving this ease of effort is manageable.

    At the basic message passing level, the message representation is easy to learn, as it largely follows the SIP grammar. The message constructors guide the application writer in building correct messages. Moreover, iSURF relieves the application of memory management duty. iSURF provides optional higher-level constructs that facilitate application building:

    • iSURF helps to dispatch incoming messages to context-based event handlers.

    • iSURF allows applications to store application data in contexts or messages.

    • iSURF provides message resending/refresh

    • iSURF has configurable plug-ins that handle many phases of processing incoming messages.

    iSURF is constructed using siptrans, a SIP transaction library. The library has been designed to be compatible with any TU that implements the interfaces detailed in Table I. iSURF is one such TU; we have also been successful in using siptrans in another unrelated project within Lucent. Siptrans is portable across Solaris and Linux platforms. It has also been successfully ported to the pSOS* real-time operating system and WinCE.

    Work continues on both iSURF and siptrans. We are investigating increasing iSURF's performance by speeding up siptrans itself and by better integrating the iSURF and siptrans memory management. We are also investigating storing iSURF's core state persistently to help application writers build reliable user agents.

    *Trademarks

    JAIN and Java are trademarks and Solaris is a registered trademark of Sun Microsystems, Inc.

    Linux is a registered trademark of Linus Torvalds.

    pSOS is a trademark of Wind River Systems, Inc.

    Windows is a registered trademark of Microsoft Corporation.

    References

    [1] Y. Chen, O. B. Clarisse, P. Collet, M. A. Hartman, L. Rodriguez, L. Velazquez, and B. A. Westergren, "Web Communication Services and the PacketIN Application Hosting Environment," Bell Labs Tech. J., 7:1 (2002), 25-40.

    [2] P. O'Doherty and M. Ranganathan, "JSR 32: JAIN SIP API Specification," Aug. 2003.

    [3] J. Rosenberg and H. Schulzrinne, "Session Initiation Protocol (SIP): Locating SIP Servers," IETF RFC 3263, June 2002.

    [4] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: Session Initiation Protocol," IETF RFC 3261, June 2002.

    [5] K. Singh, J. Lennox, S. Narayanan, and H. Schulzrinne, "CINEMA: Columbia InterNet Extensible Multimedia Architecture," Columbia University, NY, Nov. 2002.

    [6] D. Tweedie, "JSR 125: JAIN SIP Lite," July 2002.

    [7] Vovida, "Vovida Open Communications Application Library (VOCAL)," Apr. 2003.

    (Manuscript approved May 2004)

    ROBERT M. ARLEIN is a member of technical staff in the Services Infrastructure Research Department at Lucent Technologies in Murray Hill, New Jersey. He holds B.S. and A.M. degrees in mathematics from the University of Wisconsin in Madison and an M.S. degree in computer science from New York University in New York City. He is currently investigating the uses of SIP-based components in a converged service network. Mr. Arlein holds one patent and has one application pending.


    VIJAY K. GURBANI is a distinguished member of technical staff in the Wireless Next Generation Architecture and Evolution Department at Lucent Technologies in Naperville, Illinois. He holds B.Sc. and M.Sc. degrees in computer science from Bradley University in Peoria, Illinois. He is a Ph.D. candidate in computer science at the Illinois Institute of Technology in Chicago. Mr. Gurbani is currently involved in the specification, prototyping, and implementation of services based on SIP. His research interests are Internet telephony services, Internet signaling protocols, pervasive computing in the telecommunications domain, distributed systems programming, and programming languages. He is a member of the ACM and the IEEE Computer Society. He holds one patent and has four applications pending.

