VoIP Research Paper

RESEARCH PAPER ON VoIP SECURITY – INTERNET TELEPHONY EETS 8313

A Research Paper On

VoIP SECURITY

Submitted By

ASHISH PANDE

SHREYASH SAWANT

SWARA DAVE

ASHWATH BALAKRISHNAN

Under The Guidance Of

PROF. SCOTT KINGSLEY

DEPARTMENT OF

TELECOMMUNICATION & NETWORK ENGINEERING

LYLE SCHOOL OF ENGINEERING

DALLAS, TX

SOUTHERN METHODIST UNIVERSITY

2015

Abstract

In VoIP technology, the voice signal is first separated into frames, which are then stored in data packets and finally transported over an IP network using a voice communication protocol. Currently, most VoIP systems use one of two standards: H.323 or the Session Initiation Protocol (SIP). This paper discusses these two protocols, the most important for signaling in today's voice over IP (VoIP) networks. We focus on the aspects that matter most for the successful operation of VoIP networks, ranging from the characteristics of the protocol syntax to the security threats that arise from it. Security features and some implementation issues are analyzed as well. Our inference is that although H.323 appeared earlier, SIP is better adapted to the Internet environment, which is the most important reason for its success. This paper also presents functionality for negotiating the security mechanisms used between a SIP user agent and its next-hop SIP entity. This functionality supplements the existing methods of choosing security mechanisms between SIP entities. We also discuss the H.323 protocol and the security issues associated with it.

Table of Contents

1. Introduction
2. Overview of VoIP
2.1. VoIP Equipment
2.2. Quality of Service Implications for Security
2.3. VoIP Protocols
3. SIP - Session Initiation Protocol
3.1. SIP Overview
3.2. Design Goals
3.3. Solutions
3.4. Security Mechanisms for VoIP
3.4.1. Diffie-Hellman Key Exchange
3.4.2. TLS (Transport Layer Security)
3.4.3. IPsec-IKE (IPsec with Internet Key Exchange)
3.5. Syntax
3.6. Protocol Operation
3.6.1. Client Initiated
3.6.2. Server Initiated
3.6.3. Security Mechanism Initiation and Duration of Association
3.7. Security Considerations
4. H.323 - The International Standard
4.1. H.225.0 Call Signaling
4.2. Security Issues for H.323
5. Media Gateway Control Protocol and its Security Issues
6. Encryption and IPsec
6.1. IPsec
6.2. The Role of IPsec in VoIP
6.3. Difficulties Arising from VoIPsec
6.4. Encryption/Decryption Latency
6.5. Scheduling and the Lack of QoS in the Crypto-Engine
6.6. Expanded Packet Size
6.7. IPsec and NAT Incompatibility
7. Solutions to the VoIPsec Issues
7.1. Encryption at the End Points
7.2. Secure Real-Time Protocol (SRTP)
7.3. Better Scheduling Schemes
7.4. Compression of Packet Size
7.5. Resolving NAT/IPsec Incompatibilities
8. Conclusion
REFERENCES AND BIBLIOGRAPHY

Index of Figures

Figure 1: VoIP System
Figure 2: SIP Network Model
Figure 3: Security Agreement Message Flow
Figure 4: Diffie-Hellman Key Exchange Algorithm
Figure 5: H.323 Protocol Hierarchy

1. INTRODUCTION

Voice over Internet Protocol (VoIP) refers to the transmission of speech across data-style

networks. This form of transmission is conceptually superior to conventional circuit switched

communication in many ways. However, a plethora of security issues are associated with still-

evolving VoIP technology. This paper introduces VoIP, its security challenges, and potential

countermeasures for VoIP vulnerabilities.

2. OVERVIEW OF VoIP

Many readers who have a good understanding of the Internet and data communications

technology may have little background in transmitting voice or real-time imaging in a packet-

switched environment. One of the main sources of confusion for those new to VoIP is the

(natural) assumption that because digitized voice travels in packets just like other data, existing

network architectures and tools can be used without change for voice transmission. VoIP adds a

number of complications to existing network technology, and these problems are compounded

by security considerations. Most of this report is focused on how to overcome the complications

introduced by security requirements for VoIP.

2.1 VoIP EQUIPMENT

In general, the term Voice over IP is associated with equipment that lets users dial telephone numbers and communicate with parties on the other end of a connection who have either another VoIP system or a traditional analog telephone. Demand for VoIP services has resulted in a broad array of products, including traditional telephone handsets, conferencing units, mobile units, and PCs or softphones.

In addition to end-user equipment, VoIP systems include a large number of other components, including call processors (call managers), gateways, routers, firewalls, and protocols. Most of these components have counterparts used in data networks, but the performance demands of VoIP mean that ordinary network software and hardware must be supplemented with special VoIP components. The unique nature of VoIP services has a significant impact on security considerations for these networks, as detailed in later sections.

Figure 1: VoIP System

2.2 QUALITY OF SERVICE IMPLICATIONS FOR SECURITY

The strict performance requirements of VoIP have significant implications for security, particularly for denial of service (DoS) issues. VoIP-specific attacks (e.g., floods of specially crafted SIP messages) may result in DoS for many VoIP-aware devices. For example, SIP phone endpoints may freeze and crash when attempting to process a high rate of packet traffic. SIP proxy servers may also experience failures and intermittent log discrepancies under a VoIP-specific signaling attack of less than 1 Mb/s. In general, the packet rate of the attack may have more impact than the bandwidth; that is, a high packet rate may result in a denial of service even if the bandwidth consumed is low.

2.3 VoIP PROTOCOLS

VoIP was introduced in 1995 and is still evolving today. The most widely used VoIP protocols are as follows:

SIP: developed within the IETF (Internet Engineering Task Force) as an alternative to H.323.

H.323: developed by the ITU (International Telecommunication Union).

MGCP: grew out of work by Cisco and Telcordia (formerly Bellcore) and was published by the IETF as the informational RFC 2705. It complements SIP and H.323 rather than replacing them.

3. SIP - SESSION INITIATION PROTOCOL

3.1 SIP OVERVIEW

The Session Initiation Protocol (SIP) is an application-layer control protocol that can establish,

modify and terminate multimedia sessions or calls. These multimedia sessions include

multimedia conferences, distance learning, Internet telephony and similar applications. SIP can

be used to initiate sessions as well as invite members to sessions that have been advertised and

established by other means. Sessions can be advertised using multicast protocols such as SAP,

electronic mail, news groups, web pages or directories (LDAP), among others. Callers and

callees are identified by SIP addresses. When making a SIP call, a caller first locates the

appropriate server and then sends a SIP request. The most common SIP operation is the

invitation. Instead of directly reaching the intended callee, a SIP request may be redirected or

may trigger a chain of new SIP requests by proxies. Users can register their locations with SIP

servers.

Figure 2: SIP Network Model

New security threats, and new methods to counter them, appear constantly. With security mechanisms changing this quickly, SIP entities need a way to negotiate among the available mechanisms and settle on one to use. The purpose of this section is to describe such negotiation functionality for the Session Initiation Protocol. The negotiation takes place only between the UA and its first-hop SIP entity.

Without a proper security mechanism in place, a SIP connection is vulnerable to a range of serious attacks. The authenticity and integrity of a SIP-based connection can be considered acceptable only if the end user is assured of a security mechanism that protects the connection against man-in-the-middle (MitM) attacks. At times it may be impossible to know whether a certain security mechanism is truly unavailable or whether a MitM attack is causing the mechanism to be refused.

The possible VoIP threats include the following:

(i) Misrepresentation of identity, authority, rights and content.

(ii) Interception and modification of content (for example, call black holing, call rerouting, fax alteration, conversation impersonation and hijacking).

(iii) Service abuse, such as VoIP-specific DoS, request flooding, malformed requests and messages, and QoS abuse.

3.2 DESIGN GOALS

1. The entities involved in the security agreement process need to find out exactly which security mechanisms to apply, preferably without excessive additional roundtrips.

2. The selection of security mechanisms itself needs to be secure. Traditionally, security protocols use a secure form of negotiation. For instance, after establishing mutual keys through Diffie-Hellman, the Internet Key Exchange (IKE) sends hashes of the previously sent data, including the offered crypto mechanisms. This allows the peers to detect whether the initial, unprotected offers were tampered with.

3. The entities involved in the security agreement process need to be capable of indicating success or failure of the security agreement process.

4. The security agreement process should not introduce any additional state to be maintained by the involved entities.

3.3 SOLUTIONS

The operation involves the following steps:

Figure 3: Security agreement message flow

Step 1: A client wishing to use this specification can send a list of its supported security mechanisms along with the first request to the server.

Step 2: A server wishing to use this specification can challenge the client to perform the security agreement procedure. The security mechanisms and parameters supported by the server are sent along with this challenge.

Step 3: The client then selects the highest-preference security mechanism that the two have in common and turns on the selected security.

Step 4: The client contacts the server again, now using the selected security mechanism. The server's list of supported security mechanisms is returned as a response to the challenge.

Step 5: The server verifies its own list of security mechanisms in order to ensure that the original list has not been modified.
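
The selection and verification steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary representation of the lists, and the sample preference values are hypothetical, and the sketch ranks the common mechanisms by the preference values in the server's list.

```python
from typing import Dict, Optional


def select_mechanism(client_supported: Dict[str, float],
                     server_offered: Dict[str, float]) -> Optional[str]:
    """Step 3: pick the common mechanism the server prefers most, or None."""
    common = [name for name in server_offered if name in client_supported]
    if not common:
        return None
    return max(common, key=lambda name: server_offered[name])


def server_list_intact(original: Dict[str, float],
                       echoed: Dict[str, float]) -> bool:
    """Step 5: the server compares the echoed list with the list it sent."""
    return original == echoed


# Hypothetical capability lists: mechanism name -> preference value.
client = {"tls": 0.2, "digest": 0.1}
server = {"ipsec-ike": 0.3, "tls": 0.2, "digest": 0.1}

chosen = select_mechanism(client, server)
print(chosen)  # tls

# A list stripped down by an attacker no longer matches the original.
tampered = {"digest": 0.1}
print(server_list_intact(server, dict(server)))  # True
print(server_list_intact(server, tampered))      # False
```

Comparing the echoed list over the protected channel is what lets the server notice a downgrade attempt made while the first exchange was still unprotected.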

3.4 SECURITY MECHANISMS FOR VOIP

The security mechanisms most widely used for VoIP security are Diffie-Hellman, TLS and IPsec-IKE. These are discussed in detail in the sections below.

3.4.1 DIFFIE-HELLMAN KEY EXCHANGE

This is one of the most popular methods used for secure key establishment on the Internet. It involves two publicly known numbers: a prime number $g$ and an integer $\alpha$ that is a primitive root of $g$. For instance, if User A wants to exchange a key with User B, then:

User A selects a random private integer $X_A < g$ and computes $Y_A = \alpha^{X_A} \bmod g$.

User B selects a random private integer $X_B < g$ and computes $Y_B = \alpha^{X_B} \bmod g$.

Each side keeps its $X$ value private and makes its $Y$ value publicly available to the other side.

User A computes the key as $K = (Y_B)^{X_A} \bmod g$.

User B computes the key as $K = (Y_A)^{X_B} \bmod g$.

Both computations yield the same value, since $(Y_B)^{X_A} \equiv \alpha^{X_A X_B} \equiv (Y_A)^{X_B} \pmod{g}$.

Figure 4 : Diffie-Hellman Key Exchange Algorithm

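The result is that the two sides share a secret value that was never sent over the network. An intruder who taps the network sees only the public values (the prime, the base and the two Y values). Recovering a private X value from them requires solving the discrete logarithm problem, which is computationally infeasible for suitably large primes. The basic exchange does not authenticate the peers, however, so it must be combined with an authentication mechanism to resist an active man-in-the-middle attacker.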
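
The exchange can be tried out with a toy example in Python. This is a minimal sketch, not a secure implementation: the parameters 353 and 3 are assumed purely for illustration and are far too small for real use, and the helper names are hypothetical.

```python
import secrets
from typing import Tuple

# Toy public parameters (assumed for illustration; far too small for real use).
g = 353      # the public prime modulus, called 'g' in the text
alpha = 3    # the public base, called 'α' in the text


def keypair(prime: int, base: int) -> Tuple[int, int]:
    """Pick a private X with 1 <= X <= prime - 2; return (X, Y = base^X mod prime)."""
    x = secrets.randbelow(prime - 2) + 1
    return x, pow(base, x, prime)


def shared_key(peer_public: int, own_private: int, prime: int) -> int:
    """Compute K = (peer's Y)^(own X) mod prime."""
    return pow(peer_public, own_private, prime)


xa, ya = keypair(g, alpha)   # User A: private XA, public YA
xb, yb = keypair(g, alpha)   # User B: private XB, public YB

k_a = shared_key(yb, xa, g)  # computed by A from B's public value
k_b = shared_key(ya, xb, g)  # computed by B from A's public value
print(k_a == k_b)  # True
```

Only `ya` and `yb` cross the network; each side combines the other's public value with its own private value to reach the same key.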

3.4.2 TLS (TRANSPORT LAYER SECURITY)

The primary goal of TLS is to provide privacy and data integrity between two communicating applications. TLS is composed of two layers, the TLS Record Protocol and the TLS Handshake Protocol. The two main properties that TLS provides are:

(i) The connection is private. Symmetric cryptography (for example, AES) is used for data encryption. The keys for this symmetric encryption are generated uniquely for each connection and are based on a secret negotiated by the TLS Handshake Protocol.

(ii) The connection is reliable. Message transport includes a message integrity check using a keyed MAC. Secure hash functions such as SHA-1 are used for the MAC computations.
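
The keyed-MAC integrity check in property (ii) can be illustrated with Python's standard `hmac` module. This is a simplified sketch of the idea only: the real TLS record MAC also covers sequence numbers and record headers, the key and messages below are made up, and SHA-1 is used here only because the text names it.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute a keyed MAC (HMAC-SHA-1) over the message."""
    return hmac.new(key, message, hashlib.sha1).digest()


def verify_tag(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), received_tag)


# Hypothetical shared key and message, for illustration only.
key = b"shared-secret-from-handshake"
message = b"INVITE sip:bob@example.com SIP/2.0"
tag = make_tag(key, message)

print(verify_tag(key, message, tag))                   # True
print(verify_tag(key, b"INVITE sip:mallory@x", tag))   # False
```

Without the shared key, an attacker who alters the message cannot produce a matching tag, so the receiver detects the modification.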

3.4.3 IPsec-IKE

Internet Key Exchange (IKE) is mainly concerned with performing mutual authentication and with establishing and maintaining security associations (SAs). IKE runs over UDP (port 500 by default). Together with IKE, IPsec provides confidentiality, data integrity, and access control for IP datagrams. These services rely on shared state maintained between the two peers. The information about which cryptographic algorithms are used, and the keys they use, is then transferred over this securely established connection.

Any one of these mechanisms, or a combination of them, can be chosen to suit the type of communication in use. When combined with proper authentication of the peers, the mechanisms discussed above can protect against common man-in-the-middle attacks.

3.5 SYNTAX

Three new SIP header fields are introduced: Security-Client, Security-Server and Security-Verify. Each header field value begins with a mechanism-name token, which identifies a supported security mechanism (for example, tls for TLS, digest for HTTP Digest, or ipsec-ike for IPsec with IKE). In a Security-Client header field, the token identifies a mechanism supported by the client. The mechanism name may be followed by a preference parameter, the q value: the higher the q value, the more preferred the mechanism. No two mechanisms may have the same q value. The digest-algorithm, qop and verify parameters are optional and apply to HTTP Digest.
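
To make the syntax concrete, the following Python sketch parses the value of one of these header fields into a list of mechanisms ordered by preference. It is a simplified illustration, not a full SIP parser: the function name is hypothetical, the sample header value is made up, and the parameter spelling `d-alg` for the digest algorithm is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Mechanism = Tuple[str, Optional[float], Dict[str, str]]


def parse_security_header(value: str) -> List[Mechanism]:
    """Parse 'name;q=0.2;param=x, name2;q=0.1' into (name, q, params), best first."""
    mechanisms: List[Mechanism] = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";") if p.strip()]
        if not parts:
            continue
        name = parts[0].lower()
        params: Dict[str, str] = {}
        for part in parts[1:]:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip()
        q = float(params.pop("q")) if "q" in params else None
        mechanisms.append((name, q, params))

    # No two mechanisms may carry the same q value.
    explicit = [q for _, q, _ in mechanisms if q is not None]
    if len(set(explicit)) != len(explicit):
        raise ValueError("two mechanisms share the same q value")

    # Mechanisms without a q value are sorted last (a choice made for this sketch).
    return sorted(mechanisms,
                  key=lambda m: -1.0 if m[1] is None else m[1],
                  reverse=True)


# Made-up header value for illustration.
parsed = parse_security_header("ipsec-ike;q=0.1, tls;q=0.2, digest;q=0.05;d-alg=MD5")
for name, q, params in parsed:
    print(name, q, params)
```

The most preferred mechanism ends up first in the list, which is the order a client would walk through when looking for a mechanism it also supports.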

3.6 PROTOCOL OPERATION

This section deals with the protocol details involved in the negotiation between a SIP UA and its

next-hop SIP entity.

3.6.1 CLIENT INITIATED

A client wishing to use the security agreement of this specification must add a Security-Client header field to a request addressed to its first-hop proxy. This header field lists all the security mechanisms that the client supports. A server that receives such an unprotected request responds to the client with a 494 (Security Agreement Required) response, to which it must add a Security-Server header field listing the security mechanisms that the server supports. The list is sent even if the client's list and the server's list have no security mechanism in common. When the client receives the response, it chooses the common security mechanism with the highest preference value q. The server should have included the information the client needs to initiate that mechanism. The initiation of a particular security mechanism may be carried out without any SIP message exchange. If an attacker modified the Security-Client header field in the request, the server may not include the information required to set up the connection. A client that detects such a lack of information aborts and starts again with a new request.

Subsequent requests must carry a Security-Verify header field that mirrors the list received from the server, so that the server can check whether it matches its own list of supported security mechanisms. If a modification of the list is detected, the server must respond to the client with a 494 response. If the list was not modified, and the server is a proxy, it MUST remove the "sec-agree" value from both the Require and Proxy-Require header fields, and then remove the header fields if no values remain. Once security has been negotiated between two SIP entities, the same entities may use the same security when communicating with each other in different SIP roles. The user can also decline a connection with a peer if the security mechanism is not satisfactory.

3.6.2 SERVER INITIATED

When a server receives a request on a network interface that is configured to use this mechanism, it checks whether the request has exactly one Via entry. If it does, the server processes the request; otherwise the server is not the first-hop entity and returns a 502 (Bad Gateway) response. A server that receives a request without the sec-agree option tag in a Require, Proxy-Require or Supported header field must return a 421 (Extension Required) response. If the request has the sec-agree option tag, the server must return a 494 (Security Agreement Required) response. In both cases the server must include a Security-Server header field listing its capabilities and a Require header field, along with the information the client needs to initiate the preferred security mechanism.
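
The server's decision for an unprotected request can be summarized as a small function. The sketch below follows the rules as described in this section; the function name and its arguments are hypothetical, and it does not build the accompanying Security-Server and Require header fields.

```python
from typing import Iterable

SEC_AGREE = "sec-agree"


def first_hop_status(via_count: int,
                     require: Iterable[str] = (),
                     proxy_require: Iterable[str] = (),
                     supported: Iterable[str] = ()) -> int:
    """Status code for an unprotected request on an interface that enforces the agreement."""
    if via_count != 1:
        return 502  # Bad Gateway: more than one Via entry, not the first hop
    option_tags = {tag.lower() for tag in (*require, *proxy_require, *supported)}
    if SEC_AGREE not in option_tags:
        return 421  # Extension Required: the client did not announce sec-agree
    return 494      # Security Agreement Required: challenge the client


print(first_hop_status(via_count=2))                            # 502
print(first_hop_status(via_count=1))                            # 421
print(first_hop_status(via_count=1, supported=["sec-agree"]))   # 494
```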

3.6.3 SECURITY MECHANISM INITIATION AND DURATION OF ASSOCIATION

Once the client has decided which security mechanism to use from the list in the Security-Server header field, it initiates that mechanism. Different mechanisms require different initiation procedures; the procedure varies for tls, digest, ipsec-ike and so on. Once the security mechanism has been negotiated, both the server and the client need to know when the association ends, and here too the indication differs between mechanisms. For example, when tls is used, the end of the association is indicated by the need for a new connection, whereas with IKE the lifetime of the association is negotiated at the very beginning of the session. With digest, the association remains valid until the client credentials are no longer valid.

3.7 SECURITY CONSIDERATIONS

Even assuming that only secure algorithms are used, MitM attackers must still be prevented from modifying important parameters, such as whether encryption is provided or not. The initial offers should therefore be protected, either by hashing them or by repeating them once the selected security mechanism is in place. An attacker would then have to modify both the offer message and the message that contains the hash or repetition, and forging the protected message in real time is considered infeasible. If a man-in-the-middle attacker alters the offer message to downgrade the peers to a less secure mechanism, the peer receives a hash or repeated list that does not match, so the existing connection can be torn down and a new one set up.

Possible attacks and their countermeasures:

1. Attackers could try to modify the server's list of security mechanisms in the first response. This would be revealed to the server when the client returns the list to the server under the protection of the selected security mechanism.

2. Attackers could try to modify the repeated list in the second request from the client. This is countered if the selected security mechanism uses encryption or integrity protection.

3. Attackers could try to modify the client's list in the first message. Even if a man-in-the-middle attacker changes this list, the effect is limited, because the client is aware of its own capabilities and bases its selection on them.

4. Attackers may also try to replay old security agreement messages. This can be prevented by providing replay protection, for example by including a time stamp in the nonce parameter and by using nonce counters.
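
The replay protection in item 4 can be sketched as follows. This is a hypothetical illustration of the general idea (a freshness window plus a strictly increasing counter per nonce), not the mechanism of any particular SIP stack; the class name, the 30-second window, and the separate `sent_at` argument (standing in for a time stamp carried in the nonce) are all assumptions.

```python
from typing import Dict


class ReplayGuard:
    """Reject messages that are stale or whose nonce count does not increase."""

    def __init__(self, max_age_seconds: float = 30.0) -> None:
        self.max_age_seconds = max_age_seconds
        self._last_count: Dict[str, int] = {}

    def accept(self, nonce: str, nonce_count: int,
               sent_at: float, now: float) -> bool:
        # Time stamp check: anything outside the freshness window is refused.
        if abs(now - sent_at) > self.max_age_seconds:
            return False
        # Nonce counter check: the count must be higher than the last one seen.
        if nonce_count <= self._last_count.get(nonce, 0):
            return False
        self._last_count[nonce] = nonce_count
        return True


guard = ReplayGuard(max_age_seconds=30.0)
print(guard.accept("abc", 1, sent_at=100.0, now=105.0))  # True: fresh, first use
print(guard.accept("abc", 1, sent_at=100.0, now=106.0))  # False: replayed count
print(guard.accept("abc", 2, sent_at=100.0, now=110.0))  # True: count increased
print(guard.accept("xyz", 1, sent_at=0.0, now=100.0))    # False: too old
```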

4. H.323 - THE INTERNATIONAL STANDARD

The H.323 signaling protocol framework is the international standard for telephony signaling over packet networks (not just the Internet). H.323 was designed for audio and video conferencing, not just for point-to-point voice conversations.

Figure 5: H.323 Protocol Hierarchy

4.1 H.225.0 CALL SIGNALING

Once the address of the remote endpoint is resolved, the endpoint will use H.225.0 Call Signaling in order to establish communication with the remote entity. H.225.0 messages are:

Setup and Setup acknowledge
Call Proceeding
Connect
Alerting
Information
Release Complete
Facility
Progress
Status and Status Inquiry
Notify

In the simplest form, an H.323 call may be established as follows.

In this example, the endpoint (EP) on the left initiated communication with the gateway on the

right and the gateway connected the call with the called party. In reality, call flows are often

more complex than the one shown, but most calls that utilize the Fast Connect procedures

defined within H.323 can be established with as few as 2 or 3 messages. Endpoints must notify

their gatekeeper (if gatekeepers are used) that they are in a call.

Once a call has concluded, a device will send a Release Complete message. Endpoints are then

required to notify their gatekeeper (if gatekeepers are used) that the call has ended.

4.2 SECURITY ISSUES FOR H.323

Firewalls pose particularly difficult problems for VoIP networks using H.323. With the exception of the “Q.931-like” H.225, all H.323 traffic is routed through dynamic ports; that is, each successive channel in the protocol is routed through a port dynamically determined by its predecessor. Call signaling is usually performed via TCP port 1720, and H.225 RAS communication with the gatekeeper, if used, takes place via UDP port 1719. With H.323 Fast Start and H.245 tunneling, only one channel (H.225 Call Signaling) is used.

This ad hoc method of allocating channels does not lend itself well to a static firewall configuration. This is particularly true of stateless firewalls that cannot comprehend H.323 traffic. These simple packet filters cannot correlate UDP transmissions and replies. This necessitates punching holes in the firewall to allow H.323 traffic to traverse the security bridge on any of the ephemeral ports it might use. This practice would introduce serious security weaknesses, because such an implementation would need to leave 10,000 UDP ports and several H.323-specific TCP ports wide open [sample configuration provided in 1].

There is thus a need for a stateful firewall that understands VoIP, specifically H.323. Such a firewall can read H.323 messages and dynamically open the correct ports for each channel as the protocol moves through its call setup process. Such a firewall must be part of the security architecture, especially in scenarios where protocol-provided security measures such as message integrity are applied. Barring this, some kind of proxy server or middlebox would have to be used. Even with a VoIP-aware firewall, parsing H.323 traffic is not a trivial matter.

NAT is particularly troublesome for VoIP systems using the H.323 call setup protocol. NAT complicates H.323 communications because the internal IP address and port specified in the H.323 headers and messages themselves are not the actual address and port numbers used externally by a remote terminal. This disrupts the “setup next” procedure used by each protocol within the H.323 suite (e.g., H.225 setting up H.245). Not only does the firewall have to comprehend this, but the VoIP application receiving these H.323 communications must also receive the correct translated address and port numbers. Thus, if H.323 is to traverse a NAT gateway, the NAT device must be able to rewrite the addresses in the control stream. With NAT, H.323 traffic must not only be read but also modified, so that the correct address and port numbers are sent to each of the endpoints.
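
The difference between a static filter and a VoIP-aware firewall can be sketched with a toy model. This is only an illustration: a real H.323-aware firewall has to parse the binary signaling messages to learn the negotiated ports, which is out of scope here, and the class and method names are hypothetical. The two static ports are the well-known ones named above; the dynamic port number is made up.

```python
from typing import Set, Tuple

Rule = Tuple[str, int]  # (transport, port)


class PinholeFirewall:
    """Toy model: static rules plus pinholes opened as signaling announces ports."""

    def __init__(self) -> None:
        # Well-known H.323 ports: call signaling and RAS.
        self._static: Set[Rule] = {("tcp", 1720), ("udp", 1719)}
        self._dynamic: Set[Rule] = set()

    def open_pinhole(self, transport: str, port: int) -> None:
        """Called when a parsed signaling message announces the next channel's port."""
        self._dynamic.add((transport, port))

    def close_all_pinholes(self) -> None:
        """Called when the call ends, so no dynamic port stays open."""
        self._dynamic.clear()

    def allows(self, transport: str, port: int) -> bool:
        rule = (transport, port)
        return rule in self._static or rule in self._dynamic


fw = PinholeFirewall()
print(fw.allows("tcp", 1720))    # True: static signaling port
print(fw.allows("udp", 30000))   # False: no pinhole yet

fw.open_pinhole("udp", 30000)    # hypothetical port learned from signaling
print(fw.allows("udp", 30000))   # True: open for the duration of the call

fw.close_all_pinholes()
print(fw.allows("udp", 30000))   # False: closed again after the call
```

Only the ports actually negotiated for a call are open, and only while the call lasts, instead of thousands of ports being left open permanently.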

5. MEDIA GATEWAY CONTROL PROTOCOL AND ITS SECURITY ISSUES

MGCP is used to communicate between the separate components of a decomposed VoIP gateway. It is a complementary protocol to SIP and H.323. Within MGCP, the MGC server or “call agent” is mandatory; it manages calls and conferences and supports the services provided.

There are no security mechanisms designed into the MGCP protocol itself. The informational RFC 2705 refers to the use of IPsec to protect MGCP messages. Without this protection, a potential attacker could set up unauthorized calls or interfere with ongoing authorized calls. Besides the use of IPsec, MGCP allows the call agent to provide gateways with session keys that can be used to encrypt the audio messages, protecting against eavesdropping. These session keys are used later for RTP encryption as described in RFC 1889, and they may be transferred between the call agent and the gateway using SDP.

6. ENCRYPTION AND IPSEC

Firewalls, gateways, and other such devices can help keep intruders from compromising a network, but firewalls are no defense against an internal attacker. Another layer of defense is necessary at the protocol level to protect the data itself. In VoIP, as in data networks, this can be accomplished by encrypting the packets at the IP level using IPsec. This way, if anyone on the network, authorized or not, intercepts VoIP traffic not intended for them (for instance via a packet sniffer), the packets will be unintelligible. The IPsec suite of security protocols and encryption algorithms is the standard method for securing packets against unauthorized viewers over data networks and is supported by the IPv6 protocol stack. Hence, it is both logical and practical to extend IPsec to VoIP, encrypting the signaling and voice packets on one end and decrypting them only when needed by their intended recipient.

However, the nature of the signaling protocols and of the VoIP network itself prevents such a simple scheme from being used, as routers, proxies and similar devices need to read the VoIP packets. In addition, several factors, including the expansion of packet size, ciphering latency, and a lack of QoS urgency in the cryptographic engine itself, can cause excessive latency in VoIP packet delivery. This leads to degraded voice quality, so once again there is a tradeoff between security and voice quality, and a need for speed. Fortunately, the difficulties are not insurmountable.

6.1 IPsec

IPsec is the preferred form of VPN tunneling across the Internet. Two basic protocols are defined in IPsec: Encapsulating Security Payload (ESP) and Authentication Header (AH) (see Figure 11). Both provide connectionless integrity, source authentication, and an anti-replay service; ESP additionally encrypts the payload. The tradeoff between ESP and AH is the added latency of encryption and decryption in ESP and ESP's “narrower” authentication, which normally does not protect the IP header “outside” the ESP header. However, when IKE is used to negotiate the security association (SA), which includes the secret symmetric keys, the addresses in the header (transport mode) or new/outer header (tunnel mode) are indirectly protected, since only the entity that negotiated the SA can encrypt/decrypt or authenticate the packets. Both protocols insert an IPsec header (and optionally other data) into the packet for purposes such as authentication.

In transport mode, the IP header and the new IPsec header are left in plain sight. So if an attacker were to intercept an IPsec packet in transport mode, they could not determine what it contained, but they could tell where it was headed, allowing rudimentary traffic analysis. On a network entirely devoted to VoIP, this would equate to logging which parties were calling each other, when, and for how long.

6.2 THE ROLE OF IPsec IN VoIP

The prevalence and ease of packet sniffing and other techniques for capturing packets on an IP-based network make encryption a necessity for VoIP. Security in VoIP is concerned with protecting both what a person says and to whom the person is speaking. IPsec can achieve both of these goals as long as it is applied with ESP in tunnel mode. This secures the identities of both endpoints and protects the voice data from unauthorized users once packets leave the corporate intranet. The incorporation of IPsec into IPv6 will increase the availability of encryption, although there are other ways to secure this data at the application level. VoIPsec (VoIP using IPsec) helps reduce the threat of man-in-the-middle attacks, packet sniffers, and many types of voice traffic analysis. Combined with the firewall implementations discussed earlier, IPsec makes VoIP more secure than a standard phone line, where people generally assume that the need for physical access to tap the line is deterrent enough. It is important to note, however, that IPsec is not a good fit for every application, so some protocols will continue to rely on their own security features.

6.3 DIFFICULTIES ARISING FROM VoIPSEC

IPsec has been included in IPv6. It is a reliable, robust, and widely implemented method of protecting data and authenticating the sender. However, several issues associated with VoIP do not apply to normal data traffic. Of particular interest are the Quality of Service (QoS) issues: latency, jitter, and packet loss. These issues arise in the VoIP environment because it is a real-time media transfer, with only 150 ms to deliver each packet. In standard data transfer over TCP, a lost packet can be resent on request. In VoIP, there is no time to do this. Packets must arrive at their destination, and they must arrive fast. Of course, the packets must also be secure during their travels, hence the introduction of VoIPsec.

The effects of VoIPsec on various QoS issues, based on results developed by Cisco, are as follows.

Delay

Processing: PCM to G.729 to packet

Encryption: ESP encapsulation + 3DES

Serialization: time it takes to get a packet out of the router; each “hop” generally has a

fixed delay.

1) IPsec overhead: about 40 bytes (depending on configuration)

2) IP header: 20 bytes

3) UDP + RTP headers: 20 bytes

4) RTP header compression: 3 bytes for IP+UDP+RTP

Effects on an 8 kbps codec (voice data: 20 bytes)

1) Clear text voice has an overhead of 3 bytes, which suggests a required

bandwidth of approximately 9 kbps

2) IPsec encrypted
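
The arithmetic behind the clear-text figure can be sketched as follows. This is a minimal illustration, assuming one 20-byte voice payload is sent every 20 ms (which is what an 8 kbps codec with 20-byte frames implies). The function name is hypothetical, and the IPsec case simply adds up the header sizes listed above without header compression; that sum is an assumption made here, not a figure taken from the Cisco results.

```python
def required_bandwidth_bps(payload_bytes, overhead_bytes, codec_bps=8000):
    """Bandwidth needed when every voice payload carries extra header bytes."""
    # The packet rate follows from the codec rate and the payload size.
    packets_per_second = codec_bps / (payload_bytes * 8)
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second


PAYLOAD = 20                     # voice data per packet, in bytes
CRTP_OVERHEAD = 3                # compressed IP+UDP+RTP header
# Assumption: IPsec overhead + IP header + UDP/RTP headers, all uncompressed.
IPSEC_OVERHEAD = 40 + 20 + 20

clear_bps = required_bandwidth_bps(PAYLOAD, CRTP_OVERHEAD)
ipsec_bps = required_bandwidth_bps(PAYLOAD, IPSEC_OVERHEAD)

print(f"clear text: {clear_bps / 1000:.1f} kbps")
print(f"IPsec:      {ipsec_bps / 1000:.1f} kbps")
```

Under these assumptions the clear-text case comes to 9.2 kbps, consistent with the "approximately 9 kbps" quoted above, while the uncompressed IPsec case comes to 40 kbps.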

6.4 ENCRYPTION/DECRYPTION LATENCY

Encryption/decryption latency is a problem for any cryptographic protocol, because much of it

results from the computation time required by the underlying encryption. With VOIP’s use of

small packets at a fast rate and intolerance for packet loss, maximizing throughput is critical.

However, this comes with a price, because although DES is the fastest of these encryption

algorithms, it is also the easiest to crack. Current rules prohibit the use of DES for protection of

US Government information. Thus, designers are once again forced to walk a fine line between

security and voice quality. Two solutions to this problem are using faster encryption algorithms

and incorporating QoS into the crypto-engine. Latency is less of a problem for management

and/or signaling data than for voice channel traffic.

6.5 SCHEDULING AND THE LACK OF QOS IN THE CRYPTO-ENGINE

The crypto-engine is a severe bottleneck in the VOIP network. As just noted, the encryption

process has a debilitating effect on QoS, but this is not the most significant factor in the

slowdown. Instead, the driving force behind the latency associated with the crypto-engine is the

scheduling algorithm for packets entering the encryption/decryption process. While routers

and firewalls take advantage of QoS to determine priorities for packets, crypto-engines provide

no support for manual manipulation of the scheduling criteria. In ordinary data traffic this is less

of an issue because far more packets pass through the router than the crypto-engine, and

time is not as essential. But in VOIP, a large number of small packets must pass through

both the crypto-engine and the router. Considering the time sensitivity of VOIP, the standard

FIFO scheduling algorithm employed in today’s crypto-engines creates a severe QoS issue.
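
The effect of FIFO scheduling on voice packets can be illustrated with a toy model. This is a minimal sketch, assuming a single crypto-engine that takes the same fixed time to encrypt every packet and a queue in which all packets are already waiting at time zero. The function name `drain` and the packet names are hypothetical, and the priority variant is shown only for contrast; it is not a feature of any particular crypto-engine.

```python
import heapq


def drain(packets, service_ms, prioritize_voice):
    """Return the time at which each voice packet leaves the crypto-engine.

    packets: list of (name, is_voice) tuples in arrival order, all queued at t=0.
    service_ms: encryption time per packet (assumed equal for all packets).
    """
    queue = []
    for order, (name, is_voice) in enumerate(packets):
        # FIFO ranks every packet equally, so arrival order alone decides.
        # The priority variant ranks voice packets ahead of data packets.
        rank = 0 if (not prioritize_voice or is_voice) else 1
        heapq.heappush(queue, (rank, order, name, is_voice))

    clock = 0.0
    finish = {}
    while queue:
        _, _, name, is_voice = heapq.heappop(queue)
        clock += service_ms
        if is_voice:
            finish[name] = clock
    return finish


backlog = [("data1", False), ("data2", False), ("data3", False), ("voice1", True)]
fifo = drain(backlog, service_ms=2.0, prioritize_voice=False)
prio = drain(backlog, service_ms=2.0, prioritize_voice=True)
print("FIFO:    ", fifo)
print("priority:", prio)
```

In this toy backlog the voice packet waits behind three data packets under FIFO and leaves after 8 ms, whereas a QoS-aware scheduler would release it after 2 ms.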

6.6 EXPANDED PACKET SIZE

IPsec also increases the size of packets in VOIP, which leads to more QoS issues. It has been

shown that increased packet size increases throughput through the crypto-engine, but to conclude

from this that increased packet size due to IPsec leads to better throughput would be fallacious.

The difference is that the increase in packet size due to IPsec does not result in an increased

payload capacity. The increase is actually just an increase in the header size due to the

encryption and encapsulation of the old IP header and the introduction of the new IP header and

encryption information. This leads to several complications when IPsec is applied to VOIP. First,

the effective bandwidth is decreased by as much as 63%. Thus connections to single users in low

bandwidth areas (e.g., via modem) may become infeasible. The bandwidth performance

reductions for various encryption algorithms are presented in. The size discrepancy can also

cause latency and jitter issues as packets are delayed by decreased network throughput or

bottlenecked at hub nodes on the network (such as routers or firewalls).

6.7 IPSEC AND NAT INCOMPATIBILITY

IPsec and NAT compatibility is far from ideal. NAT traversal completely invalidates the purpose

of AH because the source address of the machine behind the NAT is masked from the outside

world. Thus, there is no way to authenticate the true sender of the data. The same reasoning

demonstrates the inoperability of source authentication in ESP.
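
Why address rewriting defeats this kind of authentication can be shown with a simplified model. The sketch below is not the real AH packet format: it assumes, for illustration only, that the integrity check value is an HMAC over the source address, destination address, and payload. The key, the addresses, and the function names are all hypothetical.

```python
import hashlib
import hmac

KEY = b"shared-secret-for-illustration"   # hypothetical pre-shared key


def integrity_tag(src_ip, dst_ip, payload, key=KEY):
    # AH-style check value: it covers the addresses as well as the payload.
    message = src_ip.encode() + b"|" + dst_ip.encode() + b"|" + payload
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(src_ip, dst_ip, payload, tag, key=KEY):
    expected = integrity_tag(src_ip, dst_ip, payload, key)
    return hmac.compare_digest(expected, tag)


# The sender behind the NAT computes the tag over its private address.
tag = integrity_tag("10.0.0.5", "198.51.100.7", b"voice frame")

# Without NAT, the receiver sees the same addresses and the check passes.
ok_direct = verify("10.0.0.5", "198.51.100.7", b"voice frame", tag)

# The NAT rewrites the source address, so the receiver's check fails
# even though nobody tampered with the payload.
ok_after_nat = verify("203.0.113.9", "198.51.100.7", b"voice frame", tag)

print(ok_direct, ok_after_nat)
```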

7. SOLUTIONS TO THE VOIPSEC ISSUES

7.1 ENCRYPTION AT THE END POINTS

One proposed solution to the bottlenecking at the routers due to the encryption issues is to handle

encryption/decryption solely at the endpoints in the VOIP network [33]. One consideration with

this method is that the endpoints must be computationally powerful enough to handle the

encryption mechanism. But typically endpoints are less powerful than gateways, which can

leverage hardware acceleration across multiple clients. Though ideally encryption should be

maintained at every hop in a VOIP packet’s lifetime, this may not be feasible with simple IP

phones with little in the way of software or computational power. In such cases, it may be

preferable for the data to be encrypted between the endpoint and the router (or vice versa), but

unencrypted traffic on the LAN is slightly less damaging than unencrypted traffic across the

Internet. Fortunately, the increased processing power of newer phones is making endpoint

encryption less of an issue.

7.2 SECURE REAL-TIME TRANSPORT PROTOCOL (SRTP)

The Secure Real-time Transport Protocol (SRTP) is a profile of the Real-time Transport Protocol (RTP) offering

not only confidentiality, but also message authentication, and replay protection for the RTP

traffic as well as RTCP (Real-time Transport Control Protocol). SRTP provides a framework for

encryption and message authentication of RTP and RTCP streams.

SRTP provides increased security, achieved by

• Confidentiality for RTP as well as for RTCP by encryption of the respective payloads;

• Integrity for the entire RTP and RTCP packets, together with replay protection;

• The possibility to refresh the session keys periodically;

• An extensible framework that permits upgrading with new cryptographic algorithms;

• A secure session key derivation with a pseudo-random function at both ends;

• The usage of salting keys to protect against pre-computation attacks;

• Security for unicast and multicast RTP applications.
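
Two of the properties listed above, message authentication and replay protection, can be sketched in a few lines. This is a simplified illustration rather than an SRTP implementation: it assumes the authentication tag is a truncated HMAC-SHA1 over the packet bytes alone (real SRTP also covers the rollover counter), and it tracks a bare packet index in a 64-entry sliding window. The class and function names are hypothetical.

```python
import hashlib
import hmac


class ReplayWindow:
    """Sliding-window replay check over packet indices."""

    def __init__(self, size=64):
        self.size = size
        self.highest = -1   # highest index accepted so far
        self.seen = 0       # bit i set means index (highest - i) was received

    def accept(self, index):
        if index > self.highest:
            # Newer packet: slide the window forward and mark it as seen.
            shift = index - self.highest
            self.seen = ((self.seen << shift) | 1) & ((1 << self.size) - 1)
            self.highest = index
            return True
        offset = self.highest - index
        if offset >= self.size or (self.seen >> offset) & 1:
            return False    # too old, or already received
        self.seen |= 1 << offset
        return True


def auth_tag(auth_key, packet, tag_len=10):
    # Truncated HMAC-SHA1; 10 bytes (80 bits) is the default tag length in RFC 3711.
    return hmac.new(auth_key, packet, hashlib.sha1).digest()[:tag_len]


def receive(window, auth_key, index, packet, tag):
    # Reject forged packets first, so they never advance the replay window.
    if not hmac.compare_digest(auth_tag(auth_key, packet), tag):
        return False
    return window.accept(index)


key = b"k" * 20                 # hypothetical session authentication key
packet = b"rtp header + encrypted payload"
tag = auth_tag(key, packet)
window = ReplayWindow()

first = receive(window, key, 1, packet, tag)             # fresh packet
replayed = receive(window, key, 1, packet, tag)          # same index again
forged = receive(window, key, 2, packet, b"\x00" * 10)   # wrong tag
print(first, replayed, forged)
```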

7.3 BETTER SCHEDULING SCHEMES

One solution implemented in the latest routers is to schedule the packets with QoS in mind prior

to the encryption phase. QoS prioritizing can also be done after the encryption process provided

the encryption procedures preserve the ToS bits from the original IP header in the new IPsec

header. This functionality is not guaranteed and is dependent on one’s network hardware and

software, but if it is implemented it allows for QoS scheduling to be used at every hop the

encrypted packets encounter. There are security concerns any time information on the contents of

a packet is left in the clear, including this ToS-forwarding scheme, but with the sending and

receiving addresses concealed, this is not as egregious as a cursory glance would make it seem.

Still, neither the pre-encryption nor the post-encryption scheme actually implements QoS or any other

prioritizing scheme to enhance the crypto-engine’s FIFO scheduler. Speed and compactness

constraints on this device may not allow such algorithms to be applied for some time.
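
The ToS-preservation step described above can be sketched as a copy of one byte from the inner header to the outer one. This is a minimal illustration, assuming IPv4 headers without options and leaving the header checksum at zero; the function names and all addresses are hypothetical, and real IPsec gateways perform this step internally.

```python
import struct

# 20-byte IPv4 header without options: version/IHL, ToS, total length,
# identification, flags/fragment offset, TTL, protocol, checksum, src, dst.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")
PROTO_UDP = 17
PROTO_ESP = 50


def build_header(tos, payload_len, protocol, src, dst):
    return IPV4_HEADER.pack(
        0x45,                            # version 4, header length 5 words
        tos,                             # ToS / DSCP byte
        IPV4_HEADER.size + payload_len,  # total length
        0, 0,                            # identification, flags/fragment offset
        64,                              # TTL
        protocol,
        0,                               # checksum left at zero in this sketch
        src, dst,
    )


def outer_header_for(inner_packet, encrypted_len, gw_src, gw_dst):
    # Copy the ToS byte (second byte of the inner IPv4 header) so routers
    # along the tunnel can still prioritize the encrypted packet.
    inner_tos = inner_packet[1]
    return build_header(inner_tos, encrypted_len, PROTO_ESP, gw_src, gw_dst)


# Inner voice packet marked with DSCP EF (ToS byte 0xB8).
inner = build_header(0xB8, 40, PROTO_UDP, bytes([10, 0, 0, 5]), bytes([10, 0, 1, 9]))
outer = outer_header_for(inner, 96, bytes([203, 0, 113, 1]), bytes([198, 51, 100, 1]))
print(hex(outer[1]), outer[9])
```

Only the ToS byte is exposed in the outer header; the inner addresses stay inside the encrypted portion, which is why the text above treats the leak as tolerable.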

7.4 COMPRESSION OF PACKET SIZE

Compression of IPsec headers results in bandwidth usage comparable to that of plain IP. This in

turn results in considerably less jitter and latency and in better crypto-engine performance.

There is, of course, a price for these speedups. The

compression scheme puts more strain on the CPU and memory capabilities of the endpoints in

order to achieve the compression, and, of course, both ends of a connection must use the same

compression algorithm. When packets are lost, they cannot be re-sent and the endpoints need to

resynchronize. However, the time saved in the crypto-engine and the security provided may be

well worth the price of this approach.

7.5 RESOLVING NAT/IPSEC INCOMPATIBILITIES

The most likely widespread solution to the problem of NAT traversal is UDP encapsulation of

IPsec. This implementation is supported by the IETF and effectively allows all ESP traffic to

traverse the NAT. In tunnel mode, this model wraps the encrypted IPsec packet in a UDP packet

with a new IP header and a new UDP header, usually using port 500 (UDP port 4500 in the later

NAT-T standard, RFC 3948). This port was chosen because it is already used by IKE peers to communicate, so overloading the port does not

require any new holes to be punched in the firewall. This solution allows IPsec packets to

traverse standard NATs in both directions. The adoption of this standard method should allow

VOIPsec traffic to traverse NATs cleanly, although some extra overhead is added in the

encapsulation/decapsulation process. It is important to note that IP-based authentication is weak

compared with methods using cryptographic protocols.
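
The encapsulation step can be sketched as prepending an 8-byte UDP header to the ESP packet. This is a minimal illustration with hypothetical function names; the port is left as a parameter because the text above mentions port 500 while RFC 3948 deployments use UDP port 4500, and the ESP bytes here are a stand-in rather than real ciphertext.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")   # source port, destination port, length, checksum


def udp_encapsulate_esp(esp_packet, port=4500):
    length = UDP_HEADER.size + len(esp_packet)
    # Checksum is sent as zero, which RFC 3948 recommends for IPv4.
    return UDP_HEADER.pack(port, port, length, 0) + esp_packet


def udp_decapsulate_esp(datagram):
    _, _, length, _ = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return datagram[UDP_HEADER.size:length]


# Stand-in ESP packet: SPI, sequence number, then opaque encrypted bytes.
esp = struct.pack("!II", 0x1234, 1) + b"encrypted inner packet"
datagram = udp_encapsulate_esp(esp)
recovered = udp_decapsulate_esp(datagram)
print(len(datagram) - len(esp), recovered == esp)
```

The 8 bytes of UDP header are the extra per-packet overhead mentioned above. A NAT can rewrite the outer addresses and ports freely because they are not covered by the ESP integrity check.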

8. CONCLUSION

VoIP is established as the future of voice calling. Security is critical when designing,

implementing and maintaining VoIP systems. VoIP also inherits the security problems of the

traditional data network. VoIP just adds more assets, more threats, more locations and more

vulnerabilities to the data network, because of new equipment, protocols and processes on the

data network. To increase security and performance, it is recommended to use VPNs to

separate VoIP from data traffic. VoIP-specific firewalls should be deployed in the voice network to

prevent malicious attacks. Striking a balance between security and the business needs of the

organization is key to the success of VoIP development.

REFERENCES AND BIBLIOGRAPHY

(i) RFC 3329, Security Mechanism Agreement for the Session Initiation Protocol (SIP), by J. Arkko, V. Torvinen,

G. Camarillo

(ii) NIST Special Publication 800-58 on Security Considerations for Voice over IP

Systems by D. Richard Kuhn, Thomas J. Walsh, Steffen Fries.

(iii) The Morgan Kaufmann Series in Networking on The Illustrated Networks written by Walter

Goralski.

(iv) TLS from RFC 5246 by T. Dierks

(v) IP-IKE from RFC 4306 by C. Kaufman

(vi) Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R.,

Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June

2002.

(vii) Kent, S. and R. Atkinson, "Security Architecture for the Internet Protocol", RFC

2401, November 1998.

(viii) Dierks, T. and C. Allen, P. Kocher, "The TLS Protocol Version 1.0", RFC 2246,

January 1999.

(ix) Digital Telecommunications, 2008. ICDT '08. The Third International Conference on

June 29 2008-July 5 2008.

(x) Paper on Voice over IP Security on February 2008 (Government of the HKSAR).

(xi) http://en.wikipedia.org/wiki/H.323

http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=4561265
