Fault and Performance Management for Next Generation IP Communication

Alan Clark, Telchemy

Outline

• Problems affecting VoIP performance

• Tools for Measuring and Diagnosing Problems

• Protocols for Reporting QoS

• Performance Management Architecture

• What to ask for/integrate?

Enterprise VoIP Deployment

[Figure: enterprise deployment with a branch office, a teleworker, IP phones and a gateway connected over an IP VPN]

VoIP Deployment - Issues

[Figure: the same deployment (IP phones, IP VPN, gateway) annotated with problem sources]

• Echo

• Access link congestion

• LAN congestion, duplex mismatch, long cables….

• Route flapping, link failure

• Codec distortion

Call Quality Problems

• Packet Loss

• Jitter (Packet Delay Variation)

• Codecs and PLC

• Delay (Latency)

• Echo

• Signal Level

• Noise Level

Packet Loss and Jitter

[Figure: packets cross the IP network into the jitter buffer and codec. Packets lost in the network and packets discarded due to jitter both lead to distorted speech.]

Routers, Loss and Jitter

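[Figure: router model. Arriving packets pass through an input queue, a prioritize/route stage and an output queue. Labels: queuing delay and processing delay at the input side; queuing delay and serialization delay at the output side; voice packet delayed by one or more data packets; packet loss due to buffer overflow or RED.]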

Queuing Delays

[Figure: maximum delay (mS, 0-200) versus transmission speed (kbit/s, 0-2000), with curves for 1 x, 2 x and 3 x 1500 byte MTU]

Added delay due to waiting for data packets to be sent = jitter
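The curves on this slide can be approximated with simple arithmetic. The sketch below assumes the added wait is only the serialization time of full 1500 byte data packets sent ahead of the voice packet; the function names are illustrative and not from the slides.

```python
def serialization_delay_ms(packet_bytes: int, link_kbps: float) -> float:
    """Time to clock one packet onto the link, in milliseconds."""
    # bits divided by kbit/s gives milliseconds directly
    return packet_bytes * 8 / link_kbps


def max_wait_ms(data_packets_ahead: int, link_kbps: float, mtu_bytes: int = 1500) -> float:
    """Worst-case wait behind a number of full-size data packets."""
    return data_packets_ahead * serialization_delay_ms(mtu_bytes, link_kbps)


if __name__ == "__main__":
    for kbps in (250, 500, 1000, 2000):
        waits = [round(max_wait_ms(n, kbps), 1) for n in (1, 2, 3)]
        print(f"{kbps} kbit/s: {waits} ms for 1, 2, 3 packets")
```

On slow access links even one full-size data packet adds tens of milliseconds, which is why the effect is most visible below about 1 Mbit/s.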

Jitter

[Figure: delay (mS, 50-150) versus time (seconds, 0-2)]

Average jitter level (PPDV) = 4.5mS

Peak jitter level = 60mS
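As a rough illustration of how an average and a peak jitter figure can be derived from per-packet delays, the sketch below assumes PPDV means the mean absolute difference between the delays of consecutive packets, and takes the peak as the largest excursion above the minimum delay. Both definitions are assumptions made for this example; the slides do not define them.

```python
def ppdv_ms(delays_ms):
    """Mean absolute packet-to-packet delay variation (assumed definition)."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


def peak_jitter_ms(delays_ms):
    """Largest excursion above the minimum observed delay (assumed definition)."""
    return max(delays_ms) - min(delays_ms) if delays_ms else 0.0


if __name__ == "__main__":
    delays = [50, 52, 51, 110, 50]  # hypothetical one-way delays in ms
    print(ppdv_ms(delays), peak_jitter_ms(delays))
```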

WiFi can also cause jitter

[Figure: delay (mS, 0-300) and RSSI plotted against time]

Effects of Jitter

• Low levels of jitter absorbed by jitter buffer

• High levels of jitter
  o lead to packets being discarded
  o cause adaptive jitter buffer to grow - increasing delay but reducing discards

• If packets are discarded by the jitter buffer as they arrive too late they are regarded as “discarded”

• If packets arrive extremely late they are regarded as “lost” hence sometimes “lost” packets actually did arrive
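The distinction between "discarded" and "lost" can be sketched as a simple classification at a fixed jitter buffer. The thresholds `buffer_ms` and `late_limit_ms` are illustrative assumptions, and a real adaptive buffer would change `buffer_ms` over time.

```python
from collections import Counter
from typing import Optional


def classify_packet(lateness_ms: Optional[float],
                    buffer_ms: float = 40.0,
                    late_limit_ms: float = 300.0) -> str:
    """Classify one packet by how late it arrived relative to its expected time.

    lateness_ms is None when the packet never arrived at all.
    """
    if lateness_ms is None or lateness_ms > late_limit_ms:
        return "lost"       # never arrived, or so late it is treated as lost
    if lateness_ms > buffer_ms:
        return "discarded"  # arrived, but after its playout time
    return "played"


if __name__ == "__main__":
    arrivals = [5.0, 12.0, 55.0, None, 8.0, 450.0, 30.0]  # hypothetical values
    print(Counter(classify_packet(a) for a in arrivals))
```

Note that the packet arriving 450 ms late is counted as "lost" even though it did reach the endpoint, which matches the last bullet above.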

Packet Loss

[Figure: 500mS average packet loss rate (0-50) versus time (seconds, 30-70)]

Average packet loss rate = 2.1%

Peak packet loss = 30%
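The gap between the average and peak figures comes from measuring loss over short windows. A minimal sketch, assuming 20 ms packets (so 25 packets per 500 ms window) and a hypothetical loss pattern:

```python
def windowed_loss_rates(lost_flags, packets_per_window=25):
    """Loss rate for each consecutive window of packets."""
    rates = []
    for start in range(0, len(lost_flags), packets_per_window):
        window = lost_flags[start:start + packets_per_window]
        rates.append(sum(window) / len(window))
    return rates


if __name__ == "__main__":
    # One clean window, then one window losing every fifth packet.
    flags = [False] * 25 + ([True] + [False] * 4) * 5
    rates = windowed_loss_rates(flags)
    print("per-window:", rates)
    print("average:", sum(flags) / len(flags), "peak:", max(rates))
```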

Packet Loss is bursty

• Packet loss (and packet discard) tends to occur in sparse bursts - say 20-30% in density and one second or so in length

• Terminology
  o Consecutive burst
  o Sparse burst
  o Burst of Loss vs Loss/Discard

Example Packet Loss Distribution

[Figure: burst weight (packets, 0-200) versus burst length (packets, 0-500), with reference lines for consecutive loss and for 20 percent burst density (sparse burst)]
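Sparse bursts can be located in a loss trace by grouping losses that are separated by fewer than Gmin received packets, which is the burst definition used in RFC 3611 (default Gmin of 16). The sketch below follows that idea in simplified form; the function name and the returned fields are illustrative, and an isolated single loss is treated as part of a gap rather than as a burst.

```python
def find_bursts(lost, gmin=16):
    """Group lost/discarded packets into bursts.

    lost is a sequence of booleans, True meaning lost or discarded.
    """
    groups = []
    for i, is_lost in enumerate(lost):
        if not is_lost:
            continue
        # received packets since the previous loss = i - previous_index - 1
        if groups and i - groups[-1][-1] - 1 < gmin:
            groups[-1].append(i)
        else:
            groups.append([i])
    bursts = []
    for g in groups:
        if len(g) < 2:
            continue  # isolated loss, counted as gap loss
        length = g[-1] - g[0] + 1
        bursts.append({"start": g[0], "length": length,
                       "lost": len(g), "density": len(g) / length})
    return bursts


if __name__ == "__main__":
    trace = [False] * 100
    for i in (20, 25, 30, 35, 40, 80):  # hypothetical loss positions
        trace[i] = True
    print(find_bursts(trace))
```

In this hypothetical trace the five losses between packets 20 and 40 form one sparse burst of roughly 24 percent density, while the loss at packet 80 stays in the gap.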

Loss and Discard

• Loss is often associated with periods of high congestion

• Jitter is due to congestion (usually) and leads to packet discard

• Hence Loss and Discard often coincide

• Other factors can apply - e.g. duplex mismatch, link failures etc.

Example Loss/Discard Distribution

[Figure: burst weight (packets, 0-200) versus burst length (packets, 0-500)]

Leads To Time Varying Call Quality

[Figure: two panels against time (0-18). Upper panel: MOS (1-5). Lower panel: voice and data bandwidth (kbit/s, 0-500). Periods of high jitter/loss/discard are marked.]

Packet Loss Concealment

• Mitigates impact of packet loss/ discard by replacing lost speech segments

• Very effective for isolated lost packets, less effective for bursty loss/discard

• But isn’t loss/discard bursty?

• Need to be able to deal with 10-20-30% loss!!!

[Figure label: Estimated by PLC]

Effectiveness of PLC

[Figure: ACR MOS (1-5) versus packet loss/discard rate (0-20) for G.711 no PLC, G.711 PLC and G.729A. Annotations: codec distortion; impact of loss/discard and PLC.]

Effect of Delay on Conversational Quality

[Figure: MOS score (1-5) versus round trip delay (0-600 milliseconds), with curves for 55dB and 35dB Echo Return Loss]

Causes of Delay

[Figure: two endpoints, each with a CODEC, echo control, RTP, UDP/TCP and IP layers. Delay components along the path: external delay; accumulate and encode; network delay; jitter buffer, decode and playout.]

Cause of Echo

[Figure: IP network connected through a gateway with an echo canceller; line echo and acoustic echo paths are shown]

Round trip delay - typically 50mS+

Additional delay introduced by VoIP makes existing echo problems more obvious

Also - “convergence” echo

Echo problems

• Echo with very low delay sounds like “sidetone”

• Echo with some delay makes the line sound hollow

• Echo with over 50mS delay sounds like…. Echo

• Echo Return Loss
  o 55dB or above is good
  o 25dB or below is bad

Signal Level Problems

Temporal Clipping occurs with VAD or Echo Suppressors -- gaps in speech, start/end of words missing

Amplitude Clipping occurs -- speech sounds loud and “buzzy”

0 dBm0

-36 dBm0

Noise

• Noise can be due to
  o Low signal level
  o Equipment/encoding (e.g. quantization noise)
  o External local loops
  o Environmental (room) noise

• From a service provider perspective - how to distinguish between
  o Room noise (not my problem)
  o Network/equipment/circuit noise (is my problem)

Measuring VoIP performance

• Active Test - measure test calls
  o VoIP specific: VQmon, ITU G.107
  o Analog signal based: ITU P.862 (PESQ)

• Passive Test - measure live calls
  o VoIP specific: VQmon, ITU P.VTQ
  o Analog signal based: ITU P.563

“Gold Standard” - ACR Test

• Speech material
  o Phonetically balanced speech samples 8-10 seconds in length
  o Test designed to eliminate bias (e.g. presentation order different for each listener)
  o Known files included as anchors (e.g. MNRU)

• Listening conditions
  o Panel of listeners
  o Controlled conditions (quiet environment with known level of background noise)

Example ACR test results

• Extract from an ITU subjective test

• Mean Opinion Score (MOS) was 2.4

• 1=Unacceptable

• 2=Poor

• 3=Fair

• 4=Good

• 5=Excellent

[Figure: histogram of votes (0-50) for each opinion score (1-5)]
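A MOS is simply the vote-weighted mean of the opinion scores. The sketch below uses hypothetical vote counts, since the actual counts behind the 2.4 result cannot be read from the slide.

```python
def mean_opinion_score(votes):
    """votes maps an opinion score (1-5) to the number of listeners who gave it."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("no votes")
    return sum(score * count for score, count in votes.items()) / total


if __name__ == "__main__":
    hypothetical_votes = {1: 10, 2: 20, 3: 10, 4: 0, 5: 0}
    print(mean_opinion_score(hypothetical_votes))
```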

Packet based approaches

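[Figure: packet based test configurations. Active: a VoIP test system places a test call across the IP network to another VoIP test system and measures the call. Passive: a passive test function observes a live call between VoIP end systems. Algorithms: VQmon, G.107, P.VTQ.]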

Packet based approaches

• ITU G.107: R = Ro - Is - Ie - Id + A
  o Really a network planning tool
  o Missing many essential monitoring features

• VQmon
  o ITU G.107 + ETSI TS 101 329-5 Annex E +…….
  o Proprietary but widely used (Superset of G.107 & P.VTQ)

• ITU P.VTQ
  o Available late 2005, very limited functionality
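The G.107 relation above can be turned into a small calculation. This sketch combines the impairment terms exactly as written on the slide and then applies the R to MOS mapping published in ITU-T G.107; the input values are illustrative only and are not taken from the slides.

```python
def r_factor(ro: float, is_: float, ie: float, id_: float, a: float = 0.0) -> float:
    """R = Ro - Is - Ie - Id + A, as written on the slide."""
    return ro - is_ - ie - id_ + a


def r_to_mos(r: float) -> float:
    """Map an R factor to an estimated MOS using the G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    r = r_factor(ro=93.0, is_=0.0, ie=11.0, id_=2.0)  # illustrative inputs
    print(r, round(r_to_mos(r), 2))
```

Because the mapping is nonlinear, the same Ie penalty costs more MOS in the middle of the R range than near the top.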

Extended E Model - VQmon

[Figure: arriving packets enter the jitter buffer and CODEC. Discarded packets and loss/discard events feed a 4 state Markov model that gathers detailed packet loss info in real time. A metrics calculation stage also takes signal level, noise level and echo level, and outputs call quality scores and diagnostic data.]

Modeling transient effects

[Figure: measured call quality and user reported call quality versus time (seconds, 10-35), with Ie(gap), Ie(burst) and Ie(VQmon) levels marked]

VQmon - computational model

[Figure: block diagram. Inputs: burst loss rate, gap loss rate, signal level, noise level, echo, delay. Blocks: Ie mapping; perceptual model; recency model; calculate Ro, Is; calculate Id; calculate R-LQ, MOS-LQ; calculate R-CQ, MOS-CQ. Blocks are attributed to ETSI TS 101 329-5 and ITU-T G.107.]

Accuracy: Non-bursty conditions

[Figure: Comparison of VQmon vs ACR MOS - iLBC 15.2k. MOS score (1-5) versus packet loss rate (%, 0-20). Series: ACR MOS, VQmon MOS-LQ.]

[Figure: Comparison of VQmon vs PESQ - iLBC 15.2k. PESQ score (1-4) versus packet loss rate (%, 0-30). Series: PESQ, VQmon MOS-PQ.]

[Figure: estimated MOS (1.5-4) versus ACR MOS (1.5-4)]

Accuracy: Bursty conditions

• G.107
  o Well established model for network planning
  o No way to represent jitter
  o Few codec models
  o Inaccurate for bursty loss
  o Conversational Quality only

• VQmon
  o Extended G.107
  o Transient impairment model
  o Wide range of codec models
  o Narrow & Wideband
  o Jitter Buffer Emulator
  o Listening and Conversational Quality

[Figure: comparison of VQmon and E Model for severely time varying conditions]

Signal based approaches

[Figure: a P.862 tester sends a test call between VoIP end systems across the IP network; a P.563 tester is attached to a call between VoIP end systems]

P.862 is an Active Test Approach

P.563 is a Passive Test Approach

ITU P.862 - Active testing

[Figure: PESQ processing. Audio files sent across the tested segment of the connection are time aligned, transformed (FFT…) and compared to produce a PESQ score.]

ITU P.862 - Active testing

• Send speech file

• Compare received file with original using FFT

• Takes typically 50-100 MIPS per call

• MOS-like score in the range -0.5 to 4.5

• Widely used within the industry

[Figure: PESQ scores (1-4) versus packet loss rate (0-40)]

Results for G.729A codec for a set of speech files (i.e. for each packet loss rate the only thing changed is the speech source file)

ITU P.563 - Passive monitoring

• Analyses received speech file (single ended)

• Produces a MOS score

• Correlates well with MOS when averaged over many calls

• Requires 100MIPS per call

[Figure: ACR MOS (1-5) versus P.563 score (1-5)]

Comparison of P.563 estimated MOS scores with actual ACR test scores. Each point is the average per file ACR MOS with 16 listeners compared to the P.563 score.

Performance Monitoring - Passive Test

[Figure: embedded monitoring function reporting via RTCP XR and SIP QoS Report]

SLA Monitoring - Active Test

[Figure: active test functions exchanging a test call]

Active or Passive Testing?

• Active testing
  o Works for pre-deployment testing and on-demand troubleshooting

• But!!!!
  o IP problems are transient

• Passive monitoring
  o Monitors every call made - but needs a call to monitor
  o Captures information on transient problems
  o Provides data for post-analysis

• Therefore - you need both

VoIP Performance Management Framework

[Figure: a VoIP endpoint and a VoIP gateway, each with embedded monitoring (VQ), exchange an RTP stream (possibly encrypted) with media path reporting (RTCP XR). A network probe, analyzer or router with embedded monitoring (VQ) sits on the path. Signaling based QoS reporting goes to the call server and CDR database; SNMP reporting goes to the network management system.]

VoIP Performance Management Framework

• Embedded monitoring function in IP phones, residential gateways….
  o Close to the user
  o Least cost + widest coverage

• Protocol support developed
  o RTCP XR (RFC3611), SIP, MGCP, H.323, Megaco
  o Draft SNMP MIB

• Works in encrypted environments

• Already being deployed by equipment vendors

The role of RTCP XR

RTCP XR (RFC3611)

1. Provides a useful set of metrics for VoIP performance monitoring and diagnosis

2. Supports both real time monitoring and post-analysis

3. Extracts signal level, noise level and echo level from DSP software in the endpoint

4. Exchanges info on endpoint delay and echo to allow remote endpoint to assess echo impact

5. Provides midstream probes/ analyzers access to analog metrics if secure RTP is used

6. Goes through firewalls………

RFC3611 - RTCP XR

Loss Rate | Discard Rate | Burst Density | Gap Density
Burst Duration (mS) | Gap Duration (mS)
Round Trip Delay (mS) | End System Delay (mS)
Signal Level | RERL | Noise Level | Gmin
R Factor | Ext R | MOS-LQ | MOS-CQ
Rx Config | - | Jitter Buffer Nominal
Jitter Buffer Max | Jitter Buffer Abs Max

SIP Service Quality Reporting Event

PUBLISH sip:collector@example.com SIP/2.0
Via: SIP/2.0/UDP pc22.example.com;branch=z9hG4bK3343d7
………
Content-Type: application/rtcpxr
Content-Length: ...

VQSessionReport
LocalMetrics:
TimeStamps=START:10012004.18.23.43 STOP:10012004.18.26.02
SessionDesc=PT:0 PD:G.711 SR:8000 FD:20 FPP:2 PLC:3 SSUP:on
CallID=1890463548@alice.uac.chicago.com
………
Signal=SL:2 NL:10 RERL:14
QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3
QoEEstAlg:VQMonv2.1
DialogID:38419823470834;to-tag=8472761;from-tag=9123dh311
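A collector receiving a report like the one above needs to split each metrics line into its fields. The sketch below handles only the `Name=KEY:value KEY:value` lines exactly as they appear on this slide; it is not a parser for any finalized report format, and the function name is illustrative.

```python
def parse_report_line(line: str):
    """Split 'Name=K1:v1 K2:v2' into ('Name', {'K1': 'v1', 'K2': 'v2'})."""
    name, _, rest = line.partition("=")
    fields = {}
    for token in rest.split():
        key, _, value = token.partition(":")  # split on the first colon only
        fields[key] = value
    return name, fields


if __name__ == "__main__":
    report_lines = [
        "SessionDesc=PT:0 PD:G.711 SR:8000 FD:20 FPP:2 PLC:3 SSUP:on",
        "Signal=SL:2 NL:10 RERL:14",
        "QualityEst=RLQ:90 RCQ:85 EXTR:90 MOSLQ:3.4 MOSCQ:3.3",
    ]
    report = dict(parse_report_line(line) for line in report_lines)
    print(report["QualityEst"])
```

Values are kept as strings so that the collector can decide per field whether to convert them to numbers.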

RTCP XR MIB

Session table

Basic parameters

Call quality metrics

History table

Alerting

Passive Monitoring Framework

[Figure: the enterprise deployment (branch office, teleworker, IP phones, gateway, IP VPN) with VQ monitoring functions embedded throughout, reporting to the NMS via RTCP XR, SIP QoS Report and SNMP]

What to Implement/ Ask For

• Embedded monitoring functionality in IP Phones and Gateways (e.g. VQmon)

• RTCP XR for mid-call data exchange between endpoints

• SIP Service Quality Events for reporting end of call quality

• RTCP XR MIB for SNMP support

Summary

• Problems affecting VoIP performance

• Tools for Measuring and Diagnosing Problems

• Protocols for Reporting QoS

• Performance Management Architecture

• What to ask for/integrate?
