2nd Speech Quality Test Event for Voice over IP

Frank Kettler, H.W. Gierlich, Frank Rosenberger; HEAD acoustics GmbH, Herzogenrath

Harald Klaus*, Jens Berger*, Oliver Eisfeldt*; *T-Systems Nova GmbH, Berlin

Philippe Cousin**, Jean-Luc Freisse**; **ETSI, Sophia Antipolis (F)

1. Introduction

The European Telecommunications Standards Institute (ETSI) organized the 2nd ETSI TIPHON VoIP (Voice over Internet Protocol) Speech Quality Test Event in Sophia Antipolis, France, in April 2002. T-Systems Nova GmbH Berkom and HEAD acoustics GmbH performed speech quality measurements on VoIP equipment from different manufacturers. Alcatel co-sponsored the test event.

The tests conducted during the 2nd Test Event were designed by the two test labs to assess one-way speech transmission quality as well as double talk situations, background noise transmission performance, packet loss concealment (PLC) implementations and echo cancelling characteristics. Manufacturers were invited to bring their VoIP equipment for detailed speech quality evaluations. The tests were carried out anonymously. Each manufacturer received its own results plus the results of all other participants in anonymized form [1].

Within ETSI, two bodies are actively involved in this event: ETSI Project TIPHON (Telecommunications and Internet Protocol Harmonization Over Networks) and ETSI Technical Committee STQ (Speech Processing, Transmission and Quality Aspects). Both work to improve quality aspects of voice transmission.

2. Test Setup and Test Conditions

Manufacturers participating in the Test Event could bring IP gateways or IP terminals. Gateways were measured using an electrical access (see Fig. 1). The quality tests for the IP phones were carried out on the acoustical interface using two HATS (Head and Torso Simulators according to ITU-T Recommendation P.58 [2]).

Fig. 1: Measurement Scenario for Electrical - Electrical Measurements

(Diagram labels: HEAD acoustics test system ACQUA (input, output); T1 / E1 / BRI; gateways; IP; NISTNet (packet loss, delay); packet monitor; ALCATEL PBX 4400; ISDN DSS1.)

Condition   Packet Loss (Equal)   Additional Delay (1)   Delay Variation
1a          0                     0                      No
2a          1%                    0                      No
3a          2%                    0                      No
4a          3%                    0                      No
5a          5%                    0                      No
6a          1%                    50 ms                  20 ms (2)

Table 1: Network Conditions for Electrical - Electrical Measurements of Listening Quality using Speech Samples
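As an illustration of how the conditions in Table 1 could be applied to a packet stream, the following Python sketch draws a loss decision for each packet and adds the configured delay. It is not the NISTNet configuration used in the event. The names `CONDITIONS` and `impair`, the reading of "Equal" as independent per-packet loss with a fixed probability, and the modeling of delay variation as a uniform spread around the additional delay are all assumptions made for this example.

```python
import random

# Network conditions from Table 1:
# (packet loss probability, additional delay in ms, delay variation in ms)
CONDITIONS = {
    "1a": (0.00, 0.0, 0.0),
    "2a": (0.01, 0.0, 0.0),
    "3a": (0.02, 0.0, 0.0),
    "4a": (0.03, 0.0, 0.0),
    "5a": (0.05, 0.0, 0.0),
    "6a": (0.01, 50.0, 20.0),
}


def impair(condition, n_packets, seed=0):
    """Return one entry per packet: its delay in ms, or None if it is lost.

    Assumptions: losses are independent from packet to packet, and the
    delay variation is a uniform spread of +/- variation around the
    additional delay.
    """
    loss, delay, variation = CONDITIONS[condition]
    rng = random.Random(seed)  # seeded so that a run can be repeated
    packets = []
    for _ in range(n_packets):
        if rng.random() < loss:
            packets.append(None)  # packet dropped
            continue
        jitter = rng.uniform(-variation, variation) if variation else 0.0
        packets.append(max(0.0, delay + jitter))
    return packets
```

Calling `impair("6a", 1000)` would return 1000 entries, most of them delays between 30 ms and 70 ms under the assumptions above, with lost packets marked as `None`.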

For the measurements two kinds of input signals were used: speech samples designed according to ITU-T Recommendation P.800 [3] and test signals according to ITU-T Recommendation P.501 [4]. The input signals were transmitted and recorded simultaneously, so exact delay assessment was possible. The test setup for the connection of two gateways is given in Figure 1.

The estimation of one-way speech quality is based on real speech samples using the methods PESQ according to ITU-T Recommendation P.862 [5] and TOSQA [6]. Note that TOSQA had already been used during the 1st ETSI VoIP Speech Quality Test Event [7].

3. Test Results

Figure 2 contains test results measured on gateway configurations. The results were averaged separately for the different types of codecs (e.g. G.711, G.729) and for the different test conditions. The individual system settings (e.g. packet length, VAD on or off) may differ depending on the manufacturer's implementation. Results from at least three participants were required for averaging in order to guarantee the anonymity of each manufacturer involved. Three or more IP gateway implementations were tested with the G.711 and G.729 codecs; consequently, only these results are analyzed here.

The G.711 codec implementations used a packet length of 20 ms or, in a few cases, 10 ms. Voice activity detection (VAD) was switched on for some tests and off for others. All G.711 codec implementations under test included packet loss concealment (PLC).

The G.729 codec was chosen as a second common implementation by most of the participants, so sufficient measurement data were available to provide averaged and anonymized analyses. These implementations used a 20 ms packet length or, in some cases, a 10 ms packet length. Packet loss concealment was included; voice activity detection was switched on or off depending on the individual setting chosen by the manufacturers in the specific test sessions.

Figure 2 shows the averaged PESQ and TOSQA scores in direct comparison. It demonstrates the small, constant offset between PESQ and TMOS and also the high similarity of the results.
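The averaging rule described above (scores are grouped by codec and test condition, and an average is reported only when at least three participants contributed) can be sketched in Python as follows. This is a minimal illustration and not the evaluation tooling used by the test labs. The function name `anonymized_averages` and the tuple layout of the input are assumptions.

```python
from collections import defaultdict

MIN_PARTICIPANTS = 3  # threshold stated in the text


def anonymized_averages(results, min_participants=MIN_PARTICIPANTS):
    """Average scores per (codec, condition) group.

    `results` is an iterable of (manufacturer, codec, condition, score)
    tuples (an assumed layout). A group is reported only if at least
    `min_participants` different manufacturers contributed to it.
    """
    scores = defaultdict(list)
    participants = defaultdict(set)
    for manufacturer, codec, condition, score in results:
        key = (codec, condition)
        scores[key].append(score)
        participants[key].add(manufacturer)
    return {
        key: sum(values) / len(values)
        for key, values in scores.items()
        if len(participants[key]) >= min_participants
    }
```

Counting distinct manufacturers rather than individual results means that several implementations from the same participant do not by themselves make a group reportable.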

(Figure: Electrical - Electrical: PESQ, TMOS. Score axis from 1.0 to 5.0; conditions 1a to 6a for G.711 / 10 & 20 ms / VAD on/off, PLC on and for G.729 / 10 & 20 ms / VAD on/off, PLC on. Further labels: PABX, G.711, G.729, G.723 @ 6.3, G.723 @ 5.3.)

Fig. 2: PESQ and TOSQA Results

These results were averaged over all manufacturers and all implementations. Each manufacturer received its individual result and can therefore compare the quality of its current implementation with those of the other manufacturers.


In order to provide additional analysis for optimization purposes during the consulting part of the test event, detailed tests were carried out by HEAD acoustics using specific test signals according to ITU-T Recommendation P.501 [4]. The following analyses concentrate on the PLC and jitter buffer implementations in order to identify their specific, audible disturbances. The test signal used is a periodic repetition of a voiced sound (duration 5 s). Two kinds of analysis are applied:

• Relative Approach analysis [8], a psychoacoustic analysis method based on a hearing model [9], to analyze audible disturbances in the time and frequency domain [10], and

• cross correlation analysis between the transmitted signal and the original test signal to show the technical implementation of PLC and jitter buffer design.
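The cross correlation analysis in the second item can be illustrated with a short Python sketch. It estimates, frame by frame, the lag at which the received signal best matches the original test signal. A constant lag suggests a stable delay, a jump in the lag suggests a jitter buffer adjustment, and a low score suggests that the frame content was replaced, for example by a PLC substitution. This is not the analysis implemented in the test system. The function name `frame_lags`, the frame length and the search range are assumptions. Because the test signal is periodic, the lag can only be determined unambiguously within one period.

```python
import math


def frame_lags(reference, received, frame_len, max_lag):
    """Return a list of (lag, score) tuples, one per reference frame.

    For each frame of `reference`, the lag in samples (0..max_lag) that
    maximizes the normalized cross correlation with `received` is
    reported together with the correlation score (1.0 = identical shape).
    """
    results = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        ref_energy = sum(x * x for x in ref)
        best_lag, best_score = None, float("-inf")
        for lag in range(max_lag + 1):
            seg = received[start + lag:start + lag + frame_len]
            if len(seg) < frame_len:
                break  # ran past the end of the received signal
            seg_energy = sum(y * y for y in seg)
            denom = math.sqrt(ref_energy * seg_energy)
            # A silent frame has no defined correlation; score it as 0.
            score = sum(x * y for x, y in zip(ref, seg)) / denom if denom > 0 else 0.0
            if score > best_score:
                best_lag, best_score = lag, score
        results.append((best_lag, best_score))
    return results
```

The direct summation keeps the sketch free of dependencies. For signals of several seconds at telephone sampling rates, an FFT-based correlation would be the more practical choice.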

In Figures 3 and 4 the time signals are shown in the upper window of each figure, the results derived from the Relative Approach analysis in the middle window and the cross correlation in the lower window. Figure 3 shows the analysis result for a PBX as reference without IP gateways connected. In the example in Figure 4, the substitution of the lost packets leads to audible disturbances, mainly in the lower frequency range, as demonstrated by the Relative Approach analysis. This indicates that the signal interpolation between the transmitted test signal and the substituted packet could be optimized.

Fig. 3: Reference PBX connection

Fig. 4: PLC implementation (G.711 codec)

The combination of these analysis methods (cross correlation, Relative Approach) and their specific results, together with the PESQ and TMOS values and the listening examples provided on CD for each manufacturer, may help to improve the current implementations.

The tests of the implemented echo cancellers under single and double talk conditions were carried out in different setups: a complete 4-wire scenario with infinite ERL, and setups with a simulated echo path attenuation of 40 dB ERL and 6 dB ERL. The echo path simulation was provided by the measurement frontend MFE VI controlled by the test system ACQUA (HEAD acoustics). The setup is shown in Figure 5 together with a specific test signal consisting of a periodic repetition of Composite Source Signal bursts. As a reference connection, the same tests were carried out with the PBX stand-alone.
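To make the three echo path settings concrete: an attenuation of ERL dB corresponds to a linear amplitude factor of 10^(-ERL/20), so 40 dB gives a factor of 0.01, 6 dB gives about 0.5, and infinite ERL gives no echo at all. The Python sketch below applies such a factor to a far-end signal. It is a simplified illustration, not the echo path simulation of the measurement frontend: a real echo path also has an impulse response, whereas this example assumes pure attenuation plus an optional delay. The names `erl_to_gain` and `echo_path` are hypothetical.

```python
import math


def erl_to_gain(erl_db):
    """Convert an echo attenuation in dB to a linear amplitude factor."""
    if math.isinf(erl_db):
        return 0.0  # infinite ERL: complete 4-wire scenario, no echo
    return 10.0 ** (-erl_db / 20.0)


def echo_path(far_end, erl_db, delay_samples=0):
    """Return the echo of `far_end`, attenuated by `erl_db` and delayed.

    Assumption: the echo path is modeled as a flat attenuation and a
    pure delay, without any frequency dependence.
    """
    gain = erl_to_gain(erl_db)
    return [0.0] * delay_samples + [gain * sample for sample in far_end]
```

With 6 dB the echo stays at roughly half the far-end amplitude, which makes it the more demanding of the two simulated settings for the echo canceller under test.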

(Figure: test setup with the test system ACQUA and the measurement frontend MFE VI (USB; InRCV, OutRCV, InSND, OutSND), echo path settings ERL 6 dB, ERL 40 dB and infinite, ISDN simulators, PBXs, gateways, IP network with NISTNet (packet loss, delay), EC under test.)

Fig. 5: Echo Cancellers – Double Talk Performance

The analysis result for the PBX reference connection (Figure 6) demonstrates that all signal bursts applied at the near end (see Figure 5, where these bursts are colored green in the test signal) are completely transmitted. There is no level difference between the original signal and the transmitted signal, as could be expected for the reference connection.

At the beginning and at the end of the analyzed double talk sequence in Figure 7, the signal bursts are either completely transmitted or attenuated by approximately 6 dB. In the middle of the sequence, the signal bursts applied with the lower levels are clipped. Note that the test signal level in the receiving direction is very high during this middle part, whereas the near end signal level is low. It can be expected that clipping also occurs with real speech for this implementation.

Fig. 6: PBX Reference Connection DT Performance

Fig. 7: DT Performance IP Gateway Connection
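The level comparison behind the double talk results can be sketched in Python as follows. The sketch computes, for each near-end burst, the difference in RMS level between the original and the transmitted signal. A value near 0 dB corresponds to a completely transmitted burst, a value near 6 dB to the attenuation mentioned above, and a very large value to a burst that was removed. It is an illustration only and assumes that both signals are already time-aligned and that the burst boundaries are known. The names `level_db` and `burst_attenuation` are hypothetical.

```python
import math


def level_db(samples):
    """RMS level in dB relative to an amplitude of 1.0 (-inf for silence)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def burst_attenuation(original, transmitted, bursts):
    """Return the attenuation in dB for each burst.

    `bursts` is a list of (start, end) sample index pairs (end exclusive).
    Assumption: `original` and `transmitted` are time-aligned, so the
    same indices address the same burst in both signals. Each burst must
    be non-silent in `original`, otherwise the difference is undefined.
    """
    return [
        level_db(original[start:end]) - level_db(transmitted[start:end])
        for start, end in bursts
    ]
```

A burst that is fully suppressed yields an infinite attenuation, so any classification threshold for "clipped" would have to be chosen by the analyst.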

Results of other manufacturers including more detailed analysis can be found in the published anonymized test report [1].

4. Summary

This test event was very successful and useful, as indicated by the feedback form from the participating manufacturers. Analysis methods such as PESQ and TOSQA, which lead to a condensed result under single talk conditions, proved extremely useful in combination with the detailed analysis methods described above, which help to optimize the implementations and parameters responsible for these results. Moreover, the assessment of all conversational aspects and of background noise transmission quality gives useful hints for optimizing the systems to provide better speech quality under all conversational conditions.

5. Literature

[1] Anonymized Report of the 2nd ETSI TIPHON VoIP Speech Quality Test Event, April 2002, Sophia Antipolis, France

[2] ITU-T Rec. P.58, Head and Torso Simulators for Telephonometry

[3] ITU-T Rec. P.800, Methods for Subjective Determination of Transmission Quality

[4] ITU-T Rec. P.501, Test Signals for Use in Telephonometry

[5] ITU-T Rec. P.862, Perceptual Evaluation of Speech Quality (PESQ), an Objective Method for End-to-end Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs

[6] ITU-T Contribution, Results of objective speech quality assessment including receiving terminals using the advanced TOSQA2001, COM 12-20-E, December 2000, T-Nova Deutsche Telekom Innovationsgesellschaft mbH Berkom

[7] Gierlich, H.W.; Kettler, F.; Berger, J.; Klaus, H.; Kliche, I.; Scheerbarth, Th., Report of 1st ETSI VoIP Speech Quality Test Event, ETSI EP TIPHON # 22, 21.–28.03.2001, Bethesda, USA

[8] Genuit, K., Objective Evaluation of Acoustic Quality Based on a Relative Approach, InterNoise '96, Liverpool, UK

[9] Sottek, R., Modelle zur Signalverarbeitung im menschlichen Gehör, PhD thesis, RWTH Aachen, 1993

[10] Kettler, F.; Gierlich, H.W.; Rosenberger, F., Application of the Relative Approach to Optimize Packet Loss Concealment Implementations, DAGA 2003, 18.–20.03.2003, Aachen
