Manual - NiNA+ Voice Test Result Description

  • A Rohde & Schwarz Company

    NiNA+Voice Measurement Description Manual

    June 2012

    SwissQual License AG Allmendweg 8 CH-4528 Zuchwil Switzerland

    t +41 32 686 65 65 f +41 32 686 65 66 e [email protected] www.swissqual.com

    Part Number: 16-100-200425 REV 1

  • SwissQual has made every effort to ensure that any instructions contained in this document are adequate and free of errors and omissions. SwissQual will, if necessary, explain issues that may not be covered by the documents. SwissQual's liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory services.

    Copyright 2000 - 2012 SwissQual AG. All rights reserved.

    No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any human or computer language without the prior written permission of SwissQual AG.

    Confidential materials.

    All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.

    When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in your text.

    SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are registered trademarks of SwissQual AG.

    Diversity Explorer, Diversity Ranger, Diversity Unattended, NiNA+, NiNA, NQAgent, NQComm, NQDI, NQTM, NQView, NQWeb, QPControl, QPView, QualiPoc Freerider, QualiPoc iQ, QualiPoc Mobile, QualiPoc Static, QualiWatch-M, QualiWatch-S, SystemInspector, TestManager, VMon, VQuad-HD are trademarks of SwissQual AG.

    SwissQual acknowledges the following trademarks for company names and products:

    Adobe, Adobe Acrobat, and Adobe Postscript are trademarks of Adobe Systems Incorporated.

    Apple is a trademark of Apple Computer, Inc.

    DIMENSION, LATITUDE, and OPTIPLEX are registered trademarks of Dell Inc.

    ELEKTROBIT is a registered trademark of Elektrobit Group Plc.

    Google is a registered trademark of Google Inc.

    Intel, Intel Itanium, Intel Pentium, and Intel Xeon are trademarks or registered trademarks of Intel Corporation.

    INTERNET EXPLORER, SMARTPHONE, TABLET are registered trademarks of Microsoft Corporation.

    Java is a U.S. trademark of Sun Microsystems, Inc.

    Linux is a registered trademark of Linus Torvalds.

    Microsoft, Microsoft Windows, Microsoft Windows NT, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

    NOKIA is a registered trademark of Nokia Corporation.

    Oracle is a registered US trademark of Oracle Corporation, Redwood City, California.

    SAMSUNG is a registered trademark of Samsung Corporation.

    SIERRA WIRELESS is a registered trademark of Sierra Wireless, Inc.

    TRIMBLE is a registered trademark of Trimble Navigation Limited.

    U-BLOX is a registered trademark of u-blox Holding AG.

    UNIX is a registered trademark of The Open Group.

  • NiNA+Voice Measurement Description Manual

    2000 - 2012 SwissQual AG

    Contents | CONFIDENTIAL MATERIALS

    ii

    Contents

    1 Introduction

    2 Listening Quality
        Introduction
        The Definition of Listening Quality
        Subjective and Objective Quality assessment
        Assessment of Intrusive-/Non-Intrusive Calls

    3 NiNA+ Network Quality Assessment
        Introduction: Why NiNA+
        Technical Background of NiNA+
        Technical requirements and performance
        Measurement results of NiNA+

    Figures
        Figure 2-1 Subjective versus objective quality assessment
        Figure 3-1 NiNA+ Listening Quality values for noise-free speech transmissions
        Figure 3-2 NiNA+ Listening Quality values in GSM connections using real handsets
        Figure 3-3 Example of NiNA+ measurements shown in NQDI
        Figure 3-4 Average NiNA+ results
        Figure 3-5 Signal Envelope [dB] (Received Speech Signal)
        Figure 3-6 Time Domain Chart (Received Speech Signal)

    Tables
        Table 3-1 Correlation coefficients between MOS values obtained in auditory tests and scores of NiNA+
        Table 3-2 Typical MOS values of auditory tests and NiNA+

    1 Introduction

    This document describes the technical background, the application scenarios, and the parameters measured with the single-ended NiNA+ voice quality measurement. The measurements were made with the SwissQual QoS Measurement System; the screenshots were taken from NQDI, the SwissQual post-processing system.

    NiNA+ assesses the quality of a signal transmitted via a telecommunications network without knowledge of the originally transmitted signal. The speech quality is determined from the output signal alone. SwissQual's NiNA+ solution can rate any connection where a self-answering far-end side plays back human speech (e.g. a weather forecast or similar). Since NiNA+ can run on the mobile unit, the radio link forms part of the tested connection. NiNA+ can also rate any fixed-line connection, including Voice over IP.

    Furthermore, the NiNA+ method is not restricted to end-to-end measurements; it can be used at any location in the transmission chain. It can be used for quality monitoring at any electrical measuring point within a real established voice link (e.g. in a VoIP gateway or at an E1/T1 interface). The calculated score reflects the true speech quality from the perspective of an end user listening with a conventionally shaped handset at this measuring point.

    2 Listening Quality

    Introduction

    For network operators and equipment manufacturers, it is important to know where and why speech quality degrades. Since listening quality is a major factor in customer satisfaction, encoding techniques must be designed for optimal speech quality. Large-scale auditory tests are commonly used to assess the quality of speech encoding techniques. However, results obtained this way are very difficult to reproduce, and they depend on the motivation of the individual test candidates. It is therefore a big advantage to have an automated method that physically measures speech quality parameters and produces results that correlate as closely as possible with subjectively acquired results.

    Listening quality is a vague term compared with bit rate, echo, or loudness. Since customer satisfaction depends directly on the quality of the transmitted speech, encoding techniques must be selected and optimized based on their listening quality.

    The Definition of Listening Quality

    Listening Quality is defined as a measure of a listener's satisfaction based on his or her experience and expectation regarding voice communication. It is generally expressed as a Mean Opinion Score (MOS). Listening Quality is usually measured with Absolute Category Rating (ACR) tests, which yield a MOS on a scale from 1 (bad) to 5 (excellent).

    The MOS is the average of many individual opinions on speech quality, obtained from a representative number of listeners. Listening quality is a complex psycho-acoustic phenomenon within the process of human perception. As such, it is a subjective measurement.

    Listening Quality is the main factor in the perceived overall quality of speech telecommunications. However, as listed below, it is only one of three dimensions that determine the overall speech quality of a telephone call:

    Listening Quality: covers the listening situation between the two calling parties, where one party is talking and the other party is listening (not active).

    Talking Quality: the quality perceived by the talker during his or her own speech activity (mainly influenced by echoes and side tones).

    Conversational Quality: the perceived overall quality in a human conversation. It combines Listening and Talking Quality with signal delay and double-talk interference.

    Detailed definitions of these dimensions and test scenarios for auditory tests can be found in the ITU-T P.800 series.

    Subjective and Objective Quality assessment

    Assessing the quality of a telecommunications network is an important instrument for achieving and maintaining the required service quality. One method involves determining the quality of a signal transmitted via the network: a test connection is established and a signal is transmitted from A to B. For audio signals, and voice signals in particular, several such intrusive or double-ended procedures are used. As the name suggests, these procedures intervene in the system under test: a transmission channel is allocated and a reference signal is transmitted along it.

    The transmitted speech signal can be collected and assessed in two ways:

    Subjective assessment: Test persons conduct subjective auditory tests, either comparing the received signal with the known reference signal or rating the received signal based on their own experience and expectation. This procedure is, however, very time consuming and therefore expensive.

    Objective assessment: An automated speech quality assessment method that either

    evaluates and rates the received signal against the known reference (double-ended, intrusive method; requires a test call), or

    evaluates and rates the received signal alone (single-ended method; this might be a test call to an answering machine or even live monitoring).

    The basic relationship between subjective/objective assessment and double-ended/single-ended methods is shown in Figure 2-1.

    Figure 2-1 Subjective versus objective quality assessment

    Assessment of Intrusive-/Non-Intrusive Calls

    In objective speech quality testing, the intrusive and non-intrusive methods can be used in several application scenarios. The test options are as follows:

    Intrusive and double-ended: Both ends of the connection are under control and a defined audio signal is transmitted over the test connection.

    Non-intrusive In-service Monitoring: Speech signals in real human conversation are assessed by parallel monitoring (e.g. at an E1/T1 interface or VoIP gateway).

    Intrusive and single-ended: A test connection is established to any answering station that plays back a voice signal (e.g. a weather forecast). The same model is applied here as for Non-intrusive In-service Monitoring.

    Intrusive and double-ended Speech Quality Assessment: Here the methods that require a known reference signal are normally applied. Both ends of the connection are under control and a pre-defined voice signal is transmitted.

    This approach generally has the disadvantage that it is necessary to intervene in the network under test. To determine the signal quality, at least one transmission channel must be occupied by the reference signal, and this channel cannot be used for data transfer during that time. In a broadcasting system such as a radio service, it is in principle possible to use the signal source to transmit test signals. However, since all channels would then be occupied and the test signal would be transmitted to all receivers, this procedure is extremely impractical. Intrusive procedures are likewise unsuitable for simultaneously monitoring the quality of a large number of transmission channels.

    The advantage of the double-ended method is that the input or reference signal is known, which allows a very accurate and detailed analysis of voice quality impairments. Each change in the signal during transmission can be detected and evaluated for its impact on perceived quality by applying psycho-acoustic models. Such models are well suited to optimization processes in laboratories as well as in real networks. They can predict even minimal degradations of the signal and can be used to compare different or similar transmission scenarios.

    Non-intrusive and single-ended Speech Quality Assessment: Models that assess speech quality without transmitting a pre-defined reference speech signal are often called non-intrusive or single-ended models. These models analyse the transmitted and possibly distorted speech without being able to compare it with a separate input or known reference signal.

    Single-ended models often look for pre-defined distortions by applying conventional signal analysis methods. They look for background noise, interruptions, frame repeats and so on. More advanced solutions try to reconstruct a reference speech signal from the distorted one and then apply psycho-acoustic comparison methods similar to those of the intrusive, double-ended methods.

    The accuracy of a single-ended approach is lower than that of an intrusive, double-ended approach. However, thanks to the advanced integrated speech extraction and the psycho-acoustic calculations, the single-ended approach is now accurate enough to be applied in real environments.

    A non-intrusive, single-ended algorithm has two basic applications:

    In-Service Monitoring: Here the speech signal of a real conversation is assessed. This can be done with a terminal or, perhaps more efficiently, on the PBX side at an E1/T1 link or even in a VoIP gateway. The advantages are two-fold:

    the ability to collect a large amount of measurement data without allocating network resources, and

    a more realistic overview of the speech quality as perceived by the subscribers, because the impact on speech quality coming from the sending side (e.g. background noise) is included in the measurement and the end result.

    NiNA+ is connected at an electrical interface, so the real acoustical environment of the listener cannot be measured. Instead, a modelled handset is applied to the signal to act as an intermediate receiving function.

    For the network operator, this quality monitoring scenario is a powerful quality reporting tool. Beyond pure quality reporting, further applications such as quality-based routing or quality-based billing are possible.

    Intrusive and single-ended Quality: Here a test connection has to be established, but the far-end side is not required to play back a pre-defined signal. This is an advantage because there is no need to install a dedicated answering station. The model works with any speech signal from the far end, for example public numbers such as the weather forecast or the time service. This is especially helpful for monitoring multi-link connections to other providers or other countries. A test system has to be installed only at the listening side. Furthermore, network providers can monitor their own voice-based announcement services for possible impairments or accessibility problems.

    NiNA+ is SwissQual's solution for predicting MOS-LQO with a single-ended approach. It includes signal pre-processing and calculates additional parameters such as the causes of quality degradation and the noise and speech levels. As a stand-alone solution, NiNA+ is a complete suite for non-intrusive listening quality assessment.

    3 NiNA+ Network Quality Assessment

    Introduction: Why NiNA+

    SwissQual developed NiNA as early as 2001. The main structures of NiNA also form an integral part of ITU-T P.563, which was developed in a joint process and approved in 2003. However, ITU-T P.563 is a very complex model that cannot be integrated into low-performing platforms such as mobile operating systems or DSP solutions.

    Given the progress in transmission technologies and the experience gained, SwissQual decided to largely rebuild its own single-ended model. Since both the methods used in NiNA and its performance were improved significantly, the resulting solution was renamed NiNA+. The name shows the relationship within SwissQual's family of measurements but also signals the step forward.

    Like NiNA, NiNA+ predicts a MOS value on the well-known 1 to 5 point scale. NiNA+ takes into account the full range of distortions occurring in public switched telephone networks and predicts the speech quality on the perception-based MOS-LQO scale according to ITU-T Recommendation P.800.1.

    In addition, NiNA+ re-uses and extends the so-called cause analysis, which gives detailed technical information about the reason for a quality degradation. New in NiNA+ is a signal classification, which lets NiNA+ itself decide whether the signal is speech or not. NiNA+ also includes a plausibility check of the signal to be evaluated. Consequently, mis-predictions are avoided for signals that do not fulfil the requirements, such as silence or non-speech signals.

    As expected from a single-ended measurement approach, NiNA+ also provides additional information about the speech signal, such as speech and noise level, interruptions and clippings.

    Technical Background of NiNA+

    As mentioned in the previous chapter, SwissQual's SQuad-LQ (a so-called double-ended method) compares a high-quality reference signal to the degraded signal on the basis of a perceptual model. In contrast, NiNA+ predicts the Listening Quality without any knowledge of the input reference signal.

    The NiNA+ approach can be pictured as a human expert listening to a real call through a test device, such as a conventional handset, connected to the line in parallel. This picture also matches the main application and helps the user interpret the scores produced by NiNA+.

    First, a filter removes signal parts outside the telephone band, and the active voice parts are identified. Based on this voice activity detection (VAD), the signal and noise levels are calculated.

    The next analysis step detects and scores the unnaturalness of the speech, using models of and expectations about human speech signals. Furthermore, interruptions, clippings, saturations and bandwidth limitations are analysed.

    Finally, a set of quality-describing characteristics is calculated and mapped to the MOS-LQO.

    The cause analysis and the signal classification are also based on these characteristics.

    Technical requirements and performance

    SwissQual's NiNA+ solution runs on the Windows 32-bit platform. It requires only a speech signal with 8000 Hz sampling frequency as input. Thanks to SwissQual's consistent run-time optimization, the complete calculation takes only 0.25% of the speech sample duration on a state-of-the-art Pentium 4 processor (2.6 GHz) [1]. For comparison, it runs nearly 100 times faster than ITU-T P.563 and more than 20 times faster than SwissQual's speed-optimized solution for P.563.

    [1] Requires Intel CPUs. 50% active speech assumed.

    SwissQual's NiNA+ solution runs on the Windows 32-bit platform, but its low complexity also makes NiNA+ an ideal component for low-performing platforms such as mobile phone operating systems and digital signal processors.

    The NiNA+ method places some requirements on the speech signal to be assessed in order to avoid false predictions or malfunctions.

    Sampling Frequency:

    The sampling frequency has to be 8000 Hz and a linearly quantized PCM signal (16 bit) is required. Conversion from other formats is not part of the algorithm itself and has to be done separately. SwissQual's QoS measurement systems do this conversion automatically, so no further work is needed from the customer.

    Speech Sample Length:

    A sample length between 5 and 20 seconds is recommended. The signal length is checked by SwissQual's QoS system. A defined sample length below 5 seconds is not accepted. A sample length above 20 seconds results in a warning message, and the sample is truncated at 20 seconds. The speech activity should be at least 25% of the sample and more than 3 seconds in total, and it should not exceed 90% (especially for short samples).

    Minimum Speech Activity:

    The main requirement is the minimum amount of active speech in the file. To obtain accurate results, the speech signal should contain at least 3 seconds of active speech. Otherwise the processing might lead to wrong results, because the balance between voiced and unvoiced sections is no longer given. Even for auditory tests with human listeners, a minimum speech activity of 4 seconds is recommended. To avoid a malfunction, the configuration of the measurement probe does not allow a speech sample length below 5 seconds to be defined. Nevertheless, the active speech in a sample may still fall below the minimum speech activity. Consequently, SwissQual's QoS system is configured not to process speech samples with less than 3 seconds of active speech; a warning message is displayed instead.
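    The length and activity rules above can be summarised in a short sketch. The function name, return format and message texts are hypothetical and are not part of SwissQual's software; only the thresholds (5 s, 20 s, 3 s of active speech, 25% to 90% activity) come from the text. The sketch assumes the active speech duration was measured within the portion of the sample that is actually used.

```python
def check_sample(duration_s, active_speech_s):
    """Apply the documented length and activity rules to one speech sample.

    Returns (accepted, usable_duration_s, messages).
    Hypothetical helper: only the thresholds are taken from the manual.
    """
    messages = []
    # Samples shorter than 5 s are not accepted at all.
    if duration_s < 5.0:
        return False, 0.0, ["rejected: sample shorter than 5 s"]
    # Samples longer than 20 s are truncated with a warning.
    usable = duration_s
    if duration_s > 20.0:
        usable = 20.0
        messages.append("warning: sample truncated to 20 s")
    # Less than 3 s of active speech: the sample is not processed.
    if active_speech_s < 3.0:
        messages.append("warning: less than 3 s active speech, not processed")
        return False, usable, messages
    # Activity outside 25..90 % is only a recommendation, so just note it.
    activity_pct = 100.0 * active_speech_s / usable
    if not 25.0 <= activity_pct <= 90.0:
        messages.append("note: speech activity outside recommended 25-90 %")
    return True, usable, messages
```

    For example, a 25-second sample with 10 seconds of active speech would be accepted, truncated to 20 seconds, and reported with a truncation warning.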

    Speech Level:

    NiNA+ accepts active speech levels in the range from -16 dBov down to -45 dBov. Higher levels lead to annoying clipping of the higher amplitudes. If the high speech level is caused by the network under test, it should be reflected in the quality score. If the clipping is caused by the measurement interface, however, it leads to artificial quality impacts.

    Likewise, measurements with a low speech level have a reduced signal-to-noise ratio because of the limited digital resolution of the A/D converter used in the measurement environment. This also leads to additional quality impacts. SwissQual's QoS system ensures proper level adjustment for all supported cellular phones and ISDN/PSTN cards. Only in transparent mode with arbitrary terminals does the customer have to control the level adjustment. For that reason, speech levels outside the recommended range are highlighted in red when the results are analyzed in SwissQual's NQDI data interface. Please note that files with a speech level below -65 dBov are not analyzed, and a warning message is displayed.
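    As an illustration, the level rules described here could be expressed as a small classification function. The function name and the returned labels are hypothetical; only the limits (-16 dBov, -45 dBov and -65 dBov) are taken from the text.

```python
def classify_speech_level(level_dbov):
    """Classify an active speech level (in dBov) against the documented limits.

    Hypothetical helper; the labels are illustrative only.
    """
    # Files below -65 dBov are not analyzed at all.
    if level_dbov < -65.0:
        return "not analyzed"
    # The accepted range runs from -16 dBov down to -45 dBov.
    if -45.0 <= level_dbov <= -16.0:
        return "ok"
    # Anything else is analyzed but flagged (shown in red in NQDI).
    return "out of range"
```

    A level of -26 dBov would be classified as "ok", while -10 dBov (too loud) and -50 dBov (too quiet) would both be flagged as "out of range".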

    Accuracy of predicted Listening Quality:

    The accuracy of the NiNA+ model was evaluated using large speech databases covering the complete scope of today's public switched telephone networks.

    The performance against well-known databases from the ITU-T set is shown below. Because of the target applications of SwissQual's QoS system, a strong focus was placed on outstanding performance in real live network connections, such as the real GSM test with handset variations listed below. The numbers are the correlation coefficients between the MOS values obtained in the auditory tests and the scores predicted by NiNA+. A third-order mapping was applied before calculating the correlation. The results compare the NiNA+ performance with the current ITU-T standard P.563.

    Table 3-1 Correlation coefficients between MOS values obtained in auditory tests and scores of NiNA+

    Speech Database                           ITU-T P.563    NiNA+
    Suppl. 23, Exp. 1 Am. English             0.902          0.905
    Suppl. 23, Exp. 1 Japanese                0.842          0.918
    Suppl. 23, Exp. 3 Am. English             0.916          0.857
    Suppl. 23, Exp. 3 Japanese                0.929          0.903
    Real GSM handsets, different positions    0.895          0.925
    Real GSM Background Noises                (not given)    0.935
    Real VoIP                                 (not given)    0.950
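    The evaluation described above (a third-order mapping followed by a correlation coefficient) can be sketched as follows. This is a minimal illustration in plain Python, not SwissQual's or ITU-T's evaluation code: the function names are assumptions, the mapping is an ordinary least-squares cubic fit, and no monotonicity constraint is enforced on the fitted polynomial.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_cubic(x, y):
    """Least-squares coefficients c0..c3 of y ~ c0 + c1*x + c2*x^2 + c3*x^3."""
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(v ** (i + j) for v in x) for j in range(4)] for i in range(4)]
    b = [sum(w * v ** i for v, w in zip(x, y)) for i in range(4)]
    # Gaussian elimination with partial pivoting.
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 4):
            f = a[row][col] / a[col][col]
            for k in range(col, 4):
                a[row][k] -= f * a[col][k]
            b[row] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * 4
    for i in range(3, -1, -1):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, 4))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs

def correlation_after_mapping(predicted, subjective):
    """Map predicted scores onto the subjective scale with a cubic, then correlate."""
    c = fit_cubic(predicted, subjective)
    mapped = [c[0] + c[1] * p + c[2] * p ** 2 + c[3] * p ** 3 for p in predicted]
    return pearson(mapped, subjective)
```

    The mapping compensates for systematic offsets between one listening test and the model's scale, so the resulting coefficient reflects how well the model ranks and spaces the conditions rather than how well the absolute values agree.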

    [Figure: scatter plot of NiNA+ MOS scores versus auditory test results. Horizontal axis: Auditory Test (MOS), 1.0 to 5.0; vertical axis: NiNA+, 1.0 to 5.0. Database: ITU-T Suppl. 23 Exp. 1, American English; r = 0.905]

    Figure 3-1 NiNA+ Listening Quality values for noise-free speech transmissions

    The database shown in Figure 3-1 is taken from the G.729 characterization phase of ITU-T and consists of a wide range of existing codecs and combinations thereof. The results are given on a so-called per-condition basis, which means that the results of four samples transmitted through the same application scenario were averaged.
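    Per-condition averaging as described here can be sketched in a few lines. The function name and the input format (pairs of a condition identifier and a score) are assumptions made for the illustration.

```python
from collections import defaultdict

def per_condition_means(scores):
    """Average scores per condition.

    scores: iterable of (condition_id, mos) pairs, e.g. four samples per condition.
    Returns a dict mapping each condition_id to the mean score of its samples.
    """
    groups = defaultdict(list)
    for condition, mos in scores:
        groups[condition].append(mos)
    return {condition: sum(values) / len(values) for condition, values in groups.items()}
```

    Averaging over the samples of one condition reduces the influence of individual talkers and sentences, which is why per-condition correlations are typically higher than per-sample correlations.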

    [Figure: scatter plot of NiNA+ MOS scores versus auditory test results. Horizontal axis: Auditory Test (MOS), 1.0 to 5.0; vertical axis: NiNA+, 1.0 to 5.0. Database: Real GSM handsets, different positions; r = 0.925]

    Figure 3-2 NiNA+ Listening Quality values in GSM connections using real handsets

    The database shown in Figure 3-2 is taken from a subjective test performed for ITU-T within the P.563 competition phase. It was organized by SwissQual at the Deutsche Telekom Laboratories in Berlin. Unlike the common ITU-T databases, which use simulated speech files, this test contains speech recordings from real GSM circuits. The speech signals were fed into the handset microphone by an artificial mouth in different acoustical environments.

    Measurement results of NiNA+

    The following figures and results were taken from SwissQual's post-processing tool NQDI. However, the same set of results is available when SwissQual's NiNA+ solution is applied in other environments.

    After the measurement results are imported into NQDI, they can be analyzed as shown in Figure 3-3, which gives a complete overview of all results obtained. In addition to the calculated parameters, the signal envelope and the signal in the time domain are presented graphically.

    Please note that sequences with no or too little speech activity are also analyzed, but they are flagged separately within SwissQual's QoS systems: instead of the results, the information Silence or Speech Activity too low is presented.

    Figure 3-3 Example of NiNA+ measurements shown in NQDI

    Typically, the value of most interest to users is the Listening Quality value obtained by applying NiNA+. In line with ITU-T Recommendation P.800.1, it is called MOS-LQO, where LQO stands for Listening Quality Objective. The MOS-LQO is defined in the range 1 to 5, where 1 stands for bad and 5 for excellent speech quality. In real measurements, the value will rarely exceed 4.5.

    In addition to the MOS-LQO, further analysis is possible using the average section shown in Figure 3-4.

    Figure 3-4 Average NiNA+ results

    The following values are presented in the average section:

    MOS-LQO provided by NiNA+
    Speech Level in dB OVL
    Noise Level in dB OVL
    Static SNR in dB
    Amplitude Clipping in %
    Speech Activity in %
    DC Offset in %
    Pitch frequency in Hz
    Main Signal Distortion
    Signal Class

    The MOS-LQO is the main result of the analysis and summarizes the quality in a single number. To give a feeling for the results that can be expected, the following table lists results obtained by analyzing speech coded with typical speech codecs.

    Table 3-2 Typical MOS values of auditory tests and NiNA+

    Codec               Typical MOS-LQS (Auditory Test)    Typical MOS-LQO (NiNA+)
    G.711               4.3                                4.4
    G.729               3.8                                3.8
    G.728               3.7                                3.7
    G.726 (32 kbit/s)   3.9                                3.8
    GSM-FR              3.5                                3.2
    GSM-EFR             3.9                                3.8

    Speech Activity is the following ratio, expressed as a percentage:

    Speech Activity = Number of speech frames / Total number of frames * 100

    If this value is 50%, the number of speech-active frames equals the number of silent frames. The higher the value, the higher the speech density in the input signal. As mentioned above, NiNA+ can deal with a range of 20 to 90%. A minimum of 3 s of active speech is required for both approaches. Speech Activity and Speech Level are calculated by an internal voice activity detection; the results are similar, but not identical, to the ITU-T P.56 Active Speech Level.
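    The ratio above translates directly into code. In this sketch the per-frame speech/silence decisions are assumed to come from some voice activity detector; the function name and input format are illustrative and do not describe the internal NiNA+ implementation.

```python
def speech_activity_percent(frame_is_speech):
    """Speech Activity in percent from per-frame VAD decisions.

    frame_is_speech: sequence of booleans, one per frame (True = active speech).
    """
    total = len(frame_is_speech)
    if total == 0:
        raise ValueError("no frames given")
    speech_frames = sum(1 for flag in frame_is_speech if flag)
    return 100.0 * speech_frames / total
```

    With equally many speech and silent frames, the function returns 50.0, which matches the example given in the text.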

    Speech Level shows the r.m.s. level of all frames containing active speech. Because silent intervals and speech pauses are not considered, it is a good measure of the actual speech level control in the channel. The Speech Level is presented in dB relative to the Overload Point (32768 for 16-bit quantization) and is close to the Active Speech Level according to ITU-T P.56.

    Noise Level is an estimate of the background noise floor. It is calculated mainly from the noise occurring in speech pauses. The Noise Level is the r.m.s. level in dB relative to the Overload Point (32768 for 16-bit quantization) and is spectrally unweighted (linear filter response), except for a weak telephony bandpass.

    Static SNR gives brief information about the signal-to-noise ratio of the signal. It is the ratio between the active speech level and the estimated background noise floor.
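    As a rough illustration of how these three values relate, the following Python sketch computes an r.m.s. level in dB relative to the Overload Point and derives a static SNR as the difference of two such levels. It assumes that the samples have already been split into active speech and speech pauses; the internal voice activity detection and the weak telephony bandpass are not modelled, and all function names are hypothetical.

```python
import math

OVERLOAD_POINT = 32768.0  # overload point for 16-bit quantization


def rms_level_db(samples):
    """Return the r.m.s. level in dB relative to the overload point."""
    if not samples:
        raise ValueError("at least one sample is required")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square / OVERLOAD_POINT ** 2)


def static_snr_db(speech_samples, pause_samples):
    """Ratio of active speech level to noise floor, in dB (level difference)."""
    return rms_level_db(speech_samples) - rms_level_db(pause_samples)


# Synthetic example: "speech" at one tenth of the overload point,
# "noise" at one thousandth of it.
speech = [3276.8, -3276.8] * 100
noise = [32.768, -32.768] * 100
print(round(rms_level_db(speech), 1))          # speech level
print(round(rms_level_db(noise), 1))           # noise level
print(round(static_snr_db(speech, noise), 1))  # static SNR
```

    With these synthetic inputs the speech level is about -20 dB, the noise level about -60 dB, and the static SNR about 40 dB.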

    Pitch Frequency represents the pitch frequency of the input signal in Hz. For a speech signal, the Pitch Frequency is the fundamental oscillation of the talker's vocal tract. Typical pitch frequencies are in the range 100 to 200 Hz for female speakers and 50 to 150 Hz for male speakers. Even if the Pitch Frequency lies outside the telephone pass-band, it can be recovered by analyzing the harmonic oscillations at higher frequencies.
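    The recovery of a pitch that lies below the pass-band can be illustrated with a simple autocorrelation search. This is a generic textbook technique used here as an assumption; the manual does not state which method NiNA+ uses. The search range of 50 to 400 Hz, the function names and the synthetic test signal are likewise only illustrative.

```python
import math


def autocorrelation(x, lag):
    """Unnormalized autocorrelation of x at the given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def estimate_pitch_hz(samples, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate the pitch as the lag with the strongest autocorrelation."""
    mean = sum(samples) / len(samples)
    x = [s - mean for s in samples]            # remove any DC component
    min_lag = int(sample_rate / fmax)          # shortest period searched
    max_lag = min(int(sample_rate / fmin), len(x) - 1)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(x, lag))
    return sample_rate / best_lag


# Synthetic voiced sound with a 100 Hz pitch whose fundamental is missing:
# only the harmonics at 300 to 800 Hz are present, as behind a telephone
# bandpass.
sample_rate = 8000
f0 = 100.0
samples = [
    sum(math.sin(2.0 * math.pi * k * f0 * n / sample_rate) for k in range(3, 9))
    for n in range(1600)
]
print(estimate_pitch_hz(samples, sample_rate))
```

    Although the signal contains no energy at 100 Hz, the harmonics all repeat with a period of 10 ms, so the estimate lands at about 100 Hz.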

    DC Offset: This number shows the constant component of the input signal as a percentage. The DC Offset does not influence the quality score because it is neither transmitted by the transducer in the terminal nor perceived by the human ear. However, a DC Offset above a certain amount (> 0.5 %) indicates problems in the terminal interface or in the transmission channel itself.
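    A minimal sketch of such a value is shown below. It assumes that the percentage refers to the Overload Point (32768 for 16-bit quantization), which the text does not state explicitly; the function name is hypothetical.

```python
OVERLOAD_POINT = 32768.0  # assumed reference for the percentage


def dc_offset_percent(samples):
    """Return the mean sample value as a percentage of the overload point."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(samples) / len(samples) / OVERLOAD_POINT * 100.0


# A symmetric signal shifted by 327.68 (1 % of the overload point).
samples = [327.68 + v for v in (1000.0, -1000.0) * 50]
offset = dc_offset_percent(samples)
print(round(offset, 3), abs(offset) > 0.5)
```

    The comparison against 0.5 % mirrors the amount mentioned in the paragraph above.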

    Amplitude Clipping: The latest versions of NiNA+ present the Amplitude Clipping as a separate value; in this case the corresponding label is enabled. The value is a rough estimate of the amount of amplitude clipping. Because no reference signal is available and the hard saturation in the time signal might be affected by filtering, this figure reacts only to severe, detectable clipping.

    Signal Class classifies the analysed signal as one of the following:

    Clean speech
    Noisy speech
    No speech

    In the case of No speech, no MOS-LQO is calculated; only the signal level is reported.

    Problem code shows a possible cause for the speech degradation.


    It is possible to see more than one cause (code) in the average section. There are eight different problem codes:

    Background noise is signalized if the Noise Level is higher than -50 dB or the Static SNR is below 20 dB.
    Modulated Noise is signalized when the segmental SNR falls below a defined multi-dimensional threshold. It mainly indicates waveform speech codecs.
    Interruptions is set if one or more signal interruptions are detected in the speech signal.
    Level problem is signalized if the signal level exceeds the nominal level by more than 10 dB. Likewise, it is signalized if the signal level falls 12 dB below the nominal level. The nominal speech level is -26 dBov (dB relative to the digital overload point).
    DC Offset is signalized when the DC offset of the speech signal exceeds the predefined thresholds of +/- 0.2 %.
    Amplitude clipping is signalized if the saturation of the signal leads to significant distortions.
    Restricted Audio Bandwidth is signalized if a significant limitation relative to the expected telephone band (300 to 3400 Hz) is detected.
    NotSpecified signalizes that the speech quality is degraded but no outstanding reason for the degradation could be classified.

    OK shows that the speech quality is nearly non-degraded.
    Silence and LowSpeechActivity are also signalized, but no MOS-LQO is calculated in these cases.
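    The three problem codes that are defined by plain numeric thresholds can be sketched as follows. This is only an illustration of the thresholds quoted above, not the NiNA+ classifier: the codes that rely on internal analysis (Modulated Noise, Interruptions, Amplitude clipping, Restricted Audio Bandwidth, NotSpecified) are left out, and the function and parameter names are hypothetical.

```python
NOMINAL_SPEECH_LEVEL_DBOV = -26.0  # nominal speech level from the text


def simple_problem_codes(speech_level_db, noise_level_db, static_snr_db,
                         dc_offset_percent):
    """Return the threshold-based problem codes for one set of average values."""
    codes = []

    # Background noise: Noise Level above -50 dB or Static SNR below 20 dB.
    if noise_level_db > -50.0 or static_snr_db < 20.0:
        codes.append("Background noise")

    # Level problem: more than 10 dB above, or 12 dB below, the nominal level.
    deviation = speech_level_db - NOMINAL_SPEECH_LEVEL_DBOV
    if deviation > 10.0 or deviation <= -12.0:
        codes.append("Level problem")

    # DC Offset: outside the +/- 0.2 % thresholds.
    if abs(dc_offset_percent) > 0.2:
        codes.append("DC Offset")

    return codes


print(simple_problem_codes(-26.0, -70.0, 44.0, 0.0))
print(simple_problem_codes(-10.0, -45.0, 35.0, 0.3))
```

    The first call describes a clean signal at nominal level and yields an empty list; the second is too loud, too noisy and carries a DC offset, so all three codes are returned. Returning a list reflects that more than one code can appear at the same time.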

    The next step in the analysis is done by looking at the signal envelope as well as by listening to the live recordings.

    Analyzing Envelope of Received Signal:

    The signal envelope is presented graphically. It gives the experienced user visual information on amplitude clipping, background noise and interruptions. The locations of interruptions are marked separately by vertical lines. At the top of each line, the detected length of the interruption is printed in ms (Figure 3-5).

    [Chart "Time Domain": Envelope [dBov] (0 to -90) over Time [s] (0.00 to 6.00); series Envelope and Interruptions, with interruptions of 82 ms, 107 ms and 71 ms marked]

    Figure 3-5 Signal Envelope [dB] (Received Speech Signal)
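    To make the idea of an envelope with marked interruptions more concrete, the following Python sketch computes a short-term level in dB relative to the overload point and reports runs of very low level together with their length in ms. Everything here is an assumption for illustration: the 1 ms frame length, the -70 dB threshold, the 20 ms minimum length and the function names are hypothetical, and the sketch does not distinguish interruptions from natural speech pauses as a real detector must.

```python
import math

OVERLOAD_POINT = 32768.0  # overload point for 16-bit quantization


def envelope_dbov(samples, sample_rate, frame_ms=1.0, floor_db=-90.0):
    """Return one r.m.s. level per frame in dB relative to the overload point."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000.0))
    envelope = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(s * s for s in frame) / frame_len
        if mean_square > 0.0:
            level = 10.0 * math.log10(mean_square / OVERLOAD_POINT ** 2)
        else:
            level = floor_db
        envelope.append(max(level, floor_db))
    return envelope


def find_interruptions(envelope, frame_ms=1.0, threshold_db=-70.0, min_ms=20.0):
    """Return (start_ms, length_ms) for each run of frames below the threshold."""
    found = []
    run_start = None
    for i, level in enumerate(list(envelope) + [0.0]):  # sentinel ends a final run
        if level < threshold_db:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length_ms = (i - run_start) * frame_ms
            if length_ms >= min_ms:
                found.append((run_start * frame_ms, length_ms))
            run_start = None
    return found


def tone(n_samples, sample_rate=8000, freq=1000.0, amplitude=10000.0):
    """Synthetic test tone standing in for speech."""
    return [amplitude * math.sin(2.0 * math.pi * freq * n / sample_rate)
            for n in range(n_samples)]


# 200 ms of tone, an 82 ms gap of digital silence, then 200 ms of tone.
signal = tone(1600) + [0.0] * 656 + tone(1600)
print(find_interruptions(envelope_dbov(signal, 8000)))
```

    For this synthetic signal the sketch reports a single interruption that starts at 200 ms and lasts 82 ms.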

    The chart below presents the signal in the common time-domain format (Figure 3-6). Here, too, the experienced user can obtain information such as peaks and amplitude clipping.

    [Chart "Coded Sample": Level (-30'000 to 30'000) over Time [s] (0.00 to 6.00)]

    Figure 3-6 Time Domain Chart (Received Speech Signal)

    Furthermore, the NQDI presentation sheet allows the received sample to be played back with the default or a specified audio player, and offers several options for exporting the results into external tables or text documents.
