
Evolutionary Fuzzy Volume Tuner for Cellular Phones

Suthikshn Kumar

Communication and Embedded Systems Division, Larsen & Toubro Infotech Limited,

4th Floor, #2, Church Street, Bangalore 560001, India Web: www.lntinfotech.com

Email: [email protected]

ABSTRACT

This paper proposes an Evolutionary Fuzzy Volume Tuner (EFVT) for 2G, 2.5G and 3G cellular phones, based on fuzzy logic, for improving voice quality in the presence of background noise. The EFVT makes use of the noise level and the noise class information generated by a system for fuzzy pattern classification of background noise. The EFVT is personalized by using the audiogram to design the fuzzy rule base. An evolutionary algorithm is used to tune the input/output scaling factors and the membership functions and to optimize the fuzzy rule base of the volume tuner. The design and simulation of the fuzzy volume tuner are discussed along with the implementation details. The FuzzyControl++ tool is used for the simulation of the EFVT.

1. INTRODUCTION

Whenever we have a conversation on the phone and the background noise level is high, we either ask the speaker at the other end to speak louder or we increase the volume. Also, at high background noise levels, users tend to bring the mobile phone very close to their ears. The QoS is improved by providing an Evolutionary Fuzzy Volume Tuner in the cellular phone, which intelligently changes the volume level based on the background noise level and class. Background noise levels can be high in buses, trains, planes, markets, sporting venues, etc. Several methods have been investigated for background noise classification [13][14]. Background noise classification information is very useful and can be used for dynamically adapting the acoustic volume level to suit the particular type of noise. To improve the intelligibility of sounds in different noise environments, volume levels can be adjusted automatically. Car, bus and train noises fall into the low-frequency noise category. The spectrum of some classes of noise remains constant in time (stationary noise), whereas that of others varies suddenly (non-stationary noise). The EFVT for mobile phones adjusts the volume according to the background noise level and noise class by means of a fuzzy system. The inputs to the fuzzy volume controller are the noise level derived from the Voice Activity Detector (VAD) present within the speech codec, the current volume level, and the noise class derived from a system for fuzzy pattern classification of background noise. By intelligently adjusting the volume level, the QoS is improved for both stationary and non-stationary background noise in mobile environments. In this paper, we refer to a mobile phone that uses the EFVT as a Smart Cellular Phone (SCP).

Hearing loss in individuals can be gradual, and the quality of hearing can vary from person to person. Hearing loss can also make it difficult for individuals to understand speech in the presence of background noise. Hence the EFVT needs to be personalized according to the individual's hearing requirements. For a person with hearing loss, the EFVT rule base is designed based on the audiogram. The evolutionary algorithm is used to tune the input-output scaling factors and the membership functions and to optimize the fuzzy rule base of the Fuzzy Volume Tuner (FVT).

This paper is organized as follows. The next section presents the details of the EFVT for cellular phones. Section 3 gives the simulation results for the SCP obtained with the FuzzyControl++ tool. Section 4 presents the summary and conclusions.

2. Evolutionary Fuzzy Volume Tuner

Lotfi Zadeh introduced fuzzy sets in 1965[1]. More details on fuzzy sets and fuzzy system applications can be found in the literature [2, 3, 21]. Detailed psychoacoustic studies have demonstrated the dependence of loudness on the frequencies and durations of speech signals[20]. Masking has also been observed between tones that are close in frequency, with lower-frequency tones masking higher-frequency tones. Thus the intelligent adjustment of the volume level depending on the background noise level and class will improve the speech quality in mobile environments.

The Smart Cellular Phone adjusts the volume according to the background noise level and noise class by means of an evolutionary fuzzy system. The inputs to the EFVT are the noise level derived from the Voice Activity Detector (VAD)[9, 10] present within the speech codec, the noise class obtained from a fuzzy system for background noise classification[13], and the current volume level. Linguistic terms are defined for the background noise levels, the volume levels and the volume level changes. A fuzzy rule base is then developed and used in building the fuzzy controller. The fuzzy set values for the volume level change (VLC) consist of the following linguistic terms:

• LN: Large Negative
• MN: Medium Negative
• SN: Small Negative
• ZE: Zero
• SP: Small Positive
• MP: Medium Positive
• LP: Large Positive

The volume level of a speech channel refers to the average amplitude at which it generates speech. Volume levels are expressed as numerical values from 0 (silence) to a maximum value on a linear scale. The fuzzy set values for volume level (VL) are {Very Low, Low, Normal, High, Very High}.
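As an illustration of how the linguistic terms for VL might be realised, the sketch below fuzzifies a crisp volume level with triangular membership functions. The 0 to 10 scale, the breakpoints and the names `triangle`, `VL_TERMS` and `fuzzify_volume` are assumptions made for this example; the paper does not specify the shapes or ranges of its membership functions.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak.

    left == peak or peak == right gives a shoulder at the edge of the scale.
    """
    if x <= left or x >= right:
        return 1.0 if x == peak else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical breakpoints on an assumed 0..10 linear volume scale.
VL_TERMS = {
    "Very Low": (0.0, 0.0, 2.5),
    "Low": (0.0, 2.5, 5.0),
    "Normal": (2.5, 5.0, 7.5),
    "High": (5.0, 7.5, 10.0),
    "Very High": (7.5, 10.0, 10.0),
}


def fuzzify_volume(level):
    """Map a crisp volume level to a membership degree for each VL term."""
    return {term: triangle(level, *abc) for term, abc in VL_TERMS.items()}
```

With these assumed breakpoints, neighbouring terms overlap so that a level between two peaks belongs partly to both, which is what allows the rule base to blend adjacent rules smoothly.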

SEVENTH ONLINE WORLD CONFERENCE ON SOFT COMPUTING IN INDUSTRIAL APPLICATIONS (WSC7) 2

2.1 Measuring Background Noise Levels

Several techniques have been proposed for measuring noise levels[27, 28]. The technique proposed for the EFVT is to use the Voice Activity Detector (VAD) which is embedded in 2G, 2.5G and 3G cellular phones. The GSM VAD computes the noise levels during noise-only periods. An adaptive noise-suppressor filter is used to filter the input signal frame. The coefficients of this filter are computed during noise-only periods, which are determined by special measures taken to identify noise-only frames. These include signal stationarity and periodicity measures. Several algorithms have been proposed to improve the performance of the GSM VAD for stationary and non-stationary noise[9, 10, 12]. The threshold value computed by the VAD can be used as the crisp noise level input for the fuzzy controller. The fuzzy set for noise level (NL) is {Very Low, Medium Low, Low, Zero, High, Medium High, Very High}. Fig. 1 shows the variation of the threshold value over the speech frames of a test speech file (the x-axis indicates the speech frame number).

Figure 1: Threshold (THVAD) Variation for a test speech file obtained by Voice Activity Detector (VAD)

2.2 Obtaining Noise Classes

The background noise classes (NC) are obtained by a fuzzy system[13]. This Fuzzy Noise Classifier (FNC) classifies the background noise into 7 types, i.e., stationary (Car, Train, Bus-Dump) and non-stationary (Street, Factory, Construction, Babble). The EFVT makes use of this class information. The fuzzy rules are tuned to improve the QoS based on the FNC output. The volume change to be applied depends on the noise class; e.g., the volume increase may not be the same for car noise and factory noise.

The rate of change of the volume is also an important parameter; too fast a change can itself interfere with intelligibility [15]. Hence a comfortable rate of volume change should be adopted based on mobile phone user surveys.

The block diagram of the Fuzzy Volume Tuner is shown in Fig. 2.



Figure 2: Block diagram of Fuzzy Volume Tuner

The fuzzy rule base contains IF/THEN rules such as:

• If NL is high and VL is low and NC is car Then VLC is LP
• If NL is low and VL is high and NC is train Then VLC is MN
• If NL is very high and VL is very low and NC is babble Then VLC is LP
• If NL is very high and VL is low and NC is factory Then VLC is MP
• If NL is high and VL is zero and NC is bus-dump Then VLC is SP
• If NL is low and VL is high and NC is babble Then VLC is ZE
• If NL is very high and VL is high and NC is car Then VLC is MP

The Zero (ZE) noise level is the noise level at which the user set the volume level to suit his or her comfort. The fuzzy rule base may contain 30 to 40 rules and may be tuned using evolutionary techniques. During a conversation the surrounding noise level may vary, and the EFVT adjusts the volume accordingly. Since the volume control is transparent to the user of the mobile phone, the QoS is improved.
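Rules of this form could be evaluated as in the following minimal sketch. It assumes that AND is taken as the minimum and that rules sharing an output term are combined with the maximum, which is a common choice but is not stated in the paper. The function name `fire_rules`, the data layout and the use of a per-class confidence for NC are likewise assumptions; the four rules are taken from the list above.

```python
# Each entry: (NL term, VL term, NC label, VLC term), taken from the rule list.
RULES = [
    ("high", "low", "car", "LP"),
    ("low", "high", "train", "MN"),
    ("very high", "low", "factory", "MP"),
    ("low", "high", "babble", "ZE"),
]


def fire_rules(nl, vl, nc, rules=RULES):
    """Return the activation of every VLC term that appears in `rules`.

    nl and vl map linguistic terms to membership degrees in [0, 1];
    nc maps noise classes to an assumed classifier confidence in [0, 1].
    """
    activation = {}
    for nl_term, vl_term, nc_term, vlc_term in rules:
        # AND of the three antecedents, taken as the minimum (assumption).
        strength = min(
            nl.get(nl_term, 0.0), vl.get(vl_term, 0.0), nc.get(nc_term, 0.0)
        )
        # Rules with the same consequent are combined with the maximum.
        activation[vlc_term] = max(activation.get(vlc_term, 0.0), strength)
    return activation
```

For example, with NL mostly "high", VL mostly "low" and the classifier reporting car noise, only the first rule fires and LP receives the activation of its weakest antecedent.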

2.3 Personalized EFVT

Hearing impairments vary from person to person. Speech intelligibility and quality of hearing also vary and depend on the background noise. Several earlier techniques have addressed personalization issues[33, 34]. Since the EFVT is based on fuzzy logic, it can be personalized by tuning the fuzzy rule base to the requirements of the individual. Hearing loss is measured with an audiogram. The audiogram of a person shows the amount of hearing loss in each of the frequency bands, as shown in Fig. 3 [24, 29]. A person with hearing loss will perceive different frequencies at different levels[26].



Figure 3: Audiogram

The pitch frequency varies from person to person and thus can be perceived differently by a person with hearing loss. The pitch frequency varies over a wide range (50-500 Hz). It may also vary slightly for an individual. Recently, some very successful techniques have been proposed for pitch extraction[23]. The pitch frequency is extracted from the speech signal using an autocorrelation technique and is input to the EFVT for fuzzification. The fuzzy rule base of the EFVT is tuned based on the audiogram of the person who wants to use the mobile phone.
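A minimal sketch of autocorrelation-based pitch estimation is given below for illustration. It uses the plain (unweighted) autocorrelation of a single frame and simply picks the lag with the largest value inside the 50-500 Hz range; it is not the weighted method of [23], and the function name `estimate_pitch` and its parameters are assumptions made for this example.

```python
import math


def estimate_pitch(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate pitch in Hz as the lag with the largest autocorrelation.

    Returns None if no lag in the search range has positive autocorrelation
    (for example, for a silent frame).
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation of the frame at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None


# Usage: a synthetic 200 Hz tone, 40 ms long, sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(320)]
pitch = estimate_pitch(tone, 8000)
```

Real speech would need voiced/unvoiced detection and some protection against octave errors, which this sketch leaves out.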

Fuzzy rules from the pitch based on an audiogram:

• If pitch is low and hearing loss is moderate Then volume level is high
• If pitch is medium and hearing loss is mild Then volume level is Zero
• If pitch is high and hearing loss is severe Then volume level is very high

The EFVT can be further extended by using the formant frequencies as inputs along with the pitch frequencies.

2.4 Evolutionary Algorithm

The evolutionary algorithm (EA) mimics natural evolution to provide a general-purpose optimization method. Evolutionary algorithms start with a population of chromosomes, which represent candidate solutions. The solutions are evaluated using a fitness function, and a selection process determines which solutions take part in the competition process. New solutions are created using evolutionary principles such as mutation and crossover. These algorithms are highly successful in solving search and optimization problems. Evolutionary algorithms have been applied to the automated design and optimization of fuzzy controllers[22, 37, 38, 39]. The evolutionary algorithm for the fuzzy volume tuner performs three functions:

• Tunes the input-output (IO) scaling factors
• Tunes the membership functions
• Optimizes the fuzzy rule base

Input-output scaling factors have a big impact on the overall performance of the EFVT. Hence the evolutionary algorithm tunes the scaling parameters before fine-tuning the membership functions. For the optimization of the fuzzy rule base, the chromosomes represent the individual rules, a technique called the “Michigan approach”. The block diagram of the EFVT is shown in Fig. 4.
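The first of these functions, tuning the scaling factors, could be sketched with a very simple evolutionary algorithm as below. This is only an illustration under stated assumptions: each chromosome is a list of real-valued scaling factors, selection is by truncation, and the quadratic `example_fitness` with its `TARGET` values is a hypothetical stand-in for a real fitness measure, which in the EFVT would come from the user or trainer. It does not implement the Michigan-style rule encoding mentioned above.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=60, sigma=0.2, seed=0):
    """Minimise `fitness` over real-valued chromosomes with a simple EA."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(0.0, 2.0) for _ in range(n_genes)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # truncation selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [g + rng.gauss(0.0, sigma) for g in child]  # Gaussian mutation
            children.append(child)
        population = parents + children
    return min(population, key=fitness)


# Hypothetical "ideal" input and output scaling factors, for illustration only.
TARGET = (0.5, 1.5)


def example_fitness(chromosome):
    """Squared distance to TARGET; lower is better."""
    return sum((g - t) ** 2 for g, t in zip(chromosome, TARGET))


best_scaling = evolve(example_fitness, n_genes=2)
```

Because the better half of each generation is carried over unchanged, the best fitness never gets worse from one generation to the next; the fixed seed only makes the example repeatable.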

Figure 4: EFVT Block Diagram (blocks: Inputs, Mobile Environment, User/Trainer Input, Fuzzy Volume Tuner, Fuzzy Sets, Evolutionary Algorithm, Fitness, Output)

3. SIMULATION AND ESTIMATION

Several software tools are available for the analysis and simulation of the SCP[7]. We used FuzzyControl++ tool from Siemens [6, 8, 11]. We used these tools to study various features provided by them. Also, we carried out the evaluation of these tools for simulation and implementation of SCP.

3.1 Simulation with FuzzyControl++

FuzzyControl++ [6, 8] is a tool which can be easily used for configuring and simulating a fuzzy system. The inputs, outputs and IF/THEN block are easily configured. However, the membership functions are fixed and no choice is available. The rules can be easily edited. There is also a provision for editing rules by matrix. The ranges for the linguistic terms can be easily defined. It provides an impressive 3D graphics display to view the decision surface. A rule activity window and a simulation window enable simulation studies of the fuzzy system. However, this tool can generate code for target systems.

We carried out simulation of SCP using Siemens FuzzyControl++ tool. The configuration window shows the inputs, outputs and the IF/THEN rules (Fig. 6). The rules are edited using a rule matrix. The rules in matrix form are easier to handle. The 3D graphics display shows the mapping of inputs and outputs. The screen shot of FuzzyControl++ simulation window is shown in Fig. 5. We have used sawtooth waveform as inputs to the Fuzzy system. The output of the fuzzy controller i.e., Volume Level Change is displayed.

Figure 5: FuzzyControl++ Simulation window for SCP



Figure 6: FuzzyControl++ Configuration window for SCP

The results of the FuzzyControl++ simulations of the SCP are shown in Fig. 5. Sawtooth waveforms are used for the inputs VL and NL; in Fig. 5, time-shifted sawtooth waveforms have been used. The VLC output from the EFVT keeps increasing as the NL increases, and as the NL falls, the VLC also decreases. In Fig. 4, 16 rules have been used, which results in a smooth output curve. Fig. 7 shows a control surface of the EFVT (16 rules).

Figure 7: Control Surface for EFVT

3.2 Estimates for MIPS and Memory

Here we estimate the MIPS and memory requirements for a SAM (Standard Additive Model) architecture for the fuzzy controller.



• Input variables = 2 (Volume Level and Noise Level)
• Output = 1 (Volume Level Change)
• Number of fuzzy rules = 25 (approximate)

Computation steps:

• Fuzzification
• Fuzzy Inference (IF/THEN)
• Weighted Summation
• Centroid Defuzzification
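The four computation steps can be illustrated with the small two-input sketch below. It is a toy instance, not the controller simulated in this paper: the two terms per input, the 0 to 10 input ranges, the four rules and the output centroids are all assumed values, rule weights and output-set volumes are taken as 1, and the rule firing strength is the product of the two memberships so that only add, multiply and divide are needed.

```python
def tri(x, left, peak, right):
    """Triangular membership with optional shoulders at the ends."""
    if x <= left or x >= right:
        return 1.0 if x == peak else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed membership functions on a 0..10 scale for both inputs.
NL_SETS = {"low": (0.0, 0.0, 10.0), "high": (0.0, 10.0, 10.0)}
VL_SETS = {"low": (0.0, 0.0, 10.0), "high": (0.0, 10.0, 10.0)}

# Assumed rules: (NL term, VL term) -> centroid of the VLC output set.
RULE_CENTROIDS = {
    ("high", "low"): 3.0,    # large positive change
    ("high", "high"): 1.0,   # small positive change
    ("low", "low"): 0.0,     # no change
    ("low", "high"): -2.0,   # medium negative change
}


def sam_output(noise_level, volume_level):
    """Compute the crisp volume level change for crisp NL and VL inputs."""
    numerator = 0.0
    denominator = 0.0
    for (nl_term, vl_term), centroid in RULE_CENTROIDS.items():
        # Fuzzification of both inputs, then inference for this rule.
        fit = tri(noise_level, *NL_SETS[nl_term]) * tri(volume_level, *VL_SETS[vl_term])
        # Weighted summation of the rule consequents.
        numerator += fit * centroid
        denominator += fit
    # Centroid defuzzification: ratio of the two sums.
    return numerator / denominator if denominator else 0.0
```

In this toy rule base the output rises with the noise level at a fixed volume level, which mirrors the qualitative behaviour reported for the simulation in Section 3.1.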

This basic SAM structure requires only three simple operators: add, multiply and divide. Based on published benchmark results for fuzzy controller implementations[5], we can estimate the MIPS and memory requirements. For the SCP application (15-25 fuzzy rules, 2 inputs, 1 output, 4-6 labels per variable), less than 2 Kbytes may be needed for storing the code, and the execution may not exceed 15,000 cycles. These figures are for a conventional 8-bit microcontroller such as the 68HC11.

The speech codec processes 20 ms frames, each of 160 speech samples. The speech codec outputs are used by the VAD, which updates the NL for each speech frame. The response time of the EFVT is the sum of the delays due to the computation of NL and NC. Let the delays due to the speech codec be dsc, the VAD be dvad, the FNC be dfnc and the EFVT be dfvc. Then the response time of the EFVT to a change in the background noise level is 20 ms + dsc + dvad + dfnc + dfvc. For a typical GSM full-rate speech codec, the MIPS load including the VAD may not exceed 2-3 MIPS[25]. The MIPS load due to the EFVT alone will not exceed 0.75 MIPS (15,000 cycles in 20 ms). Since changing the volume level rapidly will degrade the quality of speech, a volume hangover technique similar to the VAD hangover is to be adopted: the volume level is adjusted only if a change is required over a period of several consecutive speech frames. The NC may not change over a long period of phone conversation. The FNC processing delay does not exceed 20 ms[14]. Hence, for a 100 MIPS DSP processor, the EFVT response time will be less than 80 ms.

For implementation, one may proceed by developing the required C code (less than 1 KLOC) or use fuzzy development tools such as Siemens FuzzyControl++. After simulation studies of the model, such as sensitivity analysis, the code needs to be optimized for the particular processor used in the baseband chip of the mobile handset. The next step will be to test it on a real mobile handset and improve the fuzzy rule set for better performance. The evolutionary algorithm is applied for fuzzy system design and optimization: it tunes the IO scaling factors and membership functions and optimizes the fuzzy rule base.
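The volume hangover idea could be sketched as follows. The class name `VolumeHangover`, the representation of the VLC as an integer step per frame, and the default of 5 frames (100 ms at 20 ms per frame) are assumptions made for this example; the paper only states that a change should persist over several consecutive speech frames.

```python
class VolumeHangover:
    """Apply a volume change only after it persists for `hold_frames` frames."""

    def __init__(self, hold_frames=5):
        self.hold_frames = hold_frames
        self._pending = 0   # change currently being requested
        self._count = 0     # consecutive frames that requested it

    def update(self, requested_change):
        """Feed the per-frame VLC step; return the change to apply now."""
        if requested_change == 0:
            # No change requested: forget any pending request.
            self._pending, self._count = 0, 0
            return 0
        if requested_change == self._pending:
            self._count += 1
        else:
            # A different change restarts the count.
            self._pending, self._count = requested_change, 1
        if self._count >= self.hold_frames:
            self._count = 0
            return requested_change
        return 0
```

A short burst of noise that lasts fewer frames than `hold_frames` therefore leaves the volume untouched, while a sustained change in noise level is followed after a fixed delay.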

3.3 Quality of Speech

For measuring the QoS improvement obtained with the SCP, the Mean Opinion Score (MOS) and the Degradation Mean Opinion Score (DMOS) are the preferred subjective techniques[16, 17]. MOS and DMOS are five-level grading scales that measure how the quality of a speech signal is perceived. The scales used in MOS and DMOS are {(1, Bad, Very annoying), (2, Poor, Annoying), (3, Fair, Slightly annoying), (4, Good, Audible but not annoying), (5, Excellent, Imperceptible)}. For objective speech quality measurement, ITU-T has introduced the Perceptual Speech Quality Measure (PSQM)[18]. Tools such as Opera are available for automated testing of speech quality[19]. Another important way to assess the suitability of the SCP is to perform a field test, i.e., to test it in a real environment. Figures 8 and 9 show example speech signal waveforms in the presence of background noise, obtained using the tools Praat[35] and SFS[36]. The pitch and the power are also plotted. The speech signal is mixed with varying levels of background noise for analysis.

Figure 8: Speech in the presence of Background Noise



Figure 9: Speech with and without Car Background Noise

4. SUMMARY AND CONCLUSIONS

In this paper, we have proposed the use of the EFVT in cellular phones. Such an SCP will have several benefits:

• Improved QoS for stationary and non-stationary noise in mobile environments, as the EFVT uses the information on background noise level and class to adjust the volume level.
• Some classes of noise, such as car noise, fall into the low-frequency noise category. They do not affect the intelligibility of speech as much as noise classes such as factory noise. Hence the EFVT has to depend on noise classes for effective volume adjustment.
• The EFVT is easily embedded in the mobile handset, as it has very low memory and computational requirements. The computations are carried out by the microcontroller within the baseband chip.
• The EFVT can be personalized based on the audiogram for a hearing-impaired person.
• The fuzzy controller is extended to an evolutionary fuzzy system, which learns new rules and improves and optimizes its performance.

Details of the SCP design along with the implementation requirements have been presented. The SCP has been simulated using the FuzzyControl++ simulation tool. The simulation results show that the EFVT successfully adjusts the volume levels based on the background noise level.

Acknowledgement: The initial part of this work was sponsored by Infineon Technologies India Pvt. Ltd.


References:

[1] Zadeh L.A., “Fuzzy Sets”, Information and Control, 8(3), 1965, pp. 338-353.
[2] T. Ross, “Fuzzy Logic with Engineering Applications”, McGraw Hill International, 1997.
[3] B. Kosko, “Fuzzy Engineering”, Prentice Hall, 1997.
[4] D. Driankov, H. Hellendoorn and M. Reinfrank, “An Introduction to Fuzzy Control”, Narosa Publishers, 1993.
[5] Aptronix Fuzzy Logic Benchmarks: http://www.aptronix.com/fide/benchmarks.htm
[6] Siemens AG, FuzzyControl++: http://www.ad.siemens.de/fuzzycontrol/html_76/index.htm
[7] S. Kumar, “Computer Aided Fuzzy System Design and Simulation”, IEEE VLSI Design And Test Workshop (VDAT’02), August 2002.
[8] Siemens AG, “FuzzyControl++ Cookbook – Recipes for easy applications of Fuzzy Logic”, 1999.
[9] K. El-Maleh and P. Kabal, “Comparison of Voice Activity Detection Algorithms for Wireless Personal Communication Systems”, IEEE Canadian Conference on Electrical and Computer Engineering, pp. 470-473, May 1997.
[10] E. Nemer, R. Goubran and S. Mahmoud, “Robust Voice Activity Detection Using Higher-Order Statistics in the LPC Residual Domain”, IEEE Trans on Speech and Audio Processing, Vol. 9, No. 3, March 2001, pp. 217-231.
[11] Siemens AG, ECANSE – User Manual, 1998.
[12] F. Beritelli, S. Casale, A. Cavallaro, “A Robust Voice Activity Detector for Wireless Communications using Soft Computing”, IEEE Journal on Selected Areas in Communications (JSAC), Special Issue on Signal Processing for Wireless Communications, Vol. 16, No. 9, December 1998.
[13] F. Beritelli, S. Casale, P. Usai, “Background Noise classification in Mobile Environments using Fuzzy Logic”, contribution ITU-T (WP 3/12), Meeting on “Noise aspects in evolving networks”, Geneva, April 1997.
[14] F. Beritelli, S. Casale, “Background Noise Classification in Advanced VBR Speech Coding for Wireless Communications”, Proc. 6th IEEE International Workshop on Intelligent Signal Processing And Communication Systems (ISPACS’98), Melbourne, Australia, 4-6 Nov. 1998, pp. 451-455.
[15] M. Marzinzik, “Noise Reduction Schemes for Digital Hearing Aids and their use for the hearing impaired”, Doctoral Dissertation, University of Oldenburg, Dec 2000.
[16] S. Lemmetty, “Review of Speech Synthesis Technology”, Master’s Thesis, Helsinki University of Technology, Finland, March 1999.
[17] ITU-T Rec. P.800 (08/96), Methods for subjective determination of transmission quality.
[18] ITU-T Rec. P.861 (02/98), Objective quality measurement of telephone-band (300-3400 Hz) speech codecs.
[19] M. Keyhl et al., “A Combined Measurement Tool for the Objective, Perceptual based Evaluation of Compressed Speech and Audio Signals”, AES 106th Convention, May 1999, Munich (Germany).
[20] B. Gold and N. Morgan, “Speech and Audio Signal Processing: Processing and Perception of Speech and Music”, John Wiley and Sons Publishers, 2000.
[21] Special issue on industrial innovations using soft computing, Proceedings of the IEEE, Vol. 89, Issue 9, Sept 2001.
[22] F. Hoffman, “Evolutionary Algorithms for Fuzzy Control System Design”, Proc of the IEEE, Sept 2001, pp. 1318-1333.
[23] T. Shimamura and H. Kobayashi, “Weighted Autocorrelation for Pitch Extraction for Noisy Speech”, IEEE Trans on Speech and Audio Processing, Vol. 9, No. 7, Oct 2001, pp. 727-730.
[24] American Academy of Audiology, www.audiology.org
[25] M.H. Weiss, U. Walther and G.P. Fettweis, “A Structural Approach For Designing Performance Enhanced Signal Processors: A 1-MIPS GSM Fullrate Vocoder Case Study”, ICASSP’97.
[26] S. Launer, “Loudness Perception in Listeners with Sensorineural Hearing Impairment”, PhD Thesis, University of Oldenburg, 1995.
[27] Zwicker, E. and Fastl, H., “Psychoacoustics – Facts and Models”, Springer, Berlin, 1990.
[28] ISO 532 (1975), Acoustics -- Method for Calculating Loudness Level.
[29] ISO 8253-1:1989, Acoustics -- Audiometric test methods -- Part 1: Basic pure tone air and bone conduction threshold audiometry.
[30] ISO/TR 3352:1974, Acoustics -- Assessment of noise with respect to its effect on the intelligibility of speech.
[31] B. Kostek and A. Czyzewski, “Employing Fuzzy Logic and Noisy Speech for Automatic Fitting of Hearing Aids”, 142nd ASA Meeting, Fort Lauderdale, FL, Dec 2001.
[32] T. Thiede, “Perceptual Audio Quality Assessment using a Non-Linear Filter Bank”, PhD Thesis, Technical University of Berlin, 1999.
[33] B. Edwards, “Signal Processing, Hearing Aid Design, and the Psychoacoustic Turing Test”, 2002 International Conference on Acoustics, Speech and Signal Processing (ICASSP’2002), May 13-17, 2002, Florida (USA).
[34] B. Edwards, “Application of Psychoacoustics to Audio Signal Processing”, 35th Asilomar Conference on Signals, Systems and Computers, Nov 2001, Pacific Grove (USA).
[35] Praat Web: www.praat.org
[36] Speech Filing System (SFS): http://www.phon.ucl.ac.uk/resource/sfs/
[37] Karr and Gentry, “Fuzzy Control of pH using genetic algorithms”, IEEE Trans. Fuzzy Systems, Vol. 1, No. 1, 1993, pp. 46-53.
[38] Takagi and Hayashi, “NN-driven Fuzzy Reasoning”, Int. J. Approximate Reasoning, Vol. 5, 1991, pp. 191-212.
[39] U. Bodenhofer and F. Herrera, “Ten Lectures on Genetic Fuzzy Systems”, Technical Report SCCH-TR-0021.
[40] S. Kumar, “Smart Acoustic Volume Controller for Mobile Phones”, AES 112th Convention, Munich, Germany, May 2002.

