29
Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June- 04 © Nokia 2004 1 Radiation Directivity of Human and Artificial Speech Teemu Halkosaari 1 & Markus Vaalgamaa 2 1 Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing 2 Nokia Technology, Audio Quality e-mail: [email protected] Workshop of Wideband Speech Quality in Terminals and Networks: Assessment and Prediction, 8 th - 9 th June 2004, Mainz, Germany

Radiation Directivity of Human and Artificial Speech

  • Upload
    yagil

  • View
    24

  • Download
    0

Embed Size (px)

DESCRIPTION

Radiation Directivity of Human and Artificial Speech. Teemu Halkosaari 1 & Markus Vaalgamaa 2 1 Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing 2 Nokia Technology, Audio Quality e-mail: [email protected] - PowerPoint PPT Presentation

Citation preview

Page 1: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20041

Radiation Directivity of

Human and Artificial Speech

Teemu Halkosaari 1 & Markus Vaalgamaa 2

1 Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing

2 Nokia Technology, Audio Quality

e-mail: [email protected]

Workshop of Wideband Speech Quality in Terminals and Networks: Assessment and Prediction, 8th-9th June

2004, Mainz, Germany

Page 2: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20042

Agenda

1. Motivation and Background

2. Goal

3. Measurements

4. Analyzes

5. Modelling

6. Results

7. Improvements to telephonometry

8. Conclusions

Part I: WHY?

Part II: The STUDY

Part III: The OUTCOME

Page 3: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20043

Part I

WHY?

1. Motivation and Background

2. Goal

Page 4: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20044

1. Motivation and Background

• Artificial mouths and HATS were originally designed for narrowband speech measurements with big landline handsets.

• In past years, the size of phone has become smaller and use of headsets is becoming common.

• In future, wideband phones are coming…

• It seems that HATS and artificial mouths will be used in future, although

• we do not know exactly the directivities of artificial mouths or• human mouth especially on near field locations

• Additional interesting topic is that how does the speech content affect to the directivity?

• Extensive literature on the subject is not available• Standardization is inadequate

• ITU-T P.58 Head and torso simulator for telephonometry• ITU-T P.51 Arficial mouth• narrow band: 300-3400Hz, wide band: 150-7000Hz

Part I: WHY?

Page 5: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20045

2. Goal

• Measure frequency responses for artificial mouths to positions where microphones of handsets and headsets are situated.

• Repeat the same measurements for a group of test subjects

• Most important part of the study is to

analyze the difference between artificial mouths and an average person.

• In practice a set of transfer functions is acquired.

Part I: WHY?

Page 6: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20046

Part II

The STUDY

3. Measurements

4. Analyzes

5. Modelling

6. Results

Page 7: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20047

• Two separate systems• Test subjects were measured in Helsinki University of Technology,

anechoic chamber, 8 channel parallel recording system.• Artificial mouths were measured in Nokia facilities in Salo, Finland,

anechoic chamber, Audio Precision APwin measurement system, 2 channel parallel MLS measurement.

• Probes• B&K 1/8 and ¼ inch measurement microphones• 8 Sennheiser KE-4-211-2 electret microphones• In addition, some phones: large and small handset, and accessories:

boom headset, mono and stereo headsets, were measured

3. Measurements: ArrangementsPart II: The STUDY

Page 8: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20048

3. Measurements: Positions

• Referred to• lip plane and its perpendicular• on chest: axes from throat downwards and sidewards

• Two reference positions for transfer functions• MRP = 25 mm in front of mouth for artificial mouths• 0.5 m in front of mouth for test subjects.

• Measurement positions:

Part II: The STUDY

Near cheek In front of mouth On chest

Page 9: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 20049

3. Measurements: Setup for artificial mouths

Part II: The STUDY

• Two B&K measurement microphones.

• Two impulse responses parallel (reference and point of interest)

• Apwin measurement system, MLS

Page 10: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200410

3. Measurements: Positioners

• Helmet and chest positioner were developed to keep microphones in right positions.

Part II: The STUDY

Page 11: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200411

3. Measurements: Setup for test subjects (1)

IOtechWaveBook/516

LapTopcomputer

pre-amp E.A.A.2 channels

pre-amp E.A.A.2 channels

pre-amp E.A.A.2 channels

E.A.A. pre-amp..

8 channels

Biasing unit

. .

8 ch

anne

ls

Mixer

DisplayVHSRecorder

Camera

Loudspeaker

.

.

8 channels

Test subject or HATS

A/V-monitoring andTest Subject guidance

Data acquisition system

Microphones

Monitoring roomAnechoic chamber

LPT

Part II: The STUDY

• Parallel 8 channel audio recording at 32kHz sample rate

Page 12: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200412

3. Measurements : Setup for test subjects (2)

Part II: The STUDY

• Small anechoic chamber at Helsinki University of Technology and monitoring system

Page 13: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200413

3. Measurements: Test subjects

• Test subjects: • 8 male, 5 female, mainly young students, Finnish as mother tongue

• Speech material:• ”Kaksi vuotta sitten kävimme ravintola Gabrielissa Helsingissä ja

söimme siellä padallisen fasaania banaanilla höystettynä” (includes all Finnish phonemes)

• Translation: “Two years ago we went to restaurant Gabriel and we ate there a pot of pheasant larded with banana “

• Separate phonemes \n m s r\ and \a e i o u y ä ö\• Different speech volumes: loud, normal, and silent

Part II: The STUDY

Page 14: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200414

4. Analyzes

• Starting point:• parallel 8-channel 32kHz 16bit raw audio data.• Impulse responses in measurements for artificial mouths -> directly

converted to transfer functions.

• Phonemes and active speech were segmented manually (CoolEdit Pro, around 10000 manual marks by the way).

• All analyzes were made in Matlab environment:• SNR and Coherences to ensure reliability• Power spectra estimates• Transfer function estimates• Confidential intervals consideration for transfer functions

• FFT 1024 samples that corresponds to 32 ms, Hanning-window, 50% overlap

• Averaging and Smoothed to 1/3-octave bands• Speech was averaged weighting with coherence

Part II: The STUDY

Page 15: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200415

5. Modelling

• In addition to measurements two models were built based on acoustic field theory

• Results were validated by modelling: Sphere and piston source• Head = sphere• Mouth = radially vibrating round surface of the sphere (piston).

• An infinite baffle can be added to simulate the upper body.• Mirror source method

• Numerical implementation in Matlab

0R 0

u 0

Part II: The STUDY

Page 16: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200416

6. Results: HATS Nearfield

• Measured and modelled results for B&K 4128 HATS• Transfer functions from Mouth Reference Position (MRP) to other positions

Part II: The STUDY

150 1000 7000-25

-20

-15

-10

-5

0

Hz

dB

150 1000 7000-30

-25

-20

-15

-10

-5

Hz

dB

Transfer functions from MRP to near cheek positions:• Large handset (blue circles)• Small handset (red squares)• Boom headset (green triangles)• Corresponding models (black dashed)

Transfer functions from MRP to chest positions:• Near throat with vest (blue circles, solid)• Near throat without (blue circles, dashed)• Middle of chest with vest (red squares, solid)• Middle of chest without vest (red squares, dashed)• Corresponding models (black dashed)

Page 17: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200417

6. Results: HATS Far field (1)

• Measured results with B&K 4128 HATS

• Transfer function from MRP to positions in the front of mouth:• 10 cm (blue circles)• 25 cm (red squares)• 50 cm (green triangles)• 100 cm (violet quadrangles)• with (solid) and without vest (dotted)

Part II: The STUDY

150 1000 7000

-25

-20

-15

-10

-5

0

Hz

dB

Page 18: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200418

6. Results: HATS Far field (2)

• Measured and modelled for B&K 4128 HATS

• Transfer function for positions in front of mouth on 50 cm distance.• Vest on (blue circles)• Vest off (red squares)• Bare head, torso unattached (green triangles)• Model with the baffle (black dotted, thin)• Model without the baffle (black dasked, thick)

Part II: The STUDY

150 1000 7000-28

-26

-24

-22

-20

-18

-16

Hz

dB

Page 19: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200419

6. Results: Human – B&K HATS

• Comparison between transfer functions, referred to 0.5m in front of mouth position. Positions, colors and symbols are same as in the slide 16.

• TFHuman / TFHATS• 1/3 mouth cross-section area ratio in models

• 0 dB values indicate that there is no difference between human and HATS

• Positive values indicate that HATS is more directive, i.e. has less SPL on measured microphone positions on next to cheek or on chest.

Next to cheek On chest

Part II: The STUDY

150 1000 7000

-2

0

2

4

6

8

10

12

Hz

dB

150 1000 7000

-2

0

2

4

6

8

10

12

Hz

dB

Page 20: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200420

6. Results: B&K 4227 and Head Acoustics HMS II.3 HATS

• Comparison between transfer functions, referred to 0.5m position• TFHuman / TFHATS

• Positive values indicate that HATS is more directive, i.e. has less SPL on measured microphone positions on side.

• Transfer functions from MRP to near cheek positions: Large handset (blue circles), Small handset (red squares), and Boom headset (green triangles)

Next to cheek for B&K 4227 Next to cheek for HA HATS

Part II: The STUDY

150 1000 7000-2

0

2

4

6

8

10

12

Hz

dB

150 1000 7000-2

0

2

4

6

8

10

12

Hz

dB

Page 21: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200421

6. Results: Speech content

• Speech content changes the directivity pattern• Speech style: loud – silent etc..• Articulation: continuous phonation – natural speech• Phonemes (Smoothed Human/HATS difference for mic. pos. 1.3 below

with 95% confidence intervals)

• Mouth aperture size = steepness of the response on high frequencies

• Statistical approach: ANOVA model did not apply = speech data scattered

Part II: The STUDY

150 1000 7000-2

0

2

4

6

8

Hz

dB

TFs human vs. HATS by vowel groups

Page 22: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200422

6. Results: Reliability

• Beforehand by careful design of the measurement processes

• Reliability of human measurements:• SNR and Coherence (estimates)

• SNR on wide-band: >25dB• Coherence on narrow-band: >0.8, on wide-band: >0.65

• Normal distribution hypothesis• 95% confidential intervals on wide-band: <±0.5dB

• Reliability of HATS measurements is better than for humans i.e. very good (and thus omitted here)

Part II: The STUDY

Page 23: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200423

Part III

The OUTCOME

7. Improvements to telephonometry

8. Conclusions

Page 24: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200424

150 1000 70000

2

4

6

8

10

12

Hz

dB

150 1000 70000

2

4

6

8

10

Hz

dB

7. Improvement proposal: Equalization (1)

• Artificial mouths (especially B&K 4128) does not correspond to average test subject

• Difference curves are ”well-behaving” for B&K HATS -> A low order filter could equalize the difference near cheek.

• Example: Yulewalk recursive IIR filter design (least-squares method) was applied in Matlab for smoothed and averaged differences (B&K HATS).

1.4

1.3

1.1

Average of curves on right (blue)IIR N = 3 (red)IIR N = 7 (green)

Part III: The OUTCOME

Difference and 95% confidence limits between human and B&K HATS near cheek positions: Large handset (blue circles), Small handset (red squares), and Boom headset (green triangles)

Page 25: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200425

7. Improvement proposal: Equalization (2)

• Smoothed difference curves and 95% confidence limits between human - B&K 4227 (left) and Human - Head Acoustics HMS II.3 (right) on near cheek positions

• Large handset (blue circles)• Small handset (red squares)• Boom headset (green triangles)

Part III: The OUTCOME

150 1000 70000

2

4

6

8

10

12

Hz

dB

150 1000 70000

2

4

6

8

10

12

Hz

dB

Human - B&K 4227 Human - HA HATS

Page 26: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200426

7. Improvement proposal: Mouth size

• B&K 4128 HATS

• Positions 1.3 (left) and 1.1 (right) where measured with 3 different mouth sizes

• small, 15 mm x 11 mm, adaptor partly blocked (blue circles)• normal, 30 mm x 11 mm, adaptor (red squares)• big, 42 mm x 16 mm, mouth adaptor removed (green triangles)

Part III: The OUTCOME

150 1000 70000

5

10

15

20

Hz

dB

150 1000 70000

5

10

15

20

Hz

dB

Position 1.3 Human - B&K HATS Position 1.1 Human – B&K HATS

Page 27: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200427

7. Improvement proposal: Vest

• About 2cm thick vest should be used in B&K HATS measurements. How does it change the directivity in position on the chest?

• Let’s take off the vest and compare to average test subject (dashed lines)

• The near-chest reflection shifts to higher frequency

• Difference between human and HATS measurements on chest positions:• Near throat with (blue circles, solid) and without vest (blue circles, dashed)• Middle of chest with (red squares, solid) and without vest (red squares,

dashed)

Part III: The OUTCOME

vest on (solid)

vest off (dashed)

150 1000 7000-2

0

2

4

6

8

10

12

14

Hz

dB

4.2

4.1

Page 28: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200428

8. Conclusions

• Key results:• On high frequencies B&K HATS is too directional.• Near cheek: the difference (Human/Artificial mouth) is slighly

increasing by moving away from the lip plane. • On averate the difference between human and artificial mouth is

between 1 and 6 dB.• Near chest: the differences in chest reflections are significant.• Speech content affects the directivity.

• Reasons:• Mouth cross-section area during speech is smaller than in artificial

mouths. Speech content and mouth cross-section area are linked.• The torso and head design of the artificial mouths do not correspond

to the acoustical characteristics of a real body.

• What could be improved?• Near cheek: simple equalization was proposed.• Improvement of the structure of HATS (mouth size, torso, etc.). This

path requires more measurements and analysis.• Vest usage should always be considered carefully.

Part III: The OUTCOME

Page 29: Radiation Directivity of  Human and Artificial Speech

Teemu Halkosaari & Markus Vaalgamaa: Radiation Directivity of Human and Artificial Speech 08-June-04© Nokia 200429

Agenda

1. Motivation and Background

2. Goals

3. Measurements

4. Analyzes

5. Modelling

6. Results

7. Improvements to telephonometry

8. Conclusions

Part I: WHY?

Part II: The STUDY

Part III: The OUTCOME