Speech acoustics Objectives: Describe relative frequency and intensity of phonemes by voice, manner, and formant frequency. Describe various phonemic cues. Describe speech constraints.

Page 1: Speech acoustics

Speech acoustics

Objectives:
  Describe relative frequency and intensity of phonemes by voice, manner, and formant frequency.
  Describe various phonemic cues.
  Describe speech constraints.

Page 2: Speech acoustics

Average speech intensity

~65 dB SPL (~45 dB HL)
30 dB range
Any vowel has more power than any consonant
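To make the two level scales concrete, the short Python sketch below converts a level in dB SPL to sound pressure and applies the 20 dB offset implied by the slide's own figures (65 dB SPL is about 45 dB HL). The function names and the offset constant are assumptions made for illustration; the exact SPL-to-HL reference for speech depends on the transducer and the calibration standard in use.

```python
P_REF_PA = 20e-6           # reference pressure for dB SPL: 20 micropascals
SPEECH_HL_OFFSET_DB = 20   # assumption: offset implied by 65 dB SPL ~ 45 dB HL


def spl_to_pressure(db_spl: float) -> float:
    """Convert a level in dB SPL to RMS sound pressure in pascals."""
    return P_REF_PA * 10 ** (db_spl / 20)


def spl_to_hl(db_spl: float) -> float:
    """Approximate speech level in dB HL using the assumed fixed offset."""
    return db_spl - SPEECH_HL_OFFSET_DB


def pressure_ratio(range_db: float) -> float:
    """Ratio of sound pressures spanned by a level range in dB."""
    return 10 ** (range_db / 20)


if __name__ == "__main__":
    print(f"65 dB SPL = {spl_to_pressure(65):.4f} Pa")
    print(f"65 dB SPL ~ {spl_to_hl(65):.0f} dB HL")
    print(f"A 30 dB range spans a pressure ratio of about {pressure_ratio(30):.1f}")
```

Under these assumptions, the 30 dB range of speech corresponds to roughly a 32-fold difference in sound pressure between its strongest and weakest components.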

Page 3: Speech acoustics

Average speech frequency

~50 – 10,000 Hz
Most energy below 1000 Hz

Fundamental frequency
  Men: 100 Hz
  Women: 200 Hz
  Children: 300 Hz
  Crying babies: 500 Hz

Cues for talker identity

Page 4: Speech acoustics

Average speech duration

Vowels: 130 – 360 msec
Consonants: 20 – 150 msec
Rate: ~5 syllables/second; ~12 phonemes/second
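The rate figures can be cross-checked against the duration ranges with a little arithmetic. The sketch below is only an illustration of that check; the constant and function names are made up for this example.

```python
SYLLABLES_PER_SECOND = 5   # approximate rate from the slide
PHONEMES_PER_SECOND = 12   # approximate rate from the slide


def mean_duration_ms(units_per_second: float) -> float:
    """Average time per unit, in milliseconds, at a given rate."""
    return 1000 / units_per_second


syllable_ms = mean_duration_ms(SYLLABLES_PER_SECOND)
phoneme_ms = mean_duration_ms(PHONEMES_PER_SECOND)
phonemes_per_syllable = PHONEMES_PER_SECOND / SYLLABLES_PER_SECOND

if __name__ == "__main__":
    print(f"Mean syllable duration: {syllable_ms:.0f} ms")
    print(f"Mean phoneme duration: {phoneme_ms:.0f} ms")
    print(f"Phonemes per syllable: {phonemes_per_syllable:.1f}")
```

A mean of roughly 83 msec per phoneme sits between the short consonant durations and the longer vowel durations listed above, so the rate and duration figures are consistent with each other.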

Page 5: Speech acoustics

Vowel formants

High F1, Low F2
High F1, High F2
Low F1, Low F2
Low F1, High F2
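The four quadrant labels can be read as a simple two-way classification of a vowel by its first two formants. The sketch below illustrates the idea; the 500 Hz and 1500 Hz boundaries are assumptions chosen only for this example, and the formant values are approximate textbook figures for an adult male talker, not values taken from the slides.

```python
F1_BOUNDARY_HZ = 500    # assumption: dividing line between low and high F1
F2_BOUNDARY_HZ = 1500   # assumption: dividing line between low and high F2


def formant_quadrant(f1_hz: float, f2_hz: float) -> str:
    """Label a vowel by the quadrant its first two formants fall into."""
    f1_label = "High F1" if f1_hz >= F1_BOUNDARY_HZ else "Low F1"
    f2_label = "High F2" if f2_hz >= F2_BOUNDARY_HZ else "Low F2"
    return f"{f1_label}, {f2_label}"


# Approximate corner-vowel formants (illustrative values only).
CORNER_VOWELS = {
    "i (heed)": (270, 2290),
    "ae (had)": (660, 1720),
    "a (hod)": (730, 1090),
    "u (who'd)": (300, 870),
}

if __name__ == "__main__":
    for vowel, (f1, f2) in CORNER_VOWELS.items():
        print(f"{vowel}: {formant_quadrant(f1, f2)}")
```

In articulatory terms, a high F1 goes with an open (low) vowel and a high F2 with a front vowel, which is why the four corner vowels land in four different quadrants.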

Page 6: Speech acoustics

Vowel formants

Page 7: Speech acoustics

Consonants: place, manner, voicing


Page 8: Speech acoustics

Consonants: energy bands

Consonant   Frequency bands (Hz), low to high     Intensity
r           600-800, 1000-1500, 1800-2400         46
l           250-400, 2000-3000                    43
sh          1500-2000, 4500-5500                  41
ng          250-400, 1000-1500, 2000-3000         41
ch          1500-2000, 4000-5000                  38
n           250-350, 1000-1500, 2000-3000         37
m           250-350, 1000-1500, 2500-3500         35
th (ð)      250-350, 4500-6000                    34
t           2500-3500                             34
h           1500-2000                             32
k           2000-2500                             34
j           200-300, 2000-3000                    36
f           4000-5000                             34
g           200-300, 1500-2500                    33
s           5000-6000                             32
z           200-300, 4000-5000                    31
v           300-400, 3500-4500                    31
p           1500-2000                             30
d           300-400, 2500-3000                    29
b           300-400, 2000-2500                    29
th (θ)      ~6000                                 28
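One way to explore the table is to ask which consonants keep any energy below a given frequency. The sketch below encodes a subset of the rows and assumes, as a deliberate oversimplification, a listener who hears nothing above a cutoff frequency. The data structure, function names, and cutoff model are illustrative assumptions; a real audiogram is graded, not all-or-nothing.

```python
# Subset of table rows: name -> (list of (low_hz, high_hz) bands, intensity).
CONSONANTS = {
    "sh": ([(1500, 2000), (4500, 5500)], 41),
    "m": ([(250, 350), (1000, 1500), (2500, 3500)], 35),
    "t": ([(2500, 3500)], 34),
    "f": ([(4000, 5000)], 34),
    "s": ([(5000, 6000)], 32),
    "z": ([(200, 300), (4000, 5000)], 31),
    "b": ([(300, 400), (2000, 2500)], 29),
    "th (voiceless)": ([(6000, 6000)], 28),  # table gives a single value, ~6000
}


def bands_below(consonant: str, cutoff_hz: float) -> list:
    """Bands of a consonant that lie entirely at or below the cutoff."""
    bands, _intensity = CONSONANTS[consonant]
    return [band for band in bands if band[1] <= cutoff_hz]


def fully_above(cutoff_hz: float) -> list:
    """Consonants whose every band starts at or above the cutoff."""
    return sorted(
        name
        for name, (bands, _intensity) in CONSONANTS.items()
        if all(low >= cutoff_hz for low, _high in bands)
    )


if __name__ == "__main__":
    print("No energy below 3000 Hz:", fully_above(3000))
    print("z bands at or below 3000 Hz:", bands_below("z", 3000))
```

With a 3000 Hz cutoff, the voiceless fricatives in this subset lose all of their listed energy, while a voiced fricative such as z keeps only its low-frequency voicing band.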

Page 9: Speech acoustics

Phonemic cues - Stops

Closure
  Voiceless stops: silent period
  Voiced stops: low-level energy

Burst
  Wide-band energy, ~40 msec
  Greater intensity for voiceless stops
  Frequency depends on place

Formant transition
  First formant always rising
  Second formant transition depends on place

Page 10: Speech acoustics

Phonemic cues - Stops

Voice easier to detect than place

For voiced stops
  Voice-onset time is earlier
  Energy present at fundamental frequency
  Burst energy is lower in amplitude
  Vowels are longer in duration before voiced final stops ("eyes" v. "ice")

Page 11: Speech acoustics

Phonemic cues - Nasals

Always voiced
Continuant
Nasal resonance
  highest for /m/
  lowest for /n/
Second formant (frequency and transition) gives place information

Page 12: Speech acoustics

Phonemic cues - Fricatives

Hissing quality

Voiced fricatives
  Periodic
  Lower frequency
  Lower amplitude
  Greater overall energy (from fundamental)

Sibilants (s, z, sh, zh)
  Higher amplitude than other fricatives

Page 13: Speech acoustics

[Figure panels: /f/, /θ/, /s/, /ʃ/]

Page 14: Speech acoustics

Suprasegmental cues

Stress
  changes in fundamental frequency, intensity, duration

Intonation
  changes in fundamental frequency, pitch pattern
  expresses attitudes, feeling, meaning (command, request, statement)

Duration
  variations in speech sounds due to context of other sounds

Page 15: Speech acoustics

Speech constraints

Syntactic
  S = NP (Aux) VP
  NP = (Det) (AP) N (PP)    "the naughty boy in the daycare…"
  VP = V (NP) (PP) (Adv)    "…took the toy away brusquely"
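The rewrite rules above can be sketched as a small checker over part-of-speech tags, which shows how syntax narrows what a listener can expect next. The slide does not expand AP or PP, so this sketch assumes AP is a single adjective and PP is a preposition followed by a noun phrase without a nested PP. The tag names and the function `is_sentence` are hypothetical, chosen only for this illustration.

```python
import re

# Each tag is followed by one space so the rules can be concatenated directly.
NP_SIMPLE = r"(?:Det )?(?:Adj )?N "          # (Det) (AP) N, with AP assumed = Adj
PP = rf"(?:P {NP_SIMPLE})"                   # assumption: PP = P NP, no nesting
NP = rf"{NP_SIMPLE}{PP}?"                    # NP = (Det) (AP) N (PP)
VP = rf"V (?:{NP})?{PP}?(?:Adv )?"           # VP = V (NP) (PP) (Adv)
SENTENCE = re.compile(rf"{NP}(?:Aux )?{VP}")  # S = NP (Aux) VP


def is_sentence(tags: list) -> bool:
    """Return True if the tag sequence can be derived from S."""
    return SENTENCE.fullmatch(" ".join(tags) + " ") is not None


if __name__ == "__main__":
    # "the naughty boy in the daycare took the toy brusquely"
    # ("away" is left out so the VP carries a single Adv, as the rule allows)
    example = ["Det", "Adj", "N", "P", "Det", "N", "V", "Det", "N", "Adv"]
    print(is_sentence(example))
    print(is_sentence(["V", "Det", "N"]))  # no subject NP
```

A regular expression is enough here only because recursion was ruled out by assumption; a grammar with nested phrases would need a proper parser.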

Page 17: Speech acoustics

Speech constraints

Syntactic
  The question "What should you eat?"
    Answer is a noun phrase
  The question "How should you eat?"
    Answer is an adverbial phrase

Page 18: Speech acoustics

Speech constraints

Semantic
  Words in a sentence are related meaningfully
  "Plug the mouse into the computer"

Situational
  Conversation usually refers to the context of the environment
  "I like that oat!"
  Mall vs. Farm

Page 19: Speech acoustics

Overlapping cues help protect the signal from noise

Speech predictability helps protect the signal from noise

Noise can come from
  the speaker (poor intelligibility, etc.)
  the environment (distractions, etc.)
  the listener (ESL, etc.)

Page 20: Speech acoustics

Effects of hearing loss on speech perception

Objectives:
  Describe speech characteristics that are lost and that are preserved for hearing losses of various degree, type and configuration.

Page 21: Speech acoustics

Auditory Response Area

[Figure: frequency axis 20 to 20,000 Hz; level axis 0 to 160 dB]

Page 24: Speech acoustics

Speech audiogram

Page 25: Speech acoustics

Speech audiogram

Page 26: Speech acoustics

Speech audiogram

Page 30: Speech acoustics

Speech audiogram

Page 31: Speech acoustics

Speech audiogram

Page 32: Speech acoustics
Page 33: Speech acoustics
Page 34: Speech acoustics
Page 35: Speech acoustics

34 dots
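The dot count reads most naturally as a count-the-dots estimate of audibility, in which each dot marks a piece of the speech signal on the audiogram and the audible share approximates the SII. That reading is an assumption, as is everything in the sketch below: the dot positions and thresholds are made-up toy values, not the real dot chart or any patient's audiogram, and the function names are hypothetical.

```python
def audible_dot_count(dots: list, thresholds_db_hl: dict) -> int:
    """Count dots whose level is at or above the threshold at their frequency.

    dots: list of (frequency_hz, level_db_hl) pairs.
    thresholds_db_hl: hearing threshold in dB HL for each frequency.
    """
    return sum(
        1 for freq, level in dots if level >= thresholds_db_hl[freq]
    )


def audible_fraction(dots: list, thresholds_db_hl: dict) -> float:
    """Fraction of dots that are audible (0.0 to 1.0)."""
    return audible_dot_count(dots, thresholds_db_hl) / len(dots)


if __name__ == "__main__":
    # Toy data for illustration only (hypothetical values).
    toy_dots = [(500, 30), (1000, 40), (2000, 25), (2000, 45), (4000, 35)]
    toy_thresholds = {500: 20, 1000: 30, 2000: 40, 4000: 60}
    count = audible_dot_count(toy_dots, toy_thresholds)
    print(f"{count} of {len(toy_dots)} dots audible")
    print(f"Audible fraction: {audible_fraction(toy_dots, toy_thresholds):.2f}")
```

On a chart with 100 dots, the number of audible dots would read directly as a percentage, so 34 audible dots would correspond to an index of about 0.34 under this assumption.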

Page 36: Speech acoustics

Correlating SII to speech

Adult values (children would be worse)

Digits easy

Words hard

Page 37: Speech acoustics

Page 38: Speech acoustics

Correlating SII to speech

Page 39: Speech acoustics
Page 40: Speech acoustics
Page 41: Speech acoustics
Page 42: Speech acoustics

Deafness

No access to average speech

Page 43: Speech acoustics

Severe

Access to only loudest components of speech

Speech production
  High airflow rate
  Speech initiation at low lung volumes
  Poor velar control (nasality)
  High fundamental frequency
  Slow speech rate

Page 44: Speech acoustics

Moderate

Access to louder half of speech, or to loud speech

Speech production
  Substitutions and distortions
  Errors in affricates, fricatives and blends

Page 45: Speech acoustics

Slight to Mild

Access to all but the quietest components of speech

Speech production
  Fewer distortions/substitutions
  Good intelligibility

Page 46: Speech acoustics

Rising v. Sloping loss

Page 47: Speech acoustics

Rising v. Sloping loss

SII = 64 SII = 45