Phonetics

The Creation of Speech Sounds

Anatomy and phonetics

• Speech sounds are created by air pushed out of the lungs; air vibrates as it passes through the vocal tract

• Different positions of the vocal folds, the tongue, the lips, and other articulators in the mouth modify the airflow, producing different speech sounds

Periodic vibrations

• When a tuning fork vibrates, it creates successive waves of compression and rarefaction among air molecules.

Periodic vibrations

• We can plot these changes in density as a sine wave, where the horizontal axis represents time and the vertical axis represents the density of air molecules.

Sine Wave (figure; axes: amplitude vs. time)

Phase

• Two waves can have the same frequency but different phase. Combining two waves can increase the amplitude if they are in phase, or decrease it if they are out of phase.
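The effect of phase on the combined amplitude can be checked numerically. The sketch below is only an illustration: the 100 Hz frequency, the unit amplitudes, and the helper names `sine` and `peak_of_sum` are assumptions chosen for the example, not part of the slides.

```python
import math

def sine(freq_hz, amplitude, phase_rad, t):
    """Value at time t (in seconds) of a sine wave."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t + phase_rad)

def peak_of_sum(phase_rad, freq_hz=100.0, steps=1000):
    """Peak absolute value, over one period, of two unit-amplitude waves of
    the same frequency added together; the second is shifted by phase_rad."""
    period = 1.0 / freq_hz
    times = [i * period / steps for i in range(steps)]
    return max(
        abs(sine(freq_hz, 1.0, 0.0, t) + sine(freq_hz, 1.0, phase_rad, t))
        for t in times
    )

in_phase = peak_of_sum(0.0)          # waves line up: amplitudes add
out_of_phase = peak_of_sum(math.pi)  # half a cycle apart: waves cancel
print("in phase:", in_phase)
print("out of phase:", out_of_phase)
```

With a phase difference of zero the peak of the sum is about twice the single-wave amplitude; with a difference of half a cycle (π radians) the two waves cancel almost exactly.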

Amplitude

• Two waves can have the same phase, but different amplitudes.

Complex waves

• The combination of sine waves can result in a complex wave with virtually any shape.

Complex waves

• Adding higher harmonics gives this wave a sawtooth shape.
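As a rough illustration of how adding harmonics builds a sawtooth, the sketch below sums sine waves at whole-number multiples of a fundamental, each with amplitude 1/k. The 1/k weighting is the standard Fourier series for a sawtooth; the 100 Hz fundamental and the function names are assumptions made for the example.

```python
import math

def harmonic_sum(t, f0_hz, n_harmonics):
    """Sum of the first n_harmonics sine harmonics of f0_hz, weighted by 1/k."""
    return sum(
        math.sin(2 * math.pi * k * f0_hz * t) / k
        for k in range(1, n_harmonics + 1)
    )

def ideal_sawtooth(t, f0_hz):
    """Limit of the sum strictly inside a period: a ramp falling
    from +pi/2 to -pi/2 over each cycle."""
    cycle_pos = (t * f0_hz) % 1.0
    return (math.pi - 2 * math.pi * cycle_pos) / 2

f0 = 100.0
t = 0.25 / f0  # a quarter of the way through one cycle
for n in (1, 5, 50, 200):
    error = abs(harmonic_sum(t, f0, n) - ideal_sawtooth(t, f0))
    print(n, "harmonics, error vs. ideal sawtooth:", round(error, 4))
```

With a single harmonic the wave is a plain sine; as more harmonics are added the sum approaches the straight ramp of the sawtooth.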

Fourier analysis

Power spectrum of sine wave (figure; axes: amplitude vs. frequency)

Power spectrum

• This graph shows the fundamental frequency, and the harmonic frequencies (whole number multiples of F0) associated with that fundamental.
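A spectrum like the one described can be computed with a discrete Fourier transform. The following is a minimal sketch using a deliberately naive DFT in plain Python. The test signal (a 100 Hz fundamental with two harmonics at amplitudes 1.0, 0.5, and 0.25), the 800 Hz sample rate, and the function names are all assumptions for illustration.

```python
import cmath
import math

def amplitude_spectrum(samples, sample_rate_hz):
    """Naive DFT: return (frequency_hz, amplitude) pairs for the bins
    between 0 Hz and half the sample rate (both excluded)."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        total = sum(
            samples[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)
        )
        spectrum.append((k * sample_rate_hz / n, 2 * abs(total) / n))
    return spectrum

# Hypothetical complex wave: F0 = 100 Hz plus harmonics at 200 Hz and 300 Hz.
sample_rate = 800.0
components = [(100.0, 1.0), (200.0, 0.5), (300.0, 0.25)]  # (Hz, amplitude)
signal = [
    sum(a * math.sin(2 * math.pi * f * (i / sample_rate)) for f, a in components)
    for i in range(80)  # 0.1 s of signal, i.e. exactly 10 cycles of F0
]

peaks = [(f, a) for f, a in amplitude_spectrum(signal, sample_rate) if a > 0.01]
for freq, amp in peaks:
    print(freq, "Hz:", round(amp, 3))
```

The only bins with noticeable energy are the fundamental and its whole-number multiples, which is the pattern the power spectrum graph displays.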

Aperiodic vibration (noise)

• Noise is characterized by random movement of air molecules. There is no pattern to the vibrations, hence there is energy at all frequencies of sound.

Speech waveform

Source-filter model

Sagittal section of the vocal tract (Techmer 1880).

Labeled structures: lungs, trachea, vocal folds (within the larynx), pharynx, nasal cavity.

text ©J.J. Ohala, September 2001

Source-filter model

• The lungs provide the power.

• The larynx is a valve.

• Air forced through the larynx causes the vocal folds to vibrate (Bernoulli’s principle).

• The vibrations resonate in the upper cavities.

Anatomy: the larynx

• The larynx is a highly complex structure that houses the vocal folds. It evolved out of the cartilaginous rings that structure the trachea.

Anatomy: the larynx

• The larynx has two levels of protection for preventing objects from passing into the trachea.
o Epiglottis
o Vocal folds

Anatomy: the larynx

• A complex set of muscles controls the movement of the vocal folds (which are themselves muscles).

Anatomy: the vocal folds

• Vibration of the vocal folds.
o Bernoulli’s Principle.

Vocal fold sequence

Glottal waveform

The filter (vocal tract)

• The vocal tract can cause an increase in amplitude of some harmonics.

Orangutan vocal tract

• Compare the vocal tract of the orangutan: there is not much volume in which resonance can occur.

Rhesus monkey v-t

• The tongue takes up most of the space in the v-t of non-human primate species.

Homo sapiens v-t

• Notice the “L” shape of the v-t. Humans choke to death far more frequently than other species.

Resonance cavity (vocal tract)

• Position of the tongue creates sub-spaces for resonance of the vocal tone.

The vocal tract

Filter Function for Schwa Vowel

Power Spectrum of Glottal Tone

Output of Vocal Tract Filter Function

Vocal Tract Filter

Filter Function for Specific Vowels

Vowel formants for English

Vowel   F1 (Hz)   F2 (Hz)
/i/     290       2500
/æ/     690       1650
/a/     710       1200
/u/     310       900
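The source-filter idea behind these formant values can be sketched in a few lines: a glottal source supplies harmonics of F0, and the vocal tract filter boosts the harmonics that fall near a formant. Everything about the model below is an assumption for illustration (a 100 Hz F0, a source whose amplitude falls as 1/k, and a simple bell-shaped resonance with a 100 Hz width). Only the /i/ formant values come from the table above.

```python
def resonance_gain(freq_hz, formant_hz, width_hz=100.0):
    """Assumed bell-shaped resonance: gain 1.0 at the formant, falling off
    as the frequency moves away from it."""
    return 1.0 / (1.0 + ((freq_hz - formant_hz) / width_hz) ** 2)

def filter_harmonics(f0_hz, formants_hz, n_harmonics=30):
    """Return (frequency_hz, output_amplitude) for each harmonic of f0_hz
    after the assumed source spectrum passes through the assumed filter."""
    output = []
    for k in range(1, n_harmonics + 1):
        freq = k * f0_hz
        source_amplitude = 1.0 / k  # assumption: source energy falls with k
        gain = sum(resonance_gain(freq, f) for f in formants_hz)
        output.append((freq, source_amplitude * gain))
    return output

# /i/ from the table: F1 = 290 Hz, F2 = 2500 Hz.
spectrum_i = filter_harmonics(100.0, [290.0, 2500.0])
strongest_hz = max(spectrum_i, key=lambda pair: pair[1])[0]
print("strongest harmonic for /i/:", strongest_hz, "Hz")
```

In this toy model the strongest harmonic is the one closest to F1, and the harmonic at 2500 Hz stands out from its neighbours because it sits on F2.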

American Vowels

Other Vowel Systems

Consonants

• Compare the musculature of non-human primates with respect to the articulation of speech sounds.

• Humans have far more specific control over the facial muscles that control the lips and jaw.

The modularity of Speech

• Two aspects of modularity
-- language vs. general cognitive system
-- linguistic subsystems

• The importance of modularity of speech

• Two pieces of supporting evidence for the modularity of speech
-- problem of invariance
-- categorical perception

Two pieces of supporting evidence for speech as a modular system

1) Problem of invariance

• The relationship between acoustic stimulus and perceptual experience is complex in the case of speech. The fact that there is no one-to-one correspondence between acoustic cues and perceptual events has been termed the lack of invariance.

Why is the question of modularity important?

It is related to the question of how the brain is organized for language, and hence to language development and language disorders.

If speech is a modular system, it would have a specialized neurological representation. This representation would not be based on general cognitive functioning (working memory, episodic memory, and so on) but would be specific to language.

It would be the basis for the perception of language in young infants and, if damaged, the reason that certain individuals suffer quite specific breakdowns in language functioning.

The phoneme /t/ and its allophones

Allophone   As in    Acoustic pattern
[t]         stop     acoustic pattern 1
[tʰ]        top      acoustic pattern 2
[ɾ]         little   acoustic pattern 3
[ʔ]         kitten   acoustic pattern 4
…           …        acoustic pattern 5

Perception of /t/

Why does the problem of invariance support the hypothesis that speech is a modular system?

• When hearing the physical sound [d], how does a listener quickly solve the problem of its real identity (e.g., /t/ or /d/)?

This is more complex than ordinary auditory perception, which suggests that speech is a special mode of perception.

Two pieces of supporting evidence for speech as a modular system

• 2) Categorical perception of initial consonants, based on VOT. (Are there any differences between language and other cognitive functions such as vision?)

To comprehend speech, we must impose an absolute (or categorical) identification on the incoming speech signal rather than simply a relative determination of the various physical characteristics of the signal.

Auditory cues such as frequency and intensity will play a role, but ultimately the result of speech perception is the identification of a stimulus as belonging to one or another category of speech sound.

VOT—voice onset time

• In the case of oral stops, the airflow is blocked completely, causing pressure to build up. The obstruction in the mouth is then suddenly opened; the released airflow produces a sudden impulse in pressure causing an audible sound.

• VOT is measured relative to the stop release burst.

VOT – voice onset time

• On a speech spectrogram it is possible to identify the difference between the voiced sound [ba] and the voiceless sound [pa] as due to the time between when the sound is released at the lips and when the vocal cords begin vibrating.

• With voiced sounds, the vibration occurs immediately; however, with voiceless sounds it occurs after a short delay. This lag, the voice onset time, is an important cue in the perception of the voicing feature.
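The link between VOT and categorical perception can be illustrated with a small sketch: VOT varies continuously, but the listener's response is one of two categories. The 25 ms category boundary and the function names below are assumptions made for the example; the actual boundary depends on the language and on the place of articulation.

```python
def voice_onset_time_ms(burst_release_ms, voicing_onset_ms):
    """VOT: time from the stop release burst to the start of vocal fold
    vibration, in milliseconds."""
    return voicing_onset_ms - burst_release_ms

def classify_stop(vot_ms, boundary_ms=25.0):
    """Map a continuous VOT onto a discrete category.
    boundary_ms is a hypothetical category boundary, not a measured value."""
    return "voiced" if vot_ms < boundary_ms else "voiceless"

# A hypothetical [ba]-[pa] continuum in 10 ms steps of VOT.
continuum = [(vot, classify_stop(vot)) for vot in range(0, 70, 10)]
for vot, label in continuum:
    print(vot, "ms ->", label)
```

Stimuli on the same side of the boundary receive the same label even though their VOTs differ, while a small step across the boundary changes the label. That is the pattern described as categorical perception.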
