The Human Voice. I. Speech production

1. The vocal organs

• The lungs serve as a reservoir of air and a source of energy
• In speaking, air is forced from the lungs through the larynx into the three main cavities: the pharynx, the nasal cavity, and the oral cavity
• Air exits through the nose and mouth
• Air can be inhaled and exhaled without much sound
• To produce speech sounds, the flow of air is interrupted by the vocal cords or by constrictions in the vocal tract (made by the tongue or lips)


Larynx and vocal folds (cords)

• The larynx, with its vocal folds, is the major source of sound in the vocal system.
• Sound is generated by the rhythmic opening and closing of the vocal folds.
• The folds are open during inhalation, closed when holding one's breath, and vibrating for speech or singing (oscillating 440 times per second when singing A4). They are controlled via the vagus nerve.
• A person's voice pitch (fundamental frequency) is determined by the resonant frequency of the vocal folds.
• The fundamental frequency is influenced by the length, size, and tension of the vocal folds.
• In adult males, this frequency averages about 125 Hz.
• In adult females, it is around 210 Hz.
• In children, it is over 300 Hz.
• The male vocal folds are between 17.5 mm and 25 mm (about 0.7" - 1.0") in length.
• The female vocal folds are between 12.5 mm and 17.5 mm (about 0.5" - 0.7") in length.
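To put the quoted frequencies in perspective, the short sketch below converts a fundamental frequency to its vibration period and expresses the gap between two voices as a musical interval in equal-tempered semitones. The function names are illustrative, and the child value of 300 Hz is only the lower bound given above, used here as a placeholder.

```python
import math

# Average fundamental frequencies quoted on the slide, in Hz.
# "child" uses the lower bound ("over 300 Hz") purely for illustration.
F0_HZ = {"adult male": 125.0, "adult female": 210.0, "child": 300.0}


def period_ms(f0_hz):
    """Duration of one open/close cycle of the vocal folds, in milliseconds."""
    return 1000.0 / f0_hz


def semitones_between(f_low_hz, f_high_hz):
    """Interval between two frequencies in equal-tempered semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


for label, f0 in F0_HZ.items():
    print(f"{label}: {f0:.0f} Hz, period {period_ms(f0):.2f} ms")

interval = semitones_between(F0_HZ["adult male"], F0_HZ["adult female"])
print(f"male to female average: {interval:.1f} semitones")
```

On these averages, the typical adult female voice sits roughly nine semitones above the typical adult male voice, which is less than a full octave (12 semitones).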


Vocal folds (continued)

• Vocal folds generate a sound rich in harmonics.
• Harmonics are produced by collisions of the vocal folds with themselves, by recirculation of some of the air back through the trachea, or both.
• Some singers can isolate some of those harmonics in a way that is perceived as singing in more than one pitch at the same time, a technique called overtone singing.
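Harmonics lie at integer multiples of the fundamental frequency. As a minimal sketch, the snippet below lists the first few harmonics for the average adult male fundamental of 125 Hz; the function name is illustrative.

```python
def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics of f0_hz (the fundamental is harmonic 1)."""
    return [n * f0_hz for n in range(1, count + 1)]


# First eight harmonics of a 125 Hz voice.
for number, freq in enumerate(harmonic_series(125.0, 8), start=1):
    print(f"harmonic {number}: {freq:.0f} Hz")
```

In overtone singing, the singer emphasizes one of these higher harmonics strongly enough that it is heard as a second pitch above the fundamental.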


Vocal folds (continued) http://www.youtube.com/watch?v=v9Wdf-RwLcs


Vocal tract

The vocal tract is the cavity where the sound produced at the sound source (the larynx) is filtered.

It consists of:
• pharynx (pharyngeal cavity)
• oral cavity
• nasal cavity

The estimated average length of the vocal tract is about 17 cm in adult males and 14 cm in adult females.
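The length of the vocal tract sets its resonance frequencies. A common textbook idealization, not stated on the slides, treats the tract as a uniform tube closed at the larynx and open at the lips, which resonates at odd multiples of c / (4L). The sketch below applies that assumption to the two lengths given above; the speed of sound (350 m/s, warm humid air) is also an assumed value.

```python
# Assumed speed of sound in the warm, humid air of the vocal tract (m/s).
SPEED_OF_SOUND_M_S = 350.0


def tube_resonances(length_m, count=3, c=SPEED_OF_SOUND_M_S):
    """Resonances of an idealized uniform tube, closed at one end and open at the other.

    Quarter-wavelength model: F_n = (2n - 1) * c / (4 * L), n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


for label, length_m in (("adult male", 0.17), ("adult female", 0.14)):
    values = ", ".join(f"{f:.0f} Hz" for f in tube_resonances(length_m))
    print(f"{label} ({length_m * 100:.0f} cm): {values}")
```

Under this simplification a shorter tract gives higher resonances, and the resonances of a neutral (unconstricted) tract are spaced in a 1 : 3 : 5 ratio. Real vowels depart from these values because the tongue and lips change the shape of the tube.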


2. Articulation of Speech

• Each syllable is made of one or more phonemes
• Phonemes are either vowels or consonants
• Vowels are always voiced (with vibrations of the vocal folds)
• Consonants are either voiced or unvoiced
• There are 12 to 21 vowel sounds in English (depending on which speech scientist you talk to)
• The count varies because opinions differ as to whether a given sound is a pure vowel or a diphthong (a combination of two or more vowel sounds into one phoneme)


Consonants are classified according to their manner of articulation:

• Plosive or stop consonants (p, b, t, etc.) are produced by blocking the flow of air somewhere in the vocal tract (usually the mouth) and releasing the pressure rather suddenly

• Fricatives (f, s, sh, etc.) are made by constricting the airflow to produce turbulence

• Nasals (m, n, ng) are made by lowering the soft palate to connect the nasal cavity to the pharynx and then blocking the mouth cavity at some point along its length

• Liquids (r, l) are produced by raising the tip of the tongue while the oral cavity is somewhat constricted

• Semivowel or glide consonants (w, y) are produced by keeping the vocal tract briefly in a vowel position and then changing it rapidly to the vowel sound that follows

• Consonants are further classified according to their place of articulation, primarily the lips (labial), lips and teeth (labiodental), teeth (dental), gums (alveolar), palate (palatal), and glottis (glottal)

• There are 24 consonant sounds in English


3. Formants: Resonances of the Vocal Tract

(The peaks that are observed in the spectrum envelope and are independent of the pitch)

• They appear as envelopes that modify the amplitudes of the various harmonics of the source sound
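The idea that formants act as an envelope over the source harmonics can be sketched numerically. The snippet below is an illustration under stated assumptions, not a model of a real voice: the source harmonics are taken to have equal strength, each formant is a simple bell-shaped peak with an assumed 100 Hz bandwidth, and the formant frequencies (500, 1500, 2500 Hz) are placeholder values.

```python
# Placeholder formant frequencies in Hz (assumed, not taken from the slides).
ASSUMED_FORMANTS_HZ = (500.0, 1500.0, 2500.0)


def envelope_gain(freq_hz, formants_hz, bandwidth_hz=100.0):
    """Sum of bell-shaped resonance peaks, one per formant (illustrative shape only)."""
    half = bandwidth_hz / 2.0
    return sum(1.0 / (1.0 + ((freq_hz - f) / half) ** 2) for f in formants_hz)


def shaped_harmonics(f0_hz, formants_hz, max_hz=3000.0):
    """Return (frequency, relative amplitude) for each harmonic of f0_hz up to max_hz.

    The source is assumed flat, so the amplitude is just the envelope gain.
    """
    count = int(max_hz // f0_hz)
    return [(n * f0_hz, envelope_gain(n * f0_hz, formants_hz))
            for n in range(1, count + 1)]


# Compare two pitches below 1000 Hz: the strongest harmonic is the one
# closest to the first formant, whatever the fundamental is.
for f0 in (125.0, 210.0):
    harmonics = shaped_harmonics(f0, ASSUMED_FORMANTS_HZ, max_hz=1000.0)
    strongest_freq, _ = max(harmonics, key=lambda pair: pair[1])
    print(f"f0 = {f0:.0f} Hz: strongest harmonic below 1 kHz at {strongest_freq:.0f} Hz")
```

Changing the pitch moves the harmonics, but the envelope stays put: in both cases the emphasized harmonic is the one nearest the assumed 500 Hz formant.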


• The formant with the lowest frequency is called f1, the second f2, and the third f3.

• Most often the first two formants, f1 and f2, are enough to disambiguate the vowel.

• Formants are the distinguishing/meaningful frequency components of human speech and of singing.

• The information that humans require to distinguish between vowels can be represented quantitatively by the frequency content of the vowel sounds.

• In speech, these are the characteristic partials that identify vowels to the listener. Most formants are produced by tube and chamber resonance, but a few whistle tones derive from periodic collapse of Venturi effect low-pressure zones.

Formants and speech

Vowel    Main formant region
u        200–400 Hz
o        400–600 Hz
a        800–1200 Hz
e        400–600 and 2200–2600 Hz
i        200–400 and 3000–3500 Hz
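The table can be read as a lookup from spectral peaks to vowels. The sketch below encodes the regions listed above and returns the vowels consistent with a set of measured peak frequencies. The matching rule is a hypothetical toy heuristic, not a method from the slides: a vowel matches when every one of its regions contains a peak, and vowels with more matched regions are preferred.

```python
# Main formant regions from the table, in Hz (inclusive bounds).
FORMANT_REGIONS_HZ = {
    "u": [(200, 400)],
    "o": [(400, 600)],
    "a": [(800, 1200)],
    "e": [(400, 600), (2200, 2600)],
    "i": [(200, 400), (3000, 3500)],
}


def candidate_vowels(peaks_hz):
    """Return the vowels whose formant regions are all covered by the given peaks.

    Toy rule: if several vowels match, keep those with the most matched regions,
    so a low peak plus a high peak selects "i" rather than "u".
    """
    matches = {}
    for vowel, regions in FORMANT_REGIONS_HZ.items():
        if all(any(lo <= p <= hi for p in peaks_hz) for lo, hi in regions):
            matches[vowel] = len(regions)
    if not matches:
        return []
    best = max(matches.values())
    return sorted(v for v, n in matches.items() if n == best)


print(candidate_vowels([300]))        # single low peak
print(candidate_vowels([300, 3200]))  # low peak plus a high peak
print(candidate_vowels([500, 2400]))  # mid peak plus a high peak
```

The regions for u and i, and for o and e, share the same lower band, which is why a second, higher peak is needed to tell those vowels apart.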

Singers' formant

• The frequency spectrum of trained singers, especially male singers, has a formant around 3000 Hz.
• It tends to be independent of the vowel and the pitch.
• This increase in energy at 3000 Hz allows singers to be heard and understood over an orchestra, which peaks at much lower frequencies of around 500 Hz.
• This formant is actively developed through vocal training, for instance through so-called "voce di strega" (witch's voice) exercises.
• It is caused by a part of the vocal tract acting as a resonator.
• It lies somewhere between the third and the fourth formant.
• It adds brilliance and carrying power to the male singing voice.


4. Prosodic Features of Speech

• Prosodic features are characteristics which convey meaning, emphasis, and emotion without actually changing the phonemes.

• They include pitch, rhythm, and accent
• In English, prosodic features play a secondary role to the phonemes
• However, in Chinese, prosodic features change the meaning of a phoneme
• Prosodic features tend to indicate the emotional state of the speaker
• There have been attempts to use them in “lie detection”, analyzing recorded speech for evidence of stress

