Vowels (again)

Vowels (again) February 23, 2010

The News For Thursday: Give me a (one paragraph or so) description of what you're thinking of doing for a term project. Also note: two new readings have been posted: Peterson & Barney (1952) and Liljencrants & Lindblom (1972).

Fun Stuff Who is producing each of these vowels? (And which vowel are they producing?)

Source/Filter Lab Review Silke made predictions on the basis of her formant values:

Practical Stuff So you want to plot your formant space

Source/Filter Lab Review Stephanie made an interesting (general) prediction:

Peterson & Barney (1952) Gordon Peterson and Harold Barney conducted a landmark study of variability in the production and perception of English vowels way back in 1952. Methods: Recorded speakers of General American English reading a list of 10 hVd words (heed, hid, head, etc.) twice. 76 speakers (33 men, 28 women, 15 children). Measured the F0, F1, F2 and F3 from the midpoint of all 1520 vowels. Presented all 1520 vowels to 70 listeners in a vowel identification experiment (in eight sessions).

Peterson & Barney (1952) Acoustically, they found much variability in vowel production. Also: much overlap in terms of absolute formant frequencies. General confirmation of the F1-F2 vowel space schema. The vowel in herd is distinguished by its low F3.

Peterson & Barney (1952)

Peterson & Barney (1952) They organized their response data in the form of a confusion matrix. Each row corresponds to the intended vowel = the stimulus category. Each column corresponds to the classification made by the listeners = the response category.

Peterson & Barney (1952) Some confusion matrix basics: Entries on the main diagonal represent correct responses. Entries off the main diagonal represent the confusions. Popular confusions here include: hod perceived as hawed (1013 / 10273); hid perceived as head (694 / 10279).

Peterson & Barney (1952) Summing up the columns provides a rough sense of the listeners' response bias = tendency to favor one response category over another, independent of the stimulus presented. Popular options: had (10906), hawed (10737). Not-so-popular: hid (9813), hud (9956).
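The confusion matrix bookkeeping described on these slides can be sketched in a few lines of Python. The three-vowel matrix below is a toy example with invented counts (it is not the Peterson & Barney data), and the names `confusions`, `percent_correct`, `errors` and `response_totals` are only illustrative.

```python
# Toy confusion matrix: rows = intended vowel (stimulus),
# columns = listener classification (response).
# NOTE: these counts are invented for illustration only.
labels = ["hid", "head", "had"]
confusions = [
    [90,  9,  1],   # stimulus: hid
    [ 5, 85, 10],   # stimulus: head
    [ 0,  8, 92],   # stimulus: had
]

n = len(labels)
total = sum(sum(row) for row in confusions)

# Main diagonal = correct responses.
correct = sum(confusions[i][i] for i in range(n))
percent_correct = 100.0 * correct / total

# Off-diagonal cells = confusions; collect them, largest first.
errors = sorted(
    ((confusions[i][j], labels[i], labels[j])
     for i in range(n) for j in range(n) if i != j),
    reverse=True,
)

# Column sums = how often each response was chosen overall (response bias).
response_totals = {
    labels[j]: sum(confusions[i][j] for i in range(n)) for j in range(n)
}

print(f"{percent_correct:.1f}% correct")
print("most common confusion:", errors[0][1], "heard as", errors[0][2])
print("response totals:", response_totals)
```

Because every stimulus row in this toy matrix has the same total, any difference between the column sums reflects which responses the (imaginary) listeners favored.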

Peterson & Barney (1952) Note: listeners identified only 94.4% of vowels correctly. The vowels in heed, who'd and herd were highly distinct; those in hod and head were not. The available response options in the neighborhood matter.

Source/Filter Lab Review Sue plotted some confusion matrices:

Source/Filter Lab Review Rhonda (and Jon) broke things down by features:

Class Confusion Matrix This is the response data summed across all conditions, from all five listeners.

Back to Perturbation Theory Basic Idea #1: vocal tract resonances (formants) are the result of standing waves in the vocal tract. These standing waves have areas where velocity alternates between high and low (anti-nodes), and areas where velocity does not change (nodes).
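To make the node/anti-node picture concrete, here is a rough sketch that treats the vocal tract as a uniform tube, closed at the glottis and open at the lips. The tube model, the 17.5 cm length and the 35,000 cm/s speed of sound are textbook assumptions rather than values from these slides, and the function names are illustrative.

```python
from fractions import Fraction

SPEED_OF_SOUND = 35000.0  # cm/s (assumed round value)
TRACT_LENGTH = 17.5       # cm (assumed adult vocal tract length)

def resonance_hz(n, length=TRACT_LENGTH, c=SPEED_OF_SOUND):
    """n-th resonance of a tube closed at one end: (2n - 1) * c / (4 * L)."""
    return (2 * n - 1) * c / (4 * length)

def velocity_nodes(n):
    """Velocity node positions, as fractions of tube length from the glottis."""
    return [Fraction(2 * k, 2 * n - 1) for k in range(n)]

def velocity_antinodes(n):
    """Velocity anti-node positions, as fractions of tube length from the glottis."""
    return [Fraction(2 * k + 1, 2 * n - 1) for k in range(n)]

for n in (1, 2):
    print(f"F{n}: {resonance_hz(n):.0f} Hz")
    print("  nodes at", [str(p) for p in velocity_nodes(n)])
    print("  anti-nodes at", [str(p) for p in velocity_antinodes(n)])
```

In this idealized tube, F1 has its only velocity anti-node at the lips, while F2 has anti-nodes at the lips and one third of the way up from the glottis, with a node two thirds of the way up.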

Perturbation Principles Basic Idea #2: constriction at a velocity anti-node decreases a resonant frequency.

Perturbation Principles Basic Idea #3: constriction at a velocity node increases a resonant frequency.

Labial: Constrictions in the labial region are at anti-nodes for both F1 and F2. Labial constrictions decrease both F1 and F2.

Palatal: Constrictions in the palatal region are at an F2 node and near an F1 anti-node. F1 decreases; F2 increases.

Velar: Constrictions in the velar region are at an F2 anti-node and near an F1 anti-node. F1 decreases; F2 decreases.

Pharyngeal: Constrictions in the pharyngeal region are at an F2 anti-node and near an F1 node. F1 increases; F2 decreases.

Laryngeal: Constrictions in the laryngeal region are at an F2 node and an F1 node. F1 increases; F2 increases.
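The region-by-region predictions above all follow from Basic Ideas #2 and #3. Here is a small sketch that encodes the node/anti-node locations given on these slides and derives the formant shifts from them. The dictionary layout and function names are illustrative, and being "near" a node or anti-node is treated the same as being at one.

```python
# Where each constriction region sits relative to the F1 and F2
# standing waves, as stated on the slides ("near" is treated as "at").
REGIONS = {
    "labial":     {"F1": "anti-node", "F2": "anti-node"},
    "palatal":    {"F1": "anti-node", "F2": "node"},
    "velar":      {"F1": "anti-node", "F2": "anti-node"},
    "pharyngeal": {"F1": "node",      "F2": "anti-node"},
    "laryngeal":  {"F1": "node",      "F2": "node"},
}

def shift(location):
    """Constriction at a velocity anti-node lowers a resonance; at a node, it raises it."""
    return "decreases" if location == "anti-node" else "increases"

def predict(region):
    """Predicted direction of change for F1 and F2, given a constriction region."""
    return {formant: shift(loc) for formant, loc in REGIONS[region].items()}

for region in REGIONS:
    p = predict(region)
    print(f"{region:<11} F1 {p['F1']}; F2 {p['F2']}")
```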

Different Sources For a particular articulatory configuration, the vocal tract will resonate at a certain set of frequencies no matter what the sound source is. (Remember the talk box.) Let's see what happens when we change our sound source to a duck call.

Duck Call Vowels http://www.exploratorium.edu/exhibits/vocal_vowels/vocal_vowels.html (Figure label: duck call is placed here.) Now let's filter the duck call with differently shaped plastic tubes. Care to make any predictions?

Another View [i]

Duck Call Spectrograms [i]

Duck Call Spectra [i]

How About These? (Figure label: duck call is placed on this side.)

[i] vs. [e]

[u] vs. [o]

Philosophical Fragments Consider the Cardinal Vowels. Two anchor vowels: [i] - Cardinal Vowel 1 - highest, frontest vowel possible; [ɑ] - Cardinal Vowel 5 - lowest, backest vowel possible. Remaining vowels are spaced at equal intervals of frontness and height between the anchor vowels. Note: [u] - Cardinal Vowel 8 - may serve as a third anchor as the highest, backest, roundest vowel possible. Q: Why are the first two anchors unrounded while the third anchor is rounded?

Cardinal Vowel Diagram

Secondary Cardinal Vowels

Perturbation to the Rescue! Rounding back vowels takes advantage of an acoustic synergy which lowers both F1 and F2. Q: Is there anything wrong with rounding other (non-back) vowels?

A Bad Vowel Space One answer is found in the typical structure of vowel systems. For instance, a five vowel system is rarely, if ever, distributed thusly:

[i] [e] []

Five Vowel Spaces Many languages with only five vowels spread them out evenly in the vowel space, in a triangle. Here's a popular vowel space option:

i       u
  e   o
    a

Five-Vowel Spaces

Gujarati Vowel Space

A Complicated Vowel Space The language is Swedish.

Adaptive Dispersion Theory Developed by Björn Lindblom and Johan Liljencrants (Swedish speakers). Adaptive Dispersion theory says: Vowels should be as acoustically distinct from each other as possible. (This helps listeners identify them correctly.) So languages tend to maximize the distance between vowels in acoustic space. Note: lack of ~ distinction in Canadian English.

Liljencrants & Lindblom (1972) Attempted to quantify contrast in the vowel space, in order to emphasize the importance of perception in the formation of phonological structure. They start with an articulatory model of the limits of the vowel space. Note: the space is plotted in three formants and in mels (the auditory equivalent of frequency).
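For a feel of what plotting in mels does to the frequency axis, here is a minimal sketch using one commonly cited Hz-to-mel approximation. The formula is an assumption made for illustration: the slides do not say which conversion Liljencrants & Lindblom used, and `hz_to_mel` is a hypothetical helper name.

```python
import math

def hz_to_mel(f_hz):
    """A common Hz-to-mel approximation (assumed here): 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

for f in (250, 500, 1000, 2000, 4000):
    print(f"{f:>5} Hz -> {hz_to_mel(f):7.1f} mel")
```

Under this approximation 1000 Hz comes out at roughly 1000 mel, and equal steps in Hz shrink on the mel scale as frequency goes up, so distances between vowels are measured in something closer to what listeners hear.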

Liljencrants & Lindblom (1972) Quantification of contrast in the space: Given n vowels, there are m pairs of vowels, where $m = n(n-1)/2$, and $r_i$ = the Euclidean distance between the i-th pair of vowels in formant space. The perceptual goal of the system is to minimize $E = \sum_{i=1}^{m} 1 / r_i^2$. I.e., the more formant space between vowels, the easier they will be to distinguish from one another. Note: floating magnets analogy. Also: crowded elevator analogy.
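As a rough sketch of this measure, the code below sums the inverse squared distances over all vowel pairs, E = sum of 1 / r_i^2, where a lower total means a better-dispersed system. The (F1, F2) coordinates are invented for illustration and are not measurements or values from the paper. The paper's own model, with its mel scaling and articulatory limits on the space, is not reproduced here, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist

def pair_count(n):
    """Number of vowel pairs: m = n * (n - 1) / 2."""
    return n * (n - 1) // 2

def dispersion_energy(vowels):
    """Sum of 1 / r^2 over all pairs, with r = Euclidean distance in formant space."""
    return sum(1.0 / dist(a, b) ** 2 for a, b in combinations(vowels.values(), 2))

# Invented (F1, F2) coordinates in arbitrary auditory units.
spread_out = {"i": (3, 14), "e": (5, 12), "a": (8, 9), "o": (5, 6), "u": (3, 4)}
crowded = {"i": (3, 14), "e": (4, 13), "ɛ": (5, 12), "æ": (6, 11), "a": (8, 9)}

print("pairs:", pair_count(len(spread_out)))
print("spread-out E:", round(dispersion_energy(spread_out), 3))
print("crowded E:   ", round(dispersion_energy(crowded), 3))
```

The crowded system, with all five vowels bunched along one edge of the space, gets the larger E, which is the elevator analogy in numbers: close neighbors are penalized heavily because distance enters as an inverse square.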

Liljencrants & Lindblom (1972) In perceptually optimal systems vowels tend to spread out around the edges of the available space. There is also a trend for more high vowel contrasts than are normally found in language.

[i] - Jon, Steph, Sue, Silke, Rhonda
[a] - Silke, Rhonda, Steph, Sue, Jon
[o] - Rhonda, Sue, Silke, Steph, Jon
