Elderly listeners’ estimates of vocal age in adult females


Sue Ellen Linville and Edward W. Korabic
Department of Speech Pathology and Audiology, Marquette University, Milwaukee, Wisconsin 53233

(Received 15 January 1986; accepted for publication 8 April 1986)

The purpose of this study was to provide data on the ability of elderly listeners to estimate the age group of women (25-35, 45-55, 70-80) from phonated and whispered vowel productions. Further, comparisons were made between the performance of these elderly listeners and results for young listeners reported previously [S. E. Linville and H. Fisher, J. Acoust. Soc. Am. 78, 40-48 (1985)]. Tape recordings of whispered and normally phonated /æ/ vowels were played to 23 elderly women for relative age judgments. Results suggest that elderly women are not as accurate as young women in estimating age from sustained vowel productions, although the two listener groups tend to categorize individual speakers similarly. Further, it appears that listener age is a factor in acoustic cues used in making age judgments.

PACS numbers: 43.71.Gv, 43.71.Es, 43.70.Fq

INTRODUCTION

Contemporary interest in the aging process has included investigations of age-related characteristics of the human voice. Previous studies have demonstrated that young listeners can judge the relative age of adult speakers, even from sustained vowel productions (Ptacek and Sander, 1966; Linville and Fisher, 1985b). In fact, age estimates are possible even from samples of whispered speech (Linville and Fisher, 1985b). Although young listeners can judge age from voice samples, the ability of elderly listeners to perform the same task has not been investigated. Also, information is not available on the extent to which elderly and young listeners differ in their perception of speakers' vocal age.

The purpose of this study was to provide data on the ability of elderly listeners to estimate the age group (25-35, 45-55, 70-80) of women from phonated and whispered /æ/ vowels. In addition, acoustic correlates of perceived vocal age in elderly listeners were investigated. All speakers produced the phonated vowels within the range of 200-220 Hz by matching a 210-Hz tone. The 210-Hz tone was simultaneously input into an oscilloscope, along with the speaker's voice, and Lissajous figures served as visual verification of frequency matching within the 20-Hz range. This procedure was utilized in recognition of the fact that pitch level appears to influence age judgments (Ptacek and Sander, 1966; Ryan and Burk, 1974; Hartman, 1979). It was the intention of this procedure to restrict pitch level to the point that differences in mean F0 among speakers would no longer be meaningful to listeners as an index of aging. Findings of this study will be compared with findings for young listeners reported previously (Linville and Fisher, 1985b).

I. METHOD

Two listening tapes were played to 23 elderly females, ranging in age from 65 to 90 years (M = 74 years, s.d. = 5.8). Listeners were active, independent elderly ladies with no complaint of health problems or history of neurological disorders. Mean pure-tone air-conduction thresholds (dB HL re: ANSI, 1969) for these women in the better ear were 20.43 (s.d. = 9.28) at 250 Hz, 17.39 (s.d. = 11.47) at 500 Hz, 13.70 (s.d. = 8.69) at 1000 Hz, 19.57 (s.d. = 10.54) at 2000 Hz, 30.22 (s.d. = 14.26) at 4000 Hz, and 42.61 (s.d. = 16.02) at 6000 Hz. Listeners were also given the Hearing Handicap Inventory for the Elderly (HHIE-S, Ventry and Weinstein, 1983). None of the listeners reported having a significant hearing handicap using this inventory.

Each listening tape contained one production from each of 75 speakers, equally divided among the three age groups. One tape contained phonated /æ/ vowel segments, the other whispered /æ/ segments. Each production appeared twice for analysis of listener reliability. All items on each tape were randomized and preceded by eight practice items. Listeners were instructed to listen to each production and circle their age-group selection on a response form. Procedures used in recording speakers and in acoustic analysis of vowel segments were reported previously (Linville and Fisher, 1985a,b). Procedures used in presenting vowel tapes to listeners were identical to those reported previously (Linville and Fisher, 1985b), with the following exceptions: (1) Elderly listeners judged the tapes individually, rather than in a group, to allow for individualized adjustment of intensity to a comfortable level and to monitor attention to the task, and (2) pauses of approximately 1 min were provided after every 30 vowel presentations to counteract fatigue, with additional breaks provided as needed.

II. RESULTS

A. Perceptual findings

As an estimate of listener reliability, percentages of agreement in response to the first and second vowel presentations were calculated. Overall, for both types of utterances, listeners demonstrated 65.8% test-retest agreement: 69.1% in response to phonated vowels and 62.4% in response to whispered vowels. This compares with 75.5% overall reliability for young listeners (Linville and Fisher, 1985b).
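The agreement measure described here can be illustrated with a minimal sketch. It assumes that agreement is simply the share of items given the same age-group label on both presentations; the function name and the response lists are hypothetical and are not data from this study.

```python
def percent_agreement(first, second):
    """Percentage of items that received the same label on both presentations."""
    if len(first) != len(second) or not first:
        raise ValueError("response lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


# Hypothetical responses of one listener to six vowels, each heard twice.
first_pass = ["young", "middle", "old", "old", "young", "middle"]
second_pass = ["young", "old", "old", "middle", "young", "middle"]
print(percent_agreement(first_pass, second_pass))
```

With equal numbers of phonated and whispered items, an overall figure computed this way is the average of the two per-tape figures, which is consistent with the reported 65.8% lying midway between 69.1% and 62.4%.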

J. Acoust. Soc. Am. 80(2), Aug. 1986; 0001-4966/86/080692-03$00.80; © 1986 Acoust. Soc. Am.; Letters to the Editor

Listeners correctly identified the age group of these women about 45% of the time from phonated vowels, and about 38% of the time from whispered vowels. Both of these rates [phonated, t(22) = 8.71; whispered, t(22) = 8.45] were significantly higher (p < 0.01) than rates expected for random guessing (33.33%), suggesting that elderly listeners were able to judge age from both phonated and whispered vowels. However, accuracy rates for young listeners (Linville and Fisher, 1985b) were higher for both phonated (51%) and whispered (43%) vowels. Indeed, an analysis of variance, comparing accuracy rates for the two listener groups, revealed that young listeners were significantly more accurate [F(1,44) = 17.71, p < 0.01] than elderly listeners in their age estimates from both phonated and whispered vowels. Also, accuracy rates were significantly higher for phonated than for whispered vowels [F(1,44) = 44.13, p < 0.01]. The interaction between listener age and vowel type was not significant.
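The comparison of accuracy against chance can be sketched as a one-sample t statistic. This is an illustration only: it assumes each listener contributes one percent-correct score and that the test is the ordinary one-sample t test against the three-alternative chance rate. The scores listed are hypothetical; with the 23 listeners of this study the degrees of freedom would be 22.

```python
import math
import statistics

CHANCE_RATE = 100.0 / 3.0  # three age groups, so guessing yields about 33.33%


def one_sample_t(scores, mu):
    """Return (t, df) for a one-sample t test of the mean of scores against mu."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("scores must not all be identical")
    t = (statistics.fmean(scores) - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical percent-correct scores for a handful of listeners.
hypothetical_scores = [41.0, 47.0, 44.0, 50.0, 39.0, 46.0]
t_value, df = one_sample_t(hypothetical_scores, CHANCE_RATE)
print(f"t({df}) = {t_value:.2f}")
```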

Data from elderly listeners (Table I) were subjected to an analysis of variance to determine if differences in accuracy occurred as a function of type of utterance (phonated versus whispered), speaker age group, and trial (first versus second). This analysis yielded only two significant findings. First, the main effect for type of utterance was significant [F(1,72) = 8.42, p < 0.01], indicating that judgments were more accurate from phonated vowels. Second, the age by trial interaction was significant [F(2,72) = 3.69, p < 0.05], indicating that elderly listeners did not respond similarly to the three age groups of speakers in both trials. Simple effects analysis of the significant interaction indicated that old speakers were identified with significantly higher rates of accuracy than the other two age groups in trial 1 [F(2,72) = 3.72, p < 0.05]. In trial 2, the three age groups of speakers were identified with similar rates of accuracy. It appears that, early in the task, elderly listeners identified old speakers correctly more often than the other two age groups. However, as the task progressed, listeners changed their criteria for "oldness," or fatigue became a factor, and the percentage of correct responses to old speakers dropped.

Another question of some interest concerns the extent to which elderly listeners' age estimates, regardless of accuracy, agree with age estimates by young listeners. Median age-group estimates of individual speakers by old listeners correlated highly with age-group estimates made by young listeners (Linville and Fisher, 1985b) from both phonated (r = 0.87) and whispered (r = 0.83) vowels, suggesting that the two listener groups tended to judge speakers similarly with regard to relative age.
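A correlation of this kind can be sketched as follows. The sketch assumes that age groups are coded 1, 2, and 3, that each speaker's estimate is the median code across listeners, and that the coefficient is Pearson's r; the coding, function names, and ratings are hypothetical and are not taken from the study.

```python
import math
import statistics


def median_estimates(ratings_by_listener):
    """ratings_by_listener[i][j] is listener i's age-group code for speaker j."""
    return [statistics.median(column) for column in zip(*ratings_by_listener)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical codes (1 = young, 2 = middle-aged, 3 = old) for four speakers.
elderly = [[1, 2, 3, 3], [1, 3, 3, 2], [2, 2, 3, 3]]
young = [[1, 2, 3, 3], [1, 2, 2, 3], [1, 3, 3, 3]]
print(pearson_r(median_estimates(elderly), median_estimates(young)))
```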

B. Acoustic correlates of perceived age

1. Phonated vowels

Multiple regression analysis was performed to assess the relationship of selected acoustic measurements to median age-group estimates for each speaker by elderly listeners. Five acoustic measures served as independent variables in the multiple regression analysis on phonated vowel data. These were: mean F0, F0 s.d., jitter ratio, F1, and F2. Means, standard deviations, and ranges of these acoustic data for the three speaker groups were reported previously (Linville and Fisher, 1985a,b).

Simple correlations among elderly listeners' perceptual responses and all five acoustic measures were significant (p < 0.05). Mean F0 (r = -0.64) was most highly correlated with age estimates, with F2 (r = -0.54), F0 s.d. (r = 0.43), F1 (r = -0.35), and jitter ratio (r = 0.27) showing progressively lower correlations.

Although, singly, all five acoustic measures were significantly correlated with perceived age, not all of these variables made a significant contribution after inter-relationships among acoustic measures were taken into consideration. The multiple regression yielded a multiple R of 0.77. That is, 59% of the variance in listeners' perceptual judgments was accounted for by these five variables. Examination of the Beta weights, which reflect the independent contribution of each acoustic measure to age judgments, revealed that mean F0 (Beta = -0.47) made the most substantial contribution to age prediction. That is, although a pitch matching task was employed in an effort to control for variations in vocal pitch, listeners used the small frequency variation that remained (20 Hz) as a clue to age. Specifically, lower mean F0 values were associated with older age estimates. The formant measures of F2 (Beta = -0.25) and F1 (Beta = -0.21) ranked second and third in importance, respectively, with older age estimates being associated with lower formant frequencies. This finding contrasts with young listeners for whom resonance information in phonated vowels was disregarded (Linville and Fisher, 1985b). The measure F0 s.d. (Beta = 0.20) ranked fourth in importance, in contrast to young listeners for whom F0 s.d. was equal in importance to mean F0 (Linville and Fisher, 1985b). Jitter ratio (Beta = -0.08) did not contribute significantly to age estimation after inter-relationships among variables were taken into consideration.
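The difference between a simple correlation and a Beta weight can be illustrated with the standard two-predictor formulas for standardized regression. This is a simplification, since the analysis reported here used five predictors, and the correlations passed to the function are hypothetical rather than values from the study. The last line only checks the reported relation between the multiple R of 0.77 and the 59% of variance explained.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized Beta weights and R^2 for a criterion y and two predictors.

    r_y1, r_y2: simple correlations of each predictor with y.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    if denom <= 0.0:
        raise ValueError("predictors must not be perfectly correlated")
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Hypothetical case: predictor 2 correlates 0.3 with y, yet adds nothing
# once its overlap with predictor 1 (r_12 = 0.5) is taken into account.
beta1, beta2, r_squared = two_predictor_betas(0.6, 0.3, 0.5)
print(beta1, beta2, r_squared)

# Reported multiple R of 0.77 squared, rounded to two decimals.
print(round(0.77 ** 2, 2))
```

In the hypothetical case the second predictor has a nonzero simple correlation but a Beta weight of essentially zero, which is the pattern described above for jitter ratio.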

2. Whispered vowels

Results of the multiple regression analysis on whispered vowel data suggest that only changes in F1 frequency influenced elderly listeners' age judgments from whispered vowels. These findings parallel findings for young listeners (Linville and Fisher, 1985b). Specifically, using F1 and F2 as independent variables yielded a multiple R of 0.51 (F = 9.94, p < 0.01), with 26% of the variance in age estimates accounted for by these two measures. F1 frequency was negatively correlated with perceived age (r = -0.51) with a high Beta weight (Beta = -0.51), indicating that lower F1 values were generally associated with older age estimates. F2 was uncorrelated with age estimates (r = -0.05).

TABLE I. Percentage of correct age-group identifications as a function of utterance type, speaker age group, and listening trial.

                                   Young speakers      Middle-aged speakers   Old speakers
Utterance type                     Trial 1   Trial 2   Trial 1   Trial 2      Trial 1   Trial 2
Phonated vowels (mean = 44.88)     44.00     39.12     40.40     44.00        54.16     47.64
Whispered vowels (mean = 38.21)    34.24     35.96     38.20     38.36        45.24     37.24
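As a consistency check on Table I, the six cell percentages for each utterance type can be averaged and compared with the reported row mean. The sketch below uses only values printed in the table; the dictionary layout and function name are illustrative, and the 0.01 tolerance is an assumption that allows for rounding in the published cells.

```python
# Cell order: young trial 1, young trial 2, middle-aged trial 1,
# middle-aged trial 2, old trial 1, old trial 2.
TABLE_I = {
    "phonated": {
        "reported_mean": 44.88,
        "cells": [44.00, 39.12, 40.40, 44.00, 54.16, 47.64],
    },
    "whispered": {
        "reported_mean": 38.21,
        "cells": [34.24, 35.96, 38.20, 38.36, 45.24, 37.24],
    },
}


def cell_mean(cells):
    """Unweighted mean of the cell percentages."""
    return sum(cells) / len(cells)


for utterance_type, row in TABLE_I.items():
    computed = cell_mean(row["cells"])
    agrees = abs(computed - row["reported_mean"]) < 0.01
    print(utterance_type, f"{computed:.3f}", row["reported_mean"], agrees)
```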

III. DISCUSSION

It appears from this investigation that elderly listeners are not as accurate as young listeners in estimating age from sustained vowel productions, although the two listener groups tend to categorize individual speakers similarly. Further, it appears that listener age is a factor in acoustic cues used in making age judgments. When judging age from phonated vowels, elderly listeners appeared to utilize a wider range of acoustic cues than did young listeners. Young listeners focused on voicing information (mean F0 and F0 s.d.) and disregarded resonance changes (lower F1 and F2 frequencies) that occurred with advanced chronological aging (Linville and Fisher, 1985b). In contrast, elderly listeners utilized both resonance and voicing information when both types of information were present in the signal.

Elderly listeners also differed from young listeners in the relative emphasis placed on frequency stability information as a cue to vocal age. The F0 s.d. was the most important predictor of perceived age (Beta = 0.86) for young listeners (Linville and Fisher, 1985b). In contrast, F0 s.d. ranked fourth among the four significant predictors of perceived age for elderly listeners (Beta = 0.20). This finding may be attributable to the amount of hearing loss and/or age-related changes in the auditory system of elderly listeners compared with young listeners. Zurek and Formby (1981) found that hearing-impaired listeners' ability to detect frequency modulation is diminished relative to normal-hearing listeners. They found that the difference limen for frequency (DLF) increases with hearing loss after a certain amount of loss has been exceeded. The DLF is defined as the minimum change in stimulus frequency that can be correctly judged as different from a reference stimulus. For low-frequency stimuli, only about 20-30 dB of hearing loss is necessary to produce a sizable increase in the DLF. Further, Cervellera and Quaranta (1982) found that DLF is larger in presbycusic individuals than in normal-hearing young adults. This could be due to well-documented peripheral and/or central-processing deficits that increase with age (Pickett et al., 1979). Therefore, age and/or hearing loss of these elderly listeners may have inhibited their ability to detect changes in F0 and, therefore, reduced their effectiveness at using F0 s.d. as a cue to age identification.

While decreased sensitivity to F0 modulation may be one explanation for the poorer performance of elderly listeners in response to phonated vowels, it cannot explain their poorer performance in response to whispered vowels. Also, elderly listeners were less reliable than young listeners in both listening conditions. Taken together, these findings suggest that overall listening performance is adversely affected by advanced age. Certainly decreased attention span and/or fatigue cannot be ruled out as contributing factors to elderly listeners' poorer performance, despite efforts to minimize such effects through modifications in the experimental design.

Although elderly listeners differed from young listeners in the emphasis placed on F0 s.d. as a cue to vocal age, both groups agreed with regard to jitter ratio. For both listener groups, jitter ratio was unrelated to age estimates, reinforcing an earlier suggestion that vocal roughness is not a particularly salient cue to vocal age in a sustained vowel task (Linville and Fisher, 1985b).

The two groups of listeners also demonstrated similar results with regard to mean F0. That is, despite the fact that mean F0 was restricted to a 20-Hz range (less than two semitones), age estimates were associated with the little mean F0 variability remaining. The fact that mean F0 is being used as an acoustic cue under these restricted conditions suggests that mean F0 is a powerful cue to perceived age in women's voices.
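The statement that the 20-Hz range spans less than two semitones can be checked with the usual equal-tempered conversion, 12 times the base-2 logarithm of the frequency ratio. The 200-Hz and 220-Hz limits come from the procedure described above; the function name is illustrative.

```python
import math


def interval_in_semitones(f_low, f_high):
    """Size of the interval between two frequencies in equal-tempered semitones."""
    if f_low <= 0 or f_high <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high / f_low)


# Width of the 200-220 Hz matching range used for the phonated vowels.
print(f"{interval_in_semitones(200.0, 220.0):.2f} semitones")
```

The result is roughly 1.65 semitones, in line with the parenthetical remark in the text.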

ACKNOWLEDGMENTS

Thanks are extended to Deb Fabrycki and Katie Bronson for assistance with running subjects and recording data. Further, we are grateful to Judy Kulpa for assistance in locating elderly women to participate in this study. Requests for reprints should be addressed to Sue Ellen Linville, Ph.D., Department of Speech Pathology and Audiology, Marquette University, Milwaukee, WI 53233.

ANSI (1969). "Specification for audiometers," ANSI S3.6-1969 (American National Standards Institute, New York).

Cervellera, G., and Quaranta, A. (1982). "Audiological findings in presbycusis," J. Aud. Res. 22, 161-171.

Hartman, D. (1979). "The perceptual identity and characteristics of aging in normal male adult speakers," J. Commun. Disord. 12, 53-61.

Linville, S. E., and Fisher, H. (1985a). "Acoustic characteristics of women's voices with advancing age," J. Gerontol. 40, 324-330.

Linville, S. E., and Fisher, H. (1985b). "Acoustic characteristics of perceived versus actual vocal age in controlled phonation by adult females," J. Acoust. Soc. Am. 78, 40-48.

Pickett, J., Bergman, M., and Levitt, H. (1979). "Aging and speech understanding," in Sensory Systems and Communication in the Elderly, edited by J. M. Ordy and K. Brizzee (Raven, New York).

Ptacek, P., and Sander, E. (1966). "Age recognition from voice," J. Speech Hear. Res. 9, 353-360.

Ryan, W., and Burk, K. (1974). "Perceptual and acoustic correlates in the speech of males," J. Commun. Disord. 7, 181-192.

Ventry, I., and Weinstein, B. (1983). "Identification of elderly people with hearing problems," ASHA 25, 37-42.

Zurek, P., and Formby, C. (1981). "Frequency discrimination ability of hearing-impaired listeners," J. Speech Hear. Res. 46, 108-112.
