18
Journal of Experimental Psychology: Human Perception and Performance 1984, Vol. 10, No. 3, 340-357 Copyright 1984 by the American Psychological Association, Inc. Are Lexical Decisions a Good Measure of Lexical Access? The Role of Word Frequency in the Neglected Decision Stage David A. Balota and James I. Chumbley University of Massachusetts—Amherst Three experiments investigated the impact of five lexical variables (instance dom- inance, category dominance, word frequency, word length in letters, and word length in syllables) on performance in three different tasks involving word rec- ognition: category verification, lexical decision, and pronunciation. Although the same set of words was used in each task, the relationship of the lexical variables to reaction time varied significantly with the task within which the words were embedded. In particular, the effect of word frequency was minimal in the category verification task, whereas it was significantly larger in the pronunciation task and significantly larger yet in the lexical decision task. It is argued that decision processes having little to do with lexical access accentuate the word-frequency effect in the lexical decision task and that results from this task have questionable value in testing the assumption that word frequency orders the lexicon, thereby affecting time to access the mental lexicon. A simple two-stage model is outlined to account for the role of word frequency and other variables in lexical decision. The model is applied to the results of the reported experiments and some of the most important findings in other studies of lexical decision and pronunciation. Psychologists have long been interested in the processes involved in word recognition. In studying variables that affect the speed of lex- ical access, researchers have relied heavily upon the lexical decision task (LDT). In this task the subject simply determines whether a letter string is or is not a word. When the manip- ulation of a variable causes a corresponding variation in response latency in the LDT, it has usually been assumed that the variable is having an effect on the ease of extracting suf- ficient information from a letter string to rec- ognize it as a word, that is, to access its lexical representation. Obviously, this research tech- nique makes the crucial assumption that lex- This research was partially supported by a National Institute of Mental Health postdoctoral fellowship (5-25007) to the first author and a research grant from the Centronics Corporation to the second author. Both authors contributed equally to this research project. We acknowledge the able assistance of Hayley Arnett throughout this study. We also thank Charles Clifton, Jr., Janet Duchek, Jerome Myers, James Neely, Alexander Pollatsek, and Keith Rayner for comments on an earlier version of the manuscript. David Balota is now at the University of Kentucky. Requests for reprints should be sent to James I. Chum- bley, Department of Psychology, University of Massachu- setts, Amherst, Massachusetts 01003. ical access is the only process in the LDT being affected by the manipulated variable. The re- sults of experiments presented in this article have led us to question the validity of this assumption with respect to one important variable, word frequency. Specifically, we argue that the demand characteristics of the decision process in the LDT may result in an exag- gerated role of word frequency. Finally, we propose a framework incorporating task-spe- cific decision processes, and we use it to in- terpret data from the LDT. This framework aids in understanding how word frequency produces its effect in a number of different experimental situations. The present research was initiated when an experimental result was encountered that was in conflict with the role of word frequency in three currently dominant models of word rec- ognition: Morton's (1969, 1970, 1982) classic logogen model; Becker's (1976, 1979, 1980; Becker & Killion, 1977) verification model; and Forster's (1976, 1979) "bin" model. Each of these models places major emphasis on the role of word frequency in lexical access. The basic finding addressed by these models is that high-frequency words are recognized more quickly than low-frequency words. Full ex- position of these models is provided in the 340

Copyright 1984 by the 1984, Vol. 10, No. 3, 340-357 Are ... · 1984, Vol. 10, No. 3, ... (instance dom-inance, category dominance, ... (1969) instance dominance and a measure of category

  • Upload
    danganh

  • View
    213

  • Download
    0

Embed Size (px)

Citation preview

Journal of Experimental Psychology:Human Perception and Performance1984, Vol. 10, No. 3, 340-357

Copyright 1984 by theAmerican Psychological Association, Inc.

Are Lexical Decisions a Good Measure of Lexical Access?The Role of Word Frequency in the Neglected Decision Stage

David A. Balota and James I. ChumbleyUniversity of Massachusetts—Amherst

Three experiments investigated the impact of five lexical variables (instance dom-inance, category dominance, word frequency, word length in letters, and wordlength in syllables) on performance in three different tasks involving word rec-ognition: category verification, lexical decision, and pronunciation. Although thesame set of words was used in each task, the relationship of the lexical variablesto reaction time varied significantly with the task within which the words wereembedded. In particular, the effect of word frequency was minimal in the categoryverification task, whereas it was significantly larger in the pronunciation task andsignificantly larger yet in the lexical decision task. It is argued that decision processeshaving little to do with lexical access accentuate the word-frequency effect in thelexical decision task and that results from this task have questionable value intesting the assumption that word frequency orders the lexicon, thereby affectingtime to access the mental lexicon. A simple two-stage model is outlined to accountfor the role of word frequency and other variables in lexical decision. The modelis applied to the results of the reported experiments and some of the most importantfindings in other studies of lexical decision and pronunciation.

Psychologists have long been interested inthe processes involved in word recognition. Instudying variables that affect the speed of lex-ical access, researchers have relied heavily uponthe lexical decision task (LDT). In this taskthe subject simply determines whether a letterstring is or is not a word. When the manip-ulation of a variable causes a correspondingvariation in response latency in the LDT, ithas usually been assumed that the variable ishaving an effect on the ease of extracting suf-ficient information from a letter string to rec-ognize it as a word, that is, to access its lexicalrepresentation. Obviously, this research tech-nique makes the crucial assumption that lex-

This research was partially supported by a NationalInstitute of Mental Health postdoctoral fellowship(5-25007) to the first author and a research grant fromthe Centronics Corporation to the second author. Bothauthors contributed equally to this research project.

We acknowledge the able assistance of Hayley Arnettthroughout this study. We also thank Charles Clifton, Jr.,Janet Duchek, Jerome Myers, James Neely, AlexanderPollatsek, and Keith Rayner for comments on an earlierversion of the manuscript. David Balota is now at theUniversity of Kentucky.

Requests for reprints should be sent to James I. Chum-bley, Department of Psychology, University of Massachu-setts, Amherst, Massachusetts 01003.

ical access is the only process in the LDT beingaffected by the manipulated variable. The re-sults of experiments presented in this articlehave led us to question the validity of thisassumption with respect to one importantvariable, word frequency. Specifically, we arguethat the demand characteristics of the decisionprocess in the LDT may result in an exag-gerated role of word frequency. Finally, wepropose a framework incorporating task-spe-cific decision processes, and we use it to in-terpret data from the LDT. This frameworkaids in understanding how word frequencyproduces its effect in a number of differentexperimental situations.

The present research was initiated when anexperimental result was encountered that wasin conflict with the role of word frequency inthree currently dominant models of word rec-ognition: Morton's (1969, 1970, 1982) classiclogogen model; Becker's (1976, 1979, 1980;Becker & Killion, 1977) verification model;and Forster's (1976, 1979) "bin" model. Eachof these models places major emphasis on therole of word frequency in lexical access. Thebasic finding addressed by these models is thathigh-frequency words are recognized morequickly than low-frequency words. Full ex-position of these models is provided in the

340

LEXICAL DECISION AND LEXICAL ACCESS 341

earlier references, but there are three importantassumptions common to all three models thatare relevant to the present research: (a) Lexicalaccess involves some matching of the featuresextracted from the stimulus to an internal rep-resentation of words; (b) word frequency de-termines the availability of lexical represen-tations either by ordering them or by affectingtheir thresholds; (c) higher order semantic in-formation for a word presented in isolationbecomes available only after lexical access hastaken place. These assumptions are importantbecause the present research raises seriousquestions about whether the results from theLDT, the principal task used to investigatelexical access, can be unequivocally used assupport for the assumptions.

The three assumptions common to thesemodels are intended as a general character-ization of the process of lexical access and, assuch, should not be task dependent in theirapplicability. All tasks that involve lexical ac-cess should reflect these basic assumptions.Recently, we unexpectedly found that therewas virtually no effect of word frequency ina task that should involve lexical access. Thetask was a simple category-exemplar verifi-cation task in which a category name (e.g.,bird) was first presented and then was followed800 ms later by an exemplar from that category(e.g., robin) or from a different category (e.g.,sofa). The subject's task was to make a yes-no judgment about the validity of the category-exemplar relationship being presented. Al-though the results of this study yielded an in-teresting pattern of effects of instance domi-nance (likelihood of producing the exemplargiven the category name) and category dom-inance (availability of the category name giventhe exemplar) on verification time, there wasvirtually no influence of word frequency oneither trials in which a yes response was corrector, more important, on trials in which a noresponse was correct. The data from the yestrials are of marginal interest because onemight expect a reduced effect of word fre-quency on these trials since the category namemay have semantically primed the exemplarsof that category, and this priming may havediminished any frequency effect. In fact,Becker (1979) has recently reported such aninteraction between prime relatedness andword frequency in an LDT.1 On the other

hand, the lack of a word-frequency effect inthe category-exemplar no response data isconsiderably more difficult to dismiss. Onthese trials, the category name could not havesemantically primed an exemplar from an-other category. Therefore, the subject had toaccess the lexisal entry for the exemplar andonly then extract sufficient semantic infor-mation to make a decision. Because word fre-quency presumably influences the lexical ac-cess process, it clearly should have had an effecton the trials in which a no response was re-quired.

In pursuing this theoretically discrepantfinding, we decided to replicate the category-verification study and increase the power (bydoubling the number of subjects) to detect aword-frequency effect. The approach takenwas to simultaneously consider a number ofvariables, some that should influence lexicalaccess and others that should not. Therefore,word frequency and the length of the word inletters and in syllables, (variables from the for-mer class) were considered along with Battigand Montague's (1969) instance dominanceand a measure of category dominance basedon category selectiori time. Because we wereinterested in the unique effect of word fre-quency above and beyond the influence ofthese other variables, we utilized a multipleregression analysis. Through this approach,we could use a relatively large set of wordswithout restricting our selection of items tothose that orthogonally vary on each of thevariables of interest. Attempts to orthogonallymanipulate several lexical and semantic vari-ables simultaneously could lead to a set ofstimuli containing unusual items that are un-representative of the general population ofwords. Consider, for example, the difficulties

1 It is worth noting here that even on a category-exemplaryes trial, one should expect some, albeit reduced, frequencyeffect. That is, in the Chumbley (1984) study half of thewords had high instance dominance ratings and half hadlow instance dominance ratings, as measured by Battigand Montague (1969). Because it seems unlikely that acategory name will sufficiently prime the items with lowinstance dominance (e.g., furniture-mirror) to completelyoverride the word-frequency effect, one should simply finda reduced effect. This is especially the case when one con-siders that in the Becker (1979) study a 49-ms word-fre-quency effect was still found in his LDT for the relatedtargets that followed highly associated primes.

342 DAVID A. BALOTA AND JAMES I. CHUMBLEY

Table 1Intercorrelations, Means, and Standard Deviations for the Predictor Variables

Predictor variable 1

1. IDOM2. LFREQ3. CDOM4. LENG'5. SYLL'

MSDTolerance

1.00.45.33.30.21

144.61153.07

.68

1.00.05.46.36

51.276.76

.68

1.00.12.16

-962.22177.72

.82

1.00.70

-5.741.57.45

1.00-1.65

0.63.50

Note. IDOM = instance dominance; LFREQ = log word frequency; CDOM = category dominance; LENG' - reflectednumber of letters; SYLL' = reflected number of syllables. The CDOM, LENG', and SYLL' are reflected values obtainedby multiplying by —1. These reflections were performed to place" these predictors in the same relationship to RT asIDOM and LFREQ, that is, the larger the value of the lexical variable, the shorter the expected RT. Tolerance = 1 - thesquare of the multiple correlation of one predictor variable with the remaining predictor variables.

one would encounter in finding a high-fre-quency word that has low category dominance,high instance dominance, nine letters, and onlyone syllable. Faced with such a trade-off, weelected to use multiple regression techniquesto examine the role of these variables.

Experiment 1

Method

Subjects. Twenty undergraduate students recruitedfrom the subject pool at the University of Massachusetts,Amherst, participated in partial fulfillment of course re-quirements. No subject participated in more than one ofthe present experiments.

Apparatus. The experiment was controlled by a NorthStar Horizon computer. Stimulus-words were displayed inuppercase letters on a television monitor driven by anIMSAI memory-mapped video raster generator. In orderto increase legibility, stimuli were presented with a singlespace between letters. Subjects were seated approximately50 cm from the video monitor. A three-letter word (threeletters separated by two spaces) occupied a visual angleof approximately 1.1°, whereas a nine-letter word occupieda visual angle of approximately 3.7°. Reaction time (RT)and interval timing were both measured with millisecond(ms) accuracy via the computer. The same apparatus wasalso used in Experiments 2 and 3.

Materials. A total of 72 target words was selected fromRosch (1975) for use in this study. They consisted of eightexemplars from each of nine different categories. Eachcategory was represented by four high-typical and fourlow-typical exemplars. (The complete list of target wordsalong with each word's mean reaction times and percentageof error rates for all the experiments is available from theauthors.) A total of 50 buffer/practice words was selectedwith 5 instances from each of 10 categories in the Battigand Montague (1969) norms. These buffer/practice wordsapproximately matched the targets in both length and syl-lables and were not members of any of the target categories.

The category dominance measure was obtained from

an independent group of 20 subjects from the same subjectpool. Subjects were first given a list of the nine categorynames to memorize. They were then given repeated trialson which an exemplar of one of the nine categories waspresented, and the'task was to say aloud the name of thecategory to which the exemplar belonged. The sound ofthe subject's saying the category name triggered a voicekey. The vocal RT is the category selection time (Sanford& Seymour, 1974) for the exemplar. Although full detailswill be reported in a future report (Chumbley, 1984), itis noteworthy that subjects were well practiced with thecategory names, and each exemplar was tested severaltimes.

Table 1 presents the correlation matrix, mean values,and standard deviation of these values for each of thelexical variables used in the studies reported here. Eachmeasure is based on the values for the 72 target words.Instance dominance (IDOM) was measured by using theBattig and Montague (1969) norms. Based on observationsmade by Whaley (1978), log word frequency (LFREQ) wasused instead of raw frequency (/), determined from theKucera and Francis (1967) norms, and the following func-tion: LFREQ = 40 + 10 log(/"+ 1). Category dominance(CDOM) is simply the category selection time for each wordmultiplied by —1.

The tolerance values given in Table 1 are measures ofthe intercorrelation of the predictor variables. They rep-resent the extent to which a predictor variable is simplya linear combination of the other predictor variables en-tered into the regression. The maximum tolerance valuepossible is 1.0 (totally orthogonal predictor), and the min-imum tolerance value is 0 (totally predicted by the othervariables). As can be seen, each predictor variable, withthe exceptions of length and syllables, has a relatively hightolerance with the other variables. The tolerance valuesof length and syllables are reduced primarily because oftheir high (.70) intercorrelation.

Procedure. None of the subjects knew in advance whichcategories would be used in the experiment. Each subjectwas presented two blocks of 30 practice trials before beingtested with the target words. Thus, subjects experienced60 practice trials using the 10 buffer/practice nontargetcategories before responding to words from any of the

LEXICAL DECISION AND LEXICAL ACCESS 343

target categories. Following the two blocks of practice trials,eight blocks of test trials were presented. Each block had36 test trials that followed 5 buffer trials at the beginningof each block. The eight test blocks were divided into twocycles. Each of the 72 target words was presented twicein a cycle, once with the appropriate category name andonce with one of the other eight target category names.These 144 conditions were randomly ordered within eachcycle.

On each trial the following sequence of events occurred:(a) a 500-tfz warning tone presented for 250 ms; (b) a250 ms-interstimulus interval; (e) a category name pre-sented for 800 ms; (d) an exemplar presented below thecategory name until the subject responded by pulling oneof two levers; (e) either a 2-s intertrial interval or an errormessage that was terminated by the subject's pressing athird button (placed between the response levers) that wasthen followed by the 2-s interval before the start of thenext trial.

All subjects were tested individually in a sound-deadenedroom. Two response levers, one designated yes and theother no, were placed so that the subject's fingers couldcomfortably rest against them. Subjects were instructedto decide as quickly as possible whether each exemplarwas a member of the designated category while maintainingan accuracy level of at least 95% correct. Feedback re-garding both RT and accuracy was given after each blockof trials. There was a 10-s mandatory rest between blocksfollowed by a signal that the subject could continue theexperiment by a button press when ready. Ten subjectsused their dominant hand to respond yes, and 10 usedtheir nondominant hand to respond yes.

Results

There were four sets of data for each subjectdefined by the factorial combination of Re-sponse (yes or no) X Cycle (Test Blocks 1-4or Test Blocks 5-8). Each subject's data foreach condition were scored for correctness andfor outliers in RT. Outliers were denned asbeing either (a) less than 200 ms or (b) longerthan 2 s and also more than three standarddeviations above the subject's mean for thecondition. Errors and outliers were replacedwith the subject's mean correct response RTfor that condition. The mean percentage oferror rate per subject (across conditions) was4.48, and the mean percentage of outlier ratewas 0.12.

A full multiple regression analysis was con-ducted on the data for each response and eachcycle. This procedure parallels that used byother workers investigating category verifica-tion (Anderson & Reder, 1974; Loftus &Suppes, 1972) and lexical decision (Whaley,1978). The fujl analysis was used instead of astepwise analysis because we had identified inadvance which variables were of theoretical

importance, and we were not primarily in-terested in comparing the relative importanceof the variables within a task. Our concernwas in comparing the relative importance ofa given variable in one task to its importancein other tasks. The mean RT across subjectsfor each word was determined, and thesemeans were the criterion variable in theregression analysis. The five predictor variableswere IDOM, LFREQ, CDOM, LENG' (length inletters), and SYLL' (number of syllables).

When predictor variables have widely dif-ferent ranges, as is the case here, the actualvalue of the regression coefficient (Beta) is notvery helpful in evaluating the size of the effectproduced by a predictor variable. For this rea-son, we have chosen to present what will bereferred to as a semistandardized regressioncoefficient. These coefficients are simply theproduct of Betd and the standard deviationof the criterion variable. The correlations, Fs,and partial correlations are unchanged by thisprocedure. The semistandardized regressioncoefficient indicates the change in RT withone standard deviation unit change in the pre-dictor variable. Table 2 presents the semistan-dardized regression coefficients and the rawand partial correlations with RT for each pre-dictor variable.

Through the use of the semistandardizedregression coefficients, one can compare thesize of the effects of the predictor variables inthis experiment with those in other studies.Approximately 95% of the words had predictorvariable values within two standard deviationunits of either side of the mean. Multiplyingthe semistandardized regression coefficient by4 yields the approximate change in RT pro-duced by changing the value of a predictorvariable from its lowest value to its highestvalue. Thus, LFREQ produced only a small 24-ms effect for yes responses in Cycle 1, whereasIDOM produced a large 160-ms effect, andCDOM produced an even larger 200-ms effect.It should be noted that these are unique effectsover and above the effects jointly producedwith the other variables.

The results of the multiple regression anal-yses indicated that only IDOM, CDOM, andLENG' had any significant unique relationshipto RT in category verification. For the yes re-sponses, the regression coefficients for IDOMwere highly significant in both Cycle 1, F(l,

344 DAVID A. BALOTA AND JAMES I. CHUMBLEY

66) = 19.72, p < .001, and Cycle 2, F( 1,66) =14.45, p< .001. Similarly, the CDOM regressioncoefficients were highly significant for yes re-sponses in both Cycle 1 and Cycle 2, F(\,66) = 47.01, p < .001, and F(l, 66) = 37.03,p < .001, respectively. For no responses, LENG'in Cycle 1, F(l, 66) = 4.53,p< .05, and IDOMin Cycle 2, F(l, 66) = 4.76, p < .05, hadsignificant effects. The unique influence ofLFREQ did not approach significance for eitherthe yes responses (both .Fs < 1.10) or for theno responses (both Fs < 2.24). None of theremaining predictor variables yielded signifi-cant regression coefficients. The value of R2

for each of the regression analyses was signif-icant, all Fs(5, 66) > 2.85, ps < .05. Thesevalues were .66 for yes responses, Cycle 1; .59for yes responses, Cycle 2; .22 for no responses,Cycle 1; and .18 for no responses, Cycle 2.

Each exemplar was actually presented twiceduring each cycle, once when a yes responsewas appropriate and once when a no responsewas appropriate. We were concerned that this

might influence the size of the predictor vari-able effects, especially that of LFREQ. Regres-sion analyses were conducted for the first pre-sentation of an exemplar in either the yes orthe no correct response condition of Cycle 1.The results of these analyses were quite clear.The unique effect of LFREQ was not significanton the very first presentation of an item aseither a yes, F(l, 66) < 1, or a no, F(\, 66) =1.98. This lack of a significant unique effectof LFREQ cannot be attributed to a lack ofpower because the pattern of significant effectswas exactly the same as that displayed in Table2 for both yes and no responses. Thus, it seemsfair to conclude that word repetition within acycle is not diluting the LFREQ effect.

Discussion

The results from Experiment 1 were quitestraightforward. For the category-exemplar yestrials, there were large unique effects of IDOMand CDOM but not LFREQ. On the other hand,

Table 2Regression Coefficients, Raw Correlations, and Partial Correlations for the CategoryVerification Experiment

Cycle

Predictor variable

IDOM LFREQ CDOM LENG' SYLL'

Regression coefficientrPartial r2

-37.62-.64

.23

Yes responses

-6.07-.32

.01

-52.62-.66

.42

No responses

-16.45-.21

.04

6.09-.07

.01

Regression coefficientr

*• Partial r2 .

-25.95-.60

.18

-7.11-.31

.02

-37.64-.65

.36

-4.71-.14

.00

2.25-.04

.00

Regression coefficientrPartial r2

2Regression coefficientrPartial r2

-4.81-.30

.01

-9.42-.39

.07

-10.53-.36

.03

-2.95-.27

.01

-7.77-.17

.02

-2.67-.17

.01

-18.36-.35

.06

-0.09-.20

.00

7.29-.17

.01

-3.58-.20

,01

Note. IDOM = instance dominance; LFREQ = log word frequency; CDOM = category dominance; LENG' = reflectednumber of letters; SYLL' = reflected number of syllables. The mean RT for yes responses, Cycle 1, was 662.96 ms(SD = 96.82 ms; for yes responses, Cycle 2, it was 606.61 ms (SD = 71.56 ms); for no responses, Cycle 1, it was693.33 ms (SD = 53.29 ms); and for no responses, Cycle 2, it was 628.63 ms (SD = 31.80 ms).

LEXICAL DECISION AND LEXICAL ACCESS 345

for the category-exemplar no trials there wereonly small effects of LENG' during Cycle 1 andIDOM during Cycle 2. These results suggestthat in verifying that an exemplar is a memberof a category, the subject may access the cat-egory name (as evidenced by the strong effectof CDOM on the yes responses). For the presentpurposes, however, the more important findingis that there was little, if any, unique impactof word frequency on either the category ver-ification yes or no trials. As noted earlier, thelack of a robust frequency effect for the notrials is particularly perplexing because it isunclear, within the currently available modelsof lexical access, why word frequency shouldnot produce an independent influence on amisprimed category-verification no decision(e.g., bird-sofa), a decision that must certainlyrequire lexical access. In this light, it is note-worthy that Anderson and Reder (1974) andMillward, Rice, and Corbett (1975) have alsoreported failures to find an impact of instancefrequency on instance-category and category-instance no decision trials, respectively.

Before considering the implications of theseresults in more detail, a number of alternativeexplanations must be considered. One alter-native explanation could be that there was onlya very small frequency effect in the presentcategory-verification task because there weremultiple exemplars from the nine target cat-egories, and some sort of implicit semanticpriming could account for the lack of a uniquefrequency effect. Five converging lines of ar-gument reduce the plausibility of this expla-nation. First, it is unclear how such primingcould be effective on no response trials whenthe exemplar is being inappropriately primedby the name of another category. Second, therewas little evidence that the frequency effectwas disappearing across trials as subjects be-came more and more familiar with the cate-gories. Third, prior to the first cycle, subjectsshould have actually been misprimed by the10 nontarget buffer/practice categories becausethey received 65 buffer/practice trials usingexemplars from these categories before theyreceived any exemplars from the target cate-gories. These items were also used in the 5buffer trials at the beginning of each test block.Fourth, it will be seen in Experiment 3 thatwhen the number of exemplars from each cat-egory was doubled, there was virtually no im-

pact on the obtained frequency effect. Thissuggests that increasing the emphasis on thetarget categories did not appreciably influencethe word-frequency effect, at least not in apronunciation task. Finally, Becker (1979)found a significant frequency effect (49 ms)in a lexical decision task, albeit reduced incomparison to an unrelated priming condition,when each word was primed on a given trialby a high associate. It is unlikely that implicitpriming for misdirected category-exemplar notrials could produce more semantic primingthan that produced by highly related associ-ates.

Although the implicit priming account ofthe results of the first experiment is inadequate,there are two simpler accounts that must beaddressed. First, the range of word frequencyfor the words utilized in Experiment 1 maynot have been sufficiently large to produce arobust frequency effect. The words were orig-inally selected so that college sophomoreswould have little difficulty knowing the mean-ing of even the lowest frequency words usedin the current studies. It is possible that partof the word-frequency effect reported in theliterature can be attributed to subjects' notknowing the meaning of a particular wordrather than the frequency of occurrence ofthat word in print. For example, in one studyinvestigating word-frequency effects (Freder-iksen & Kroll, 1976) error rates in excess of40% were reported for the low-frequency tar-gets in an LDT where chance performanceis 50%.

A second possibility is that the word-fre-quency effect found in previous studies wasactually an effect of'other variables, such as,IDOM and LENG', which covary with word fre-quency. Because our major interest was in theunique effect that could be unequivocally at-tributed to word frequency, we used theregression analysis to remove the joint effectsof other variables. It is possible, therefore, thatthere is very little unique effect of word fre-quency on lexical access above and beyondthese covarying variables.

The most obvious way to test these possi-bilities is to conduct a lexical decision exper-iment with the same set of words and the sameset of predictor variables. This was accom-plished in Experiment 2. If the lack of a largeunique frequency effect in Experiment 1 was

346 DAVID A. BALOTA AND JAMES I. CHUMBLEY

Table 3Regression Coefficients, Raw Correlations, and Partial Correlations for Word Responsesin the Lexical Decision Experiment

Predictor variable

Cycle

1Regression coefficientrPartial r2

2Regression coefficientrPartial r2 ,-

IDOM

-10.29-.53

.05

-10.01-.54

.07

LFREQ

-24.29-.66

.24

-19.59-.63

.22

CDOM

-10.66-.25

.07

-7.76-.26

.05

LENG'

-19.46-.51

.12

-11.99-.44

.06

SYLU

6.09-.31

.01

4.71-.26

.01

Note. IDOM = instance dominance; LFREQ = log word frequency; CDOM = category dominance; LENG' = reflectednumber of letters; SYLL' = reflected number of syllables. The mean RT for Cycle 1 responses was 570.82 ms (SD =55.04 ms), and for Cycle 2 responses it was 543.22 ms (SD = 44.77 ms).

simply due to the fact that the words or analysisprocedures somehow did not allow a frequencyeffect to be demonstrated, then frequencyshould have little impact in the following lex-ical decision experiment.

Experiment 2

Method

Subjects. Twenty undergraduate students were recruitedfrom the same pool described in Experiment 1.

Materials. The words in Experiment 1 were used astargets in Experiment 2. An additional 122 words werechosen from the Battig and Montague (1969) norms. Thesewords were from the same categories as the target items(8 from each of the 9 target categories and 5 from eachof the 10 buffer/practice categories). Pronounceable non-words were produced by changing up to three letters withinthese words (e.g., fishing was changed tofisleng). The meanlength of the nonwords (5.76) closely matched the meanlength of the target words (5.74).

v Procedure. The procedure in Experiment 2 was thesame as that in Experiment 1 except that now the subject'stask was to make a lexical decision about the exemplaror nonword, and these stimuli were not preceded by thecategory name. Subjects indicated their decision by pullingone of the two response levers. All target words and theirnonword counterparts occurred once during the Test Blocks1-4 (Cycle 1) and once during the Test Blocks 5-8 (Cycle2). Prior to the presentation of the test blocks, subjectswere presented two practice blocks using the practice/buffer items. The subjects were not informed about thecategorical structure of the stimuli. Across subjects, dom-inant and nondominant hands were balanced across wordand nonword responses.

Results

Regression analyses, as described above,were performed on word RT (corrected for

errors and outliers and then averaged acrosssubjects as in Experiment 1) with the five pre-dictor variables previously used. The resultsof these analyses for Cycles 1 and 2 may beseen in Table 3. The mean percentage of errorrate was 3.89, and the mean percentage ofoutlier rate was 1.25.

The results of Experiment 2 are strikinglydifferent from those of Experiment 1. Theregression coefficients for LFREQ are verylarge both in absolute terms and in comparisonto those found in category verification. LENG'had a fairly large effect in Cycle 1 but had areduced effect in Cycle 2. CDOM and IDOMproduced relatively small regression coeffi-cients compared to their large effects on yesRT in Experiment 1. Multiplying the semi-standardized regression coefficients by 4 pro-duces approximately a 100-ms unique effectof LFREQ in lexical decision, Cycle 1, but only40-ms effects for IDOM and CDOM.

The results of the multiple regression anal-yses supported the above observations. TheLFREQ regression coefficient was highly sig-nificant in both Cycle 1, F( 1,66) = 21.05, p <.001, and Cycle 2, F(l, 66) = 18.31, p < .001.The regression coefficient for LENG' was sig-nificant in both Cycle 1, F(l, 66) = 9.00, p <.01, and Cycle 2, F(l, 66) = 4.57, p < .05. Ofthe remaining variables, only the regressioncoefficients for CDOM in Cycle 1 and IDOM inCycle 2 were significant, F(l, 66) = 4.92, p <.05, and F(l, 66) = 4.75,p < .05, respectively.The values of R2 were highly significant forboth Cycle 1 and Cycle 2, both Fs(5, 66) >

LEXICAL DECISION AND LEXICAL ACCESS 347

14.98, p < .01. R2 for Cycle 1 was .59 and forCycle 2 it was .53.

Discussion

The results of the second experiment wereagain quite clear and differed dramaticallyfrom the results of the first experiment. Thesame set of words and predictor variables wereused in both experiments. Results of the sec-ond experiment yielded large effects of LFREQover and above the effects of the other variables,whereas Experiment 1 yielded little evidenceof a unique frequency effect in category ver-ification. These contrasting results are incon-sistent with explanations of the results of Ex-periment 1 that attribute the small word-fre-quency effect to properties of the words or ofthe regression analysis technique.

Before we describe a possible account of thediffering results of the first and second exper- viments, there is another noteworthy result ofthe second experiment. It was found that bothCDOM and IDOM significantly predicted lexicaldecision performance over and above theirshared predictiveness with LFREQ, LENG', andSYLL'. This finding is important because it hastypically been argued that semantic infor-mation becomes available only after lexicalaccess. In this light, the results of the secondexperiment indicate that either this argumentis incorrect or that some other component oflexical decision is sensitive to meaning vari-ables. Other investigators have reported similarfindings. James (1975), Whaley (1978), andChumbley and Balota (1984) have reportedeffects of semantic variables in lexical decisionperformance. Thus, there is mounting evi-dence that the LOT is not a good tool forstudying a lexical access process that is pre-sumed to be unaffected by semantic variables.

Recently, a number of theorists (Forster,1979; Theios & Muise, 1977; West & Stan-ovich, 1982) have suggested that the LOT in-volves postrecognition processing that may in-fluence performance. These theorists furthersuggest that when one considers contextual ef-fects, the pronunciation task may be a betterreflection of pure lexical access (see, however,Coltheart, Davelaar, Jonasson, & Besner,1977). In light of this possibility, it was decidedto use the set of materials and predictor vari-ables from the previous two experiments in a

pronunciation task. Very simply, if the se-mantic effects found in the second experimentreflected postaccess semantic influences on thedecision process, then one might expect theseeffects to be eliminated in a pronunciationtask.

In Experiment 3 we also attempted to ad-dress whether the semantic effects in the secondexperiment may have been due to a type ofimplicit priming. That is, across trials, the cat-egories and their respective names may havebecome sufficiently activated so that subjectswere using semantic information in makingtheir lexical decisions. A second possibility isthat, across trials, items from the same se-mantic category may have occurred on ad-jacent trials, thereby producing some intertrialsemantic priming. In an attempt to addressthese possibilities, and because we were notinterested in the pronunciation of nonworditems, we replaced the nonwords used in thesecond experiment with either the word blankor eight other exemplars from each of the ninesemantic categories. If the semantic effectsfound in the LOT were simply due to the load-ing of the semantic categories, then we shouldfind larger semantic effects in the conditionwhere there are twice as many exemplars fromeach of the categories.

Experiment 3

MethodSubjects. Forty undergraduate students were recruited

from the same subject pool described earlier. Twenty sub-jects participated in the word/word condition and 20 inthe word/blank condition.

Materials. The words from ,the earlier experimentswere again used in Experiment 3. For the word/word con-dition, the nonwords used in Experiment 2 were replacedby the words from which those nonwords were derived(e.&,fisleng was replaced by fishing). For the word/blankcondition, the word blank was simply presented on halfthe trials. Thus, on test trials in which a nonword waspresented during Experiment 2, either a word from oneof the target categories was presented or the word blankwas presented.

Procedure. In this experiment, subjects were simplyasked to pronounce each word aloud. A voicekey connectedto the computer detected onset of their pronunciation andmeasured latencies to the nearest millisecond. When apronunciation was detected, the message "Response OK?"immediately replaced the word. Subjects were instructedto pull the right lever if they felt that their correct pro-nunciation of the word triggered the computer and to pullthe left lever if they incorrectly pronounced the word orif some other auditory sound (such as a cough) triggered

348 DAVID A. BALOTA AND JAMES I. CHUMBLEY

the computer. If the OK lever was pulled, there was a2-s interval before the warning tone was presented to beginthe next trial. If the error lever was pulled, the subject hadto press the third button when ready to begin the 2-sintertrial interval.

Results

Regression analyses were performed as be-fore on the data for each cycle of each group.The mean percentage of error rates for theword/word and word/blank groups were 1.46and 1.42, respectively, whereas the corre-sponding mean percentage of outlier rates were0.8 and 1.01.

The regression coefficients for LFREQ seenin Table 4 appear to be smaller than thosefound for lexical decision (see Table 3). It seemsthere is a smaller unique frequency effect inthe pronunciation tasks (about 50 ms) thanin the LDT (about 100 ms). Andrews (1982)and Frederiksen and Kroll (1976) have foundsimilar differences between the LDT and thepronunciation task. The unique effect ofLFREQ on pronunciation, however, still is larger

than its 24-ms effect on category verification.LENG' and CDOM have about the same effectin lexical decision and pronunciation. Theregression coefficients for IDOM are noticeablysmaller for the pronunciation tasks than foreither the LDT or category verification task.

The results of the multiple regression anal-yses indicated that the regression coefficientfor LFREQ was highly significant in the word/word conditions for Cycle 1, F(\, 66) = 14.86,p < .001, and Cycle 2, F(\, 66) = 14.23, p <.001, and the word/blank conditions for Cycle1, F(\, 66) = 13.27, p < .001, and Cycle 2,F(1,66)= 10.27, p < .01. Similarly, the LENG'regression coefficients were highly significantin the word/word conditions for Cycle 1, F(l,66) = 21.64, p < .001, and for Cycle 2, F(l,66) = 18.19, p < .001, and in the word/blankcondition for both Cycle 1, F(l, 66) = 6.08,p < .05, and Cycle 2, F(l, 66) = 23.68, p <.001. The only remaining coefficients to reachsignificance in the multiple regression analyseswere those for CDOM for the word/word con-dition, Cycle 2,F(l, 66) = 8.47, p < .01, and

Table 4Regression Coefficients, Raw Correlations, and Partial Correlations for the Pronunciation Experiment

Predictor variable

Cycle

1Regression coefficientrPartial r2

2Regression coefficientrPartial r2

IDOM

-4.36-.45

.01

2.11-.34

.01

LFREQ

Word/word condition

-16.49-.63

.18

-11.45-.60

.18

CDOM

-6.78-.15

.04

-8.01-.17

.11

LENG'

-24.84-.65

.25

-15.86-.66

.22

SYLL'

4.33-.41

.01

-2.51-.50

.01

Word/blank condition

Regression coefficientrPartial r2

2Regression coefficientrPartial r2

-2.55-.40

.01

2.38-.31

.01

-15.66-.60

.17

-10.23-.54

.13

-6.16-.13

.04

-7.54-.17

.09

-12.99-.57

.08

-19.04-.63

.26

-4.75-.46

.01

3.28-.38

.01

Note. IDOM = instance dominance; LFREQ = log word frequency; CDOM = category dominance; LENG' = reflectednumber of letters; SYLL' = reflected number of syllables. The mean RT for the word/word condition, Cycle 1 was492.05 ms (SD = 46.21 ms); for word/word, Cycle 2, it was 457.08 ms (SD = 32.09 ms); for word/blank, Cycle 1,it was 485.66 ms (SD = 41.08 ms); and for word/blank, Cycle 1, it was 461.08 ms (SD = 31.42 ms).

LEXICAL DECISION AND LEXICAL ACCESS 349

for the word/blank condition, Cycle 2, f{l,. 66) = 6.76, p < .05. The R2 values for eachcondition were highly significant, all Fs(5,66) > 13.74, all ps < .001. The values were.60 for word/word, Cycles 1 and 2; .51 forword/blank, Cycle 1; and .54 for word/blank,Cycle 2.

Discussion

The results of the third experiment indicatedthat the pattern of data for the word/word andthe word/blank conditions was virtually iden-tical. That is, in both conditions the length ofthe word and its frequency were strong pre-dictors of pronunciation latencies over andabove their joint effect with the other variables.Furthermore, the only impact of a semanticvariable was the effect of CDOM during Cycle2 and, interestingly, this factor had approxi-mately the same size effect for both the word/word and word/blank conditions. Thus, thesemantic effects being produced in Experi-ments 2 and 3 do not appear to be due simplyto exemplars from the same semantic categorybeing presented.

One possible reason that CDOM predictedpronunciation latency is that both the categoryselection and the pronunciation tasks involvethe same lexical access process, that is, onehas to recognize a word to produce its categoryname. Of course, this could also account forthe CDOM effect found in the results of thelexical decision experiment. In the lexical de-cision experiment, however, IDOM also signif-icantly predicted performance. In this light itis noteworthy that IDOM did not predict pro-nunciation performance in either Cycle 1 orCycle 2 (both Fs < 1). Thus, the IDOM effectfound in the LDT may reflect a postaceessinfluence that is not reflected in pronunciationperformance, consistent with the recent viewsregarding the LDT cited in the introductionto Experiment 3.

It is also important that the LFREQ effectwas the same in the word/word and word/blank conditions. As noted earlier, if implicitsemantic priming is the reason for the absenceof a LFREQ effect in Experiment 1, there shouldhave been a decrease in the LFREQ effect inthe word/word conditions because each se-mantic category was represented by twice asmany words as in the word/blank condition.

In the discussions above, comparisons ofthe sizes of the regression coefficients acrosstasks were made without reference to the sta-tistical reliability of these differences. We nowpresent statistical confirmation of these dif-ferences.

Overall Analysis Section

The analysis procedure presented hereadopts a different perspective in analyzing thedata. A regression analysis was conducted onthe data for each subject. The average regres-sion coefficient for each predictor variable fora given task was then computed, and the vari-ability of the regression coefficient across sub-jects within a task was used to test whetherdifferent tasks requiring lexical access involvedsimilar effects of word frequency. The appro-priate test simply involved conducting ananalysis of variance on the regression coeffi-cients for subjects performing different tasks.Because each experiment used the same setof words and the same set of predictor vari-ables, the intercorrelations among the variableswas a constant, and the task was the equivalentof a variable manipulated orthogonally to thesepredictor variables. Therefore, any change ina regression coefficient from one task to an-other must be due to a change in the way thepredictor variable is related to RT in the tasks.

Because response was a repeated measuresfactor in the category verification task, twosets of analyses of variance were performed.One set incorporated the individual regressioncoefficients from the yes response condition,and the other used the no response data. Eachset of analyses included an analysis for eachof the five predictor variables. The results ofthe analyses for SYLL' will not be reportedbecause, as in all of the analyses reported ear-lier, it did not produce any significant uniqueeffects. The mean regression coefficients fromthese eight analyses are presented in Table 5.The means presented are averaged across cy-cles because none of the interactions of Task XCycle reached significance, all Fs(3,76) < 2.30.

The most important finding displayed inTable 5 is that the effect of LFREQ varied sig-nificantly across the different tasks with .F(3,76) = 4.97, p < .01, standard error of themean (SEM) = 2.82, for the yes analysis, and

350 DAVID A. BALOTA AND JAMES I. CHUMBLEY

Table 5Mean Regression Coefficients Averaged AcrossCycles From the Individual SubjectRegression Analyses

Predictor variable

Condition IDOM LFREQ CDOM LENG'

-10.15 -21.94 -9.22 -15.72

-1.12 -13.97 -7.39 -20.35

-0.08 -12.94 -6.85 -16.01

Lexicaldecision

Pronunciationword/word

Pronunciationword/blank

Categoryverification yes -31.78 -6.59 -45.13 -10.58

Categoryverification no -7.12 -6.74 -5.22 -9.22

Note. IDOM = instance dominance; LFREQ = log wordfrequency; CDOM = category dominance; LENG' = reflectednumber of letters; SYLL' = reflected number of syllables.

F(3, 76) = 7.22, p < .01, SEM = 2.32, for theno analysis. As can be seen, LFREQ had a largeeffect on lexical decision, a moderate effect onpronunciation, and a very small effect on cat-egory verification. Three contrasts confirmedthis observation. The average regression coef-ficient for lexical decision differed from thatfor pronunciation, t(76) = 2.45, p < .02, andthe average regression coefficient for pronun-ciation differed from that for category verifi-cation for both yes, t(76) = 1.99, p < .05, andno responses, t(76) = 2.36, p < .02.

The analyses of variance indicated thatIDOM also had a significant effect across taskswith ^(3, 76) = 44.00, p < .001, SEU = 2.22,for the yes analysis and F(3, 76) = 3.46, p <.05, SEM - 2.59, for the no analysis. Contrastsperformed on the regression coefficients forIDOM indicated that this variable had a greatereffect for category verification yes responsesthan for either lexical decision or category ver-ification no responses (nondirectional ps <.05), but the latter two tasks did not differfrom each other. The effect of IDOM in pro-nunciation was less than that in any of theother tasks (all nondirectional ps < .05). Theanalysis for CDOM yes responses yielded ahighly significant effect, F(3, 76) = 41.68, p <.001, SEM = 2.89, whereas that for the noresponses produced an F < 1, SEM = 2.42.Contrasts confirmed what is fairly obvious:The coefficient for category verification yes

responses significantly differed from all theother coefficients that did not differ from eachother. The analyses with respect to LENG'yielded no significant task effects with bothFs(3,76) < 1.58 (SEM= 3.45 for yes responsesand 3.66 for no responses).

One final analysis was conducted to elim-inate a possible concern of some readers. Onemight argue that the multiple regression anal-yses may have been somehow permitted vari-ables relevant to the category verification task,but not to lexical decision, to mask the effectof frequency in category verification. For ex-ample, IDOM should be more related to thecategory verification task than to the LDT.Also, IDOM is correlated with LFREQ. Thus,according to such an argument, IDOM couldhave accounted for some of the variance inthe category verification task that LFREQ wouldhave otherwise predicted. Fortunately, thereis a simple way (for our data) of addressingthis possibility. We simply need to comparethe raw correlation between LFREQ and cat-egory verification RT to that between LFREQand lexical decision RT. Obviously, these rawcorrelations are not affected by competingvariables. If these correlations are significantlydifferent, then we can be sure that the regres-sion analyses are not producing misleadingresults. A Fisher's Z' test of the difference inthese two raw correlations indicated that the-.30 correlation between LFREQ and categoryverification no response latency was signifi-cantly different from the -.64 correlation be-tween LFREQ and LDT response latency, p <.025. Furthermore, the differences between thecategory verification no correlation and thosefor the pronunciation tasks were also signifi-cant (all nondirectional ps < .054). Thus, theseanalyses rule out the possibility that the presentdifferences across tasks are simply due to theway in which shared variance is partialed inthe regression analyses.

General Discussion

The most important finding in the presentresults is that word frequency was highly re-lated to lexical decision performance over andabove its joint relationship with other variablesbut had little unique relationship to categoryverification performance even though lexicalaccess must be involved in the category ver-

LEXICAL DECISION AND LEXICAL ACCESS 351

ification task. Independent of what one choosesto believe about the relationship between wordfrequency and lexical access, these resultsclearly indicate that word frequency has dra-matically different effects, depending upon thetask that is used to assess lexical access. Thisconclusion has major importance becausecurrent conceptions of word recognition at-tribute a major role to the word frequencyvariable in determining how the mental lex-icon is structured and accessed. Given this rolein lexical access, word frequency should nothave the large task-specific effects we have ob-served.

In this light, there are two other findingsthat are relevant here. First, LFREQ was vir-tually uncorrelated (-.05) with the CDOMvariable in the present study. This finding isnoteworthy, because subjects need to recognizethe word to determine the category of a pre-sented exemplar. Thus, we have yet anotherinstance in which there is little effect of wordfrequency in a task that demands lexical access.

Second, there are two recent studies re-ported by Kliegl, Olson, and Davidson (1982,1983) in which it was found that word fre-quency accounted for only about l%-3% ofthe variance in fixation durations in readingwhen other potentially confounding variables(such as word length) were partialed out. Theseresults are particularly disturbing because theysuggest that the effect of word frequency onfixation duration in reading cannot be un-equivocally attributed to an unconfounded in-dex of frequency.2

Thus, the category verification task is notthe only task that does not produce the large(100 ms) unique frequency effect found in theLDT. In the present study we have been un-successful in our attempts to explain away thelarge reduction in frequency effects in the cat-egory verification task. A different approachis to ask why there is such a large frequencyeffect in the LDT. Because lexical decisionperformance has been a primary source ofdata for the current models of lexical access,it is crucial that one understand all componentsrelevant to the LDT, especially those com-ponents other than lexical access that maythemselves be affected by variables viewed rel-evant to lexical access. A recent example ofan analysis of the components of lexical de-cision has been provided by Morton (1982).

He concludes that at least a portion of theword-frequency effect in lexical decision canbe attributed to the operation of processes inthe cognitive system rather than to logogenthreshold differences produced by differencesin word frequency. Although our analysis wasdeveloped independently of Morton's and as-sumes specific processes about which he mayhave reservations, the basic thrust of our pro-posals is in the same direction.

A Framework for Understanding the LexicalDecision Task

One possible reason that the LDT producessuch large frequency effects is that the taskplaces a premium on frequency informationat the decision stage of the task. Obviously,the fact that lexical decisions are typically morethan twice as long (500-600 ms) as normalreading rates (250 ms/word) suggests that thereis much more involved in the LDT than simplelexical access.3 Given this possibility, it seems

2 There are a number of points to note here. Both Car-penter and Just (1983) and Mitchell and Green (1976)have reported an influence of word frequency on readingtime, measures. In Carpenter and Just's study, the effectwas considerably smaller when length was controlled, and,furthermore, some of their materials contained low-fre-quency words for which subjects did not know the meaning.In Mitchell and Green's study the effect of length was onlypartially controlled for by considering the length of a three-word triad in which the target appeared instead of thelength of the target word itself. Thus, it is unclear what'the true impact of frequency (above and beyond otherpotentially confounding variables) was in these studies.

One final variable should be considered in these morenatural reading situations. That is, it is possible that thecontextual constraints for the high-frequency and low-fre-quency words may vary. For example, it seems that theword water is much more likely in the following contextthan the word tonic.

The cold glass of water quenched the man's thirst.tonic

Such contextual variables are very difficult to tease apartfrom true word-frequency effects on lexical access.

3 Obviously, one could argue that lexical decision la-tencies also involve response execution (pushing the but-ton). In this same light, however, one must acknowledgethe complexity of reading, that is, the reader must programeye movements and integrate the currently accessed wordwith ongoing comprehension. It does not seem reasonablethat simple response execution can account for the largedifferences in latencies between reading and lexical deci-

352 DAVID A. BALOTA AND JAMES I. CHUMBLEY

Fast "Nonword"Responses

RegionRequiringFurther

Analysis

Fast "Word"Responses

LowCriterion

HighCriterion

Low FM Familiarity/Meaningfulness ->. HighFM

Figure 1. Word and nonword distributions along the familiarity/meaningfulness dimension.

necessary to consider more closely the taskfaced by the subject when making a lexicaldecision. Basically, the subject is asked to dis-criminate meaningful stimuli from nonwordletter strings that, with the exceptions of mis-spellings, have never been seen before. Thetwo most obvious pieces of information thesubject could utilize to make such discrimi-nations are the frequency with which the stim-ulus has been seen before and its meaning-fulness. It is important to note here that thefact that frequency information is availabledoes not provide evidence that this informa-tion in some way orders lexical access.

A variant of the two-stage model developedby Atkinson and Juola (1973) for the memorysearch task provides a framework that moreclosely considers the decision process in theLDT. This model is shown in Figure 1. Thebasic notion is that words and nonwords differon a familiarity/meaningfulness (FM) dimen-sion. A particular letter string's value on thisFM dimension is based primarily on its or-thographic and phonological similarity to ac-tual words. The word and nonword distri-butions on the FM dimension are separatedbut overlap. The subject can use this fact infollowing the LDT instructions to both max-imize speed and minimize errors. Becausesome word targets are relatively much morediscriminable from the nonword distractors

(and vice versa), the subject can set two criteriathat will allow rapid decisions for at least someof the stimuli being presented. Thus, a lowcriterion could be set so that very few wordswill have FM values that would fall below thiscriterion. Similarly, a high criterion could beset so that very few nonwords will have FMvalues that would fall above this criterion. Thelocation and utility of these criteria will bedetermined by the similarity between thewords and nonwords within the test list.

The first stage of the decision process in-volves a global computation of the FM valueof the letter string. That is, the subject makesa quick check to determine if the stimulus isproducing any meaning or is very familiar,that is, "Have I seen this stimulus frequently?"If the computed FM value exceeds the uppercriterion, the subject will make a fast wordresponse; if it fails to exceed the lower criterion,the subject will make a fast nonword response.On the other hand, if this FM value falls be-tween the upper and lower criteria, then thesubject needs more information before a de-cision can be made. The necessary informationis obtained by performing a more analyticevaluation of the letter string. For example,the subject may actually need to check thespelling of the letter string against the spellingof a word contained in the subject's lexicon.This extra analysis, of course, requires addi-

LEXICAL DECISION AND LEXICAL ACCESS 353

tional time; thus longer latencies will be foundfor those words and nonwords requiring suchanalysis.

Within the present framework, errors canderive from three different sources. First, errorscould occur in the global analysis when a wordhas an extremely low FM value (e.g., yams)or when a nonword has an exceptionally highFM value (e.g., sondpapaf). Second, errorscould occur in the analytic stage when thereis a lack of knowledge about the appropriatespelling of a word (cf. Gordon, 1983). Third,errors could occur in the analytic stage whenthe subject has established a "criterion timeafter which a guess will be made because thesubject is still unsure about whether the stringis spelled correctly or not. Because we feelerrors can be produced in all three stages anderror rates are generally quite low, we will applythe model primarily to RT data, the data thatmost currently available models of lexical ac-cess have considered to be most important.

The two-stage model we have describedprovides a straightforward account of theword-frequency effect in the LDT. Low-fre-quency words have lower values on the FMdimension than high-frequency words. For thisreason, the global analysis of a low-frequencyword will less often result in a FM value thatexceeds the high criterion and permits a rapidword response. Thus, a large proportion of thelow-frequency words, compared to a smallerproportion of high-frequency words, will re-quire further processing in the analytic stage.The net effect will be that RTs for low-fre-quency words will be, on average, longer thanthose for high-frequency words.

The subject's situation in the category ver-ification task is completely different. A dis-crimination between words and nonwords isnot required, so familiarity information is notused in the same manner. In category verifi-cation, it is clear that the meaningfulness ofa word (concept familiarity) is relevant. Thatis, a category membership judgment is madeabout the "meaningfulness" of associating aparticular word's concept with a category, butfrequency of occurrence in print (word fa-miliarity) is not relevant to the judgment beingmade. Word frequency could affect the timeto determine what is known about a word(lexical access), but this encoding process may

be a very small part of the overall judgmentprocess. Thus, there would be only a very smallunique LFREQ effect. Similarly, in reading,where the primary task is the extraction ofmeaning and not the discrimination betweenwords and nonwords, frequency of occurrenceis not a meaningful dimension for the task athand; thus only a very small unique effect ofword frequency would be expected.

The present framework has the importantfeature that it can account for a number ofresults from the lexical decision literature thathave been problematic for one or more of theavailable models of word recognition. Al-though it is beyond the scope of the presentarticle to address all of this literature, it isuseful to describe briefly how the model ac-counts for some of the most important find-ings.

First, within the present framework whatshould occur if one increases the separationof the word and nonword distributions alongthe FM dimension by lowering the nonworddistribution? This should allow the subject toreduce the upper criterion without increasingthe nonword error rate and should have theeffect of reducing the proportion of words re-quiring the slower analytic check process. Be-cause there are more low-frequency words thanhigh-frequency words with FM values belowthe original upper criterion, the reduced uppercriterion will affect more decisions about low-frequency words than those about high-fre-quency words. The relative reduction in low-frequency word RT should thus be greater thanthat for high-frequency word RT. One way tolower the nonword distribution would be topresent unpronounceable nonwords (stpne) asopposed to pronounceable nonwords (penst).In support of the present analysis both James(1975) and Duchek and Neely (1984) havefound that low-frequency word decisions arefacilitated more than are high-frequency worddecisions by the presence of unpronounceablenonwords.

A second variable that may influence thesubject's placement of the upper and lowercriteria is the frequency blocking manipulation(Glanzer & Ehrenreich, 1979; Gordon, 1983).In studies using this technique, performancewith pure lists (either only high- or only low-frequency words) is compared with perfor-

354 DAVID A. BALOTA AND JAMES I. CHUMBLEY

mance with mixed lists (both high- and low-frequency words). First, consider the pure listsof high-frequency words. Because the difficultdiscrimination between low-frequency wordsand nonwords in a mixed list demands furtheranalysis for many decisions, eliminating thelow-frequency words from the list may en-courage the subject to relax both criteria, thatis, lower the upper criterion and raise the lowercriterion. This relaxation of the criteria wouldreduce latencies both for high-frequency wordsand for nonwords. Both Glanzer and Ehren-reich (1979) and Gordon (1983) reported datafor pure high-frequency lists that match thisprediction. For the pure low-frequency lists,on the other hand, subjects cannot shift theircriteria because the difficult discriminationbetween low-frequency words and nonwordsremains. Glanzer and Ehrenreich found asmall, but nonsignificant, effect of blockingfor the low-frequency words, but Gordonfound a 0-ms difference between mixed andpure lists for low-frequency words, preciselyas the model predicts. Interestingly, Forster(1981) used the pronunciation task that doesnot involve the decision stage and did not finda blocking effect.4

A third important finding in the lexical de-cision literature has been the impact of re-peating some of the words and/or nonwordswithin the experiment. Repeating a word ornonword should have the effect of increasingits FM value. For words, an increase in theFM value should reduce average RT for low-frequency words more than for high-frequencywords because, as indicated earlier, decisionsabout low-frequency words are more likely toneed an analytic check. This is precisely thepattern found by Scarborough, Gerard, andCortese (1979) and Scarborough, Cortese, andScarborough (1977). For nonwords, repetitionshould increase the likelihood that the FMvalue of a repeated nonword will exceed thelower criterion. When this happens, the re-peated nonword latency will be increased be-cause of the need for an analytic check. Du-chek and Neely (1984), Durgunoglu (1982),and McKoon and Ratcliff (1979) have all ob-tained this deleterious effect of repetitions onnonword latencies.5

Repetition is not the only way to increasethe FM value of a word. For example, con-

textual priming should increase FM values.Thus, the FM value of the word cat shouldbe higher when immediately preceded by dogthan when preceded by frog. Such relatednesseffects have been reported in a number of lex-ical decision experiments (e.g., Balota, 1983;Meyer & Schvaneveldt, 1971; Neely, 1977).Within the present framework, the influenceof semantic context should be the greatest forthose words that necessitate the more detailedanalytic check, primarily the low-frequencywords. Evidence in accord with this view hasbeen provided by Becker (1979), who foundthat contextual effects were significantly largerfor low- than for high-frequency words. In asecond highly relevant study, Shulman andDavison (1977) investigated contextual prim-ing and nonword pronounceability. As notedpreviously, one would expect that targets withan unrelated context should be most likely togo through an analytic check. If the uppercriterion has been reduced because unpro-nounceable nonwords are being presented, themajor effect of the criterion shift should beon decision times for targets preceded by un-related words. Shulman and Davison (Exper-iment 1) reported that the change from pro-nounceable nonwords to unpronounceablenonwords produced a reduction in latency of119 ms for the unrelated targets but only 46ms for the related targets.

4 In contrast to Forster's study, it should be noted thatBerry (1971) also used a pronunciation task and found amain effect of blocking and frequency with no evidenceof an interaction. Unfortunately, Berry used only 12 wordsin his study and had pairs of words with the same firstletter in the mixed list but not in the pure list. Becausethe words were presented randomly, it is possible that therewas mispriming in the mixed'list of a particular pronun-ciation by these matched pairs (e.g., above, abet, been,beige, sedate, season). Moreover, 3 of the 6 low-frequencywords were phonologically irregular (abet, beige, sedate),and these three items showed the largest blocking effect.In fairness to Berry, the blocking effect was not of primaryinterest in his study.

5 Scarborough, Gerard, and Cortese (1979) actuallyfound a slight facilitation for nonword repetitions at shortlags. However, these repetition effects disappeared at longerlags. Nonword repetition effects at short lags may be re-sponse priming effects. The nonword repetition inhibitioneffects reported by Duchek and Neely (1984), Durgunoglu(1982), and McKoon and Ratcliff (1979) occur when thenonwords are presented in an earlier list of materials andthen later are presented in an LDT.

LEXICAL DECISION AND LEXICAL ACCESS 355

This brief discussion of some of the lexicaldecision literature bearing critically on theframework we have proposed is, of course, onlythe first step in testing the adequacy of themodel. However, at this stage, the model doesappear to be a useful alternative to other view-points because it can provide new insights intofindings from lexical decision experiments thathave been troublesome for dominant theoriesof word recognition. The model says nothingabout the lexical access process and, in thatsense, is relatively uninteresting. Its contri-bution lies in helping to understand the com-plexities of the LDT and in evaluating therelevance of results from the LDT to under-standing the lexical access process. A full-scaledevelopment and quantitative evaluation of themodel can be pursued, but the continued im-portance of the LDT to the study of wordrecognition and memory will determine theimportance of such a pursuit. We believe thatthe LDT is a useful tool for the study of manykinds of questions, for example, the study ofpriming effects, and we expect that many re-searchers agree with us. This being the case,one can expect that the model will be sub-mitted to rigorous empirical test and theo-retical analysis.

Conclusions

Three experiments have been reported thatinvestigated the impact of the same five vari-ables for the same set of words across threedifferent tasks. The results indicated that therewere striking differences in the unique effectof word frequency across the tasks even thougheach of the tasks should involve a similar lex-ical access process. Particularly dramatic wasthe large unique influence of word frequencyin the LDT and its negligible unique effect onthe RT for no responses in the category ver-ification task. It was argued that the word-frequency effect may be exaggerated in theLDT because of the importance of familiarityof the stimulus in discriminating targets fromdistractors. A simple two-stage model of theLDT was described that adequately accountsfor a number of LDT results that are prob-lematic for most currently dominant modelsof lexical access.

It is important to reiterate here that we arenot arguing that word frequency has no impacton lexical access. Our data do not support thatconclusion, and our model takes no positionwith respect to such a claim. There are datafrom other tasks that have been taken as ev-idence that word frequency has a strong impacton lexical access. Two such tasks are tachis-toscopic word recognition (Broadbent, 1967;Broadbent & Broadbent, 1975) and pronun-ciation (Andrews, 1982; Frederiksen & l&oll,1976). However, there are also problems withthe evidence from both of these tasks. First,some researchers (e.g., Catlin, 1969, 1973) be-lieve that the effect of word frequency foundin tachistoscopic studies is a response selectioneffect and not a lexical access effect. Second,the threshold word recognition task providesdata relevant to the study of the effect of wordfrequency on probability of reporting a wordfrom a given frequency class, and these dataare not necessarily relevant to "speed" of lex-ical access, the issue addressed by most currentmodels. Third, there is now mounting evidence(see Balota & Chumbley, in press; Theios &Muise, 1977) suggesting that the word-fre-quency effect in the pronunciation task stems,at least in part, from the production stage ofpronunciation rather than only the lexical ac-cess stage.

In sum, the major thrust of the present ar-ticle has been to point out the importance ofconsidering the decision task the subject facesin making lexical decisions. Unless one con-siders the decision stage, the results from theLDT may misdirect the development of anadequate model of lexical access. In this light,the task confronting those interested in wordrecognition is either to (a) demonstrate therelevance of lexical decision results to lexicalaccess while taking into consideration the de-cision stage or (b) develop a different task thatmore faithfully reflects the processes involvedin word recognition.

References

Anderson, J. R., & Reder, L. M. (1974). Negative judgmentsin and about semantic memory. Journal of VerbalLearning and Verbal Behavior, 13, 664-681.

Andrews, S. (1982). Phonological receding: Is the regularityeffect consistent? Memory & Cognition, 10, 565-575.

Atkinson, R. C., & Juola, J. F. (1973). Factors influencing

356 DAVID A. BALOTA AND JAMES I. CHUMBLEY

speed and accuracy of word recognition. In S. Kornblum(Ed.), Attention and performance IV(pp. 583-612). NewYork: Academic Press.

Balota, D. A. (1983). Automatic semantic activation andepisodic memory encoding. Journal of Verbal Learningand Verbal Behavior, 22, 88-104.

Balota, D. A., & Chumbley, J. I. (in press). The locus ofword-frequency effects in the pronunciation task: Lexicalaccess and/or production frequency? Journal of VerbalLearning and Verbal Behavior.

Battig, W. E, & Montague, W. E. (1969). Category normsfor verbal items in 56 categories: A replication and ex-tension of the Connecticut category norms. Journal ofExperimental Psychology Monograph, 80,(3, Pt. 2).

Becker, C. A. (1976), Allocation of attention during visualword recognition. Journal of Experimental Psychology:Human Perception and Performance, 2, 556-566.

Becker, C. A. (1979). Semantic context and word frequencyeffects in visual word recognition. Journal of Experi-mental Psychology: Human Perception and Performance,5, 252-259.

Becker, C. A. (1980). Semantic context effects in visualword recognition: An analysis of semantic strategies.Memory & Cognition, 8, 493-512.

Becker, C. A., & Killion, T. H. (1977). Interaction of visualand cognitive effects in word recognition. Journal ofExperimental Psychology: Human Perception and Per-formance, 3, 389-401.

Berry, C. (1971). Advanced frequency information andverbal response times. Psychonomic Science, 21, 151-152.

Broadbent, D. E. (1967). Word-frequency effect and re-sponse bias. Psychological Review, 74, 1-15.

Broadbent, D. E., & Broadbent, M. H. P. (1975). Somefurther data concerning the word-frequency effect.Journal of Experimental of Psychology: General, 104,297-308.

Carpenter, P. A., & Just, M. A. (1983). What your eyesdo while your mind is reading. In K. Rayner (Ed.), Eyemovements in reading: Perceptual and language processes(pp. 275-307). New York: Academic Press.

Catlin, J. (1969). On the word-frequency effect. Psycho-logical Review, 76, 504-506.

Catlin, J. (1973). In defense of sophisticated-guessing the-ory. Psychological Review, 80, 412-416.

Chumbley, J. I. (1984), The roles of typicality, instancedominance, and category dominance in verifying categorymembership. Manuscript in preparation.

Chumbley, J. I., & Balota, D. A. (1984). A word's meaningaffects the decision in lexical decision. Manuscript sub-mitted for publication.

Coltheart, M., Davelaar, E., Jonasson, J. X, & Besner, D.(1977). Access to the internal lexicon. In S. Dornic (Ed.),Attention and performance VI (pp. 535-555). New York:Academic Press.

Duchek, J. M., & Neely, J. H. (1984). Levels-of-processingand frequency effects in episodic and semantic memory.Manuscript in preparation.

Durgunoglu, A. (1982). Episodic and semantic primingin episodic and semantic memory. Unpublished master'sthesis. Purdue University.

Forster, K. I. (1976). Accessing the mental lexicon. InR. J. Wales & E. Walker (Eds.), New approaches to

language mechanisms (pp. 257-287). Amsterdam:North-Holland.

Forster, K. I. (1979). Levels of processing and the structureof the language processor. In W. E. Cooper & E. Walker(Eds.), Sentence processing: Psycholinguistic studiespresented to Merrill Garrett (pp. 27-85). Hillsdale, NJ:Erlbaum.

Forster, K. I. (1981). Frequency blocking and lexical access:One mental lexicon or two? Journal of Verbal Learningand Verbal Behavior, 20, 190-203.

Frederiksen, J. R., & Kroll, J. F. (1976). Spelling andsound: Approaches to the internal lexicon. Journal ofExperimental Psychology: Human Perception and Per-formance, 2, 361-379.

Glanzer, M., & Ehrenreich, S. L. (1979). Structure andsearch of the internal lexicon. Journal of Verbal Learningand Verbal Behavior, 18, 381-398.

Gordon, B. (1983). Lexical access and lexical decision:Mechanisms of frequency sensitivity. Journal of VerbalLearning and Verbal Behavior, 22, 24-44.

James, C. T. (1975). The role of semantic information inlexical decisions. Journal of Experimental Psychology:Human Perception and Performance, 1, 130-136.

Kliegl, R., Olson, R. K., & Davidson, B. J. (1982). Regres-sion analyses as a tool for studying reading processes:Comment on Just and Carpenter's eye fixation theory.Memory & Cognition, 10, 287-296.

Kliegl, R., Olson, R. K., & Davidson, B. J. (1983). Onproblems of unconfounding perceptual and languageprocesses. In K. Rayner (Ed.), Eye movements in reading:Perceptual and language processes (pp. 333-343). NewYork: Academic Press.

Kucera, H., & Francis, W. N. (1967). Computational anal-ysis of present-day American English. Providence, RI:Brown University Press.

Loftus, E. F, & Suppes, P. (1972). Structural variablesthat determine the speed of retrieving words from long-term memory. Journal of Verbal Learning and VerbalBehavior, 11, 770-777.

McKoon, G., & Ratcliff, R. (1979). Priming in episodicand semantic memory. Journal of Verbal Learning andVerbal Behavior, 18, 463-480.

Meyer, D. E., & Schvaneveldt, R. W. (1971). Facilitationin recognizing pairs of words: Evidence of a dependencebetween retrieval operations. Journal of ExperimentalPsychology, 90, 227-234.

Millward, R. B., Rice, G., & Corbett, A. (1975). Categoryproduction measures and verification times. In R. A.Kennedy and R. W. Wilkes (Eds.), Studies in long-termmemory (pp. 219-252). New York: Wiley.

Mitchell, D. C., & Green, D. W. (1978). The effects ofcontext and content on immediate processing in reading.Quarterly Journal of Experimental Psychology, 30, 609-636.

Morton, J. (1969). The interaction of information in wordrecognition. Psychological Review, 76, 165-178.

Morton, J. (1970). A functional model for memory. InD. A. Norman (Ed.), Models of human memory (pp.203-254). New York: Academic Press.

Morton, J. (1982). Disintegrating the lexicon: An infor-mation processing approach. In J. Mehler, S. Franck,E. C. T. Walker, & M. Garrett (Eds.), Perspectives on

LEXICAL DECISION AND LEXICAL ACCESS 357

mental representation (pp. 89-109). Hillsdale, NJ: Erl- Shulman, H. G., & Davison, T. C. B. (1977). Controlbaum. properties of semantic coding in a lexical decision task.

Neely, J. H. (1977). Semantic priming and retrieval from Journal of Verbal Learning and Verbal Behavior, 16,lexical memory: Roles of inhibitionless spreading ac- 91-98.tivation and limited capacity attention. Journal of Ex- Theios, J., & Muise, J. O. (1977). The word identificationperimental Psychology: General, 106, 226-254. process in reading. In N. J. Castellan, D. B. Pisoni, &

Rosen, E. (1975). Cognitive representations of semantic G. R. Potts (Eds.), Cognitive theory. (Vol. 2, pp. 289-categories. Journal of Experimental Psychology: General, 327). Hillsdale, NJ: Erlbajim.104, 192-203. West, R. E, & Stanovich, K. E. (1982). Source of inhibition

Sanford, A. J., & Seymour, P. H. K. (1974). Semantic in experiments on the effect of sentence context ondistance effects in naming superordinates. Memory & word recognition. Journal of Experimental Psychology:Cognition, 2, 714-720. Human Learning, Memory, and Cognition, 8, 385-399.

Scarborough, D. L., Cortese, C, & Scarborough, H. (1977). Whaley, C. P. (1978). Word-nonword classification time.Frequency and repetition effects in lexical memory. Journal of Verbal Learning and Verbal Behavior, 17,Journal of Experimental Psychology: Human Perception 143-154.and Performance, 3, 1-17.

Scarborough, D. L., Gerard, L., & Cortese, C. (1979).Accessing lexical memory: The transfer of word repe-tition effects across task and modality. Memory & Cog- Received July 29, 1983nition, 7, 3-12. Revision received December 5, 1983 •

Instructions to Authors

Authors should prepare manuscripts according to the Publication Manual of the-AmericanPsychological Association (3rd ed.). All manuscripts must include an abstract of 100-150 wordstyped on a separate sheet of paper. Typing instructions (all copy must be double-spaced) andinstructions on preparing tables, figures, references, metrics, and abstracts appear in the Manual.Also, all manuscripts are subject to editing for sexist language. For further information oncontent, authors should refer to the editorial in the February 1982 issue of the Journal (Vol.8, No. 1, pp. 171-172). For information on the other three JEP journals, authors should referto editorials in those journals.

APA policy prohibits an author from submitting the same manuscript for concurrent con-sideration by two or more journals. APA policy also prohibits duplicate publication, that is,publication of a manuscript that has already been published in whole or in substantial part inanother publication. Authors of manuscripts submitted to APA journals are expected to haveavailable their raw data throughout the editorial review process and for at least 5 years afterthe date of publication.

Blind reviews are optional, and authors who wish blind reviews must specifically requestthem when submitting their manuscripts. Each copy of a manuscript to be blind reviewedshould include a separate title page with authors' names and affiliations, and these should notappear anywhere else on the manuscript. Footnotes that identify the authors should be typedon a separate page. Authors should make every effort to see that the manuscript itself containsno clues to their identities.

Manuscripts should be submitted in quintuplicate (the original and four photocopies), andall copies should be clear, readable, and on paper of good quality. Authors should keep a copyof the manuscript to guard against loss. Mail manuscripts to the Editor, William Epstein,Department of Psychology, University of Wisconsin, W. J. Brogden Psychology Building, 1202West Johnson Street, Madison, Wisconsin 53706.

For the other JEP journals, authors should submit manuscripts (in quadruplicate) to one ofthe editors at the following addresses: Journal of Experimental Psychology: General, SamGlucksberg, Department of Psychology, Princeton University, Princeton, New Jersey 08544;Journal of Experimental Psychology: Learning, Memory, and Cognition, Richard M. Shiffrin,Department of Psychology, Indiana University, Bloomington, Indiana 47405; and Journal ofExperimental Psychology: Animal Behavior Processes, Donald S. Blough, Department of Psy-chology, Brown University, Providence, Rhode Island 02912. When one of the editors believesa manuscript is clearly more appropriate for an alternative journal of the American PsychologicalAssociation, the editor may redirect the manuscript with the approval of the author. .