


Language, Cognition and Neuroscience

ISSN: 2327-3798 (Print) 2327-3801 (Online) Journal homepage: http://www.tandfonline.com/loi/plcp21

The time course of contextual cohort effects in auditory processing of category-ambiguous words: MEG evidence for a single “clash” as noun or verb

Phoebe Gaston & Alec Marantz

To cite this article: Phoebe Gaston & Alec Marantz (2017): The time course of contextual cohort effects in auditory processing of category-ambiguous words: MEG evidence for a single “clash” as noun or verb, Language, Cognition and Neuroscience, DOI: 10.1080/23273798.2017.1395466

To link to this article: http://dx.doi.org/10.1080/23273798.2017.1395466


Published online: 01 Nov 2017.


The time course of contextual cohort effects in auditory processing of category-ambiguous words: MEG evidence for a single “clash” as noun or verb

Phoebe Gaston a,b* and Alec Marantz a,b,c

aDepartment of Linguistics, New York University, New York, NY, USA; bNYUAD Institute, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates; cDepartment of Psychology, New York University, New York, NY, USA

ABSTRACT

The size and probability distribution of a word-form’s cohort of lexical competitors influence auditory processing and can be constrained by syntactic category information. This experiment employs noun/verb homonyms (e.g. “ache”) presented in syntactic context to clarify the mechanisms and representations involved in context-based cohort restriction. Implications for theories positing single versus multiple word-forms in cases of category ambiguity also arise. Using correlations between neural activity in auditory cortex, measured by magnetoencephalography (MEG), and standard and context-dependent cohort entropy and phoneme surprisal variables, we consider the possibility of cohort restriction on the basis of form or on the basis of category usage. Crucially, the form-conditional measure is consistent only with a single word-form view of category ambiguity. Our results show that noun/verb homonyms are derived from single category-neutral word-forms and that the cohort is restricted incrementally in context, by form and then by usage.

ARTICLE HISTORY
Received 20 September 2016
Accepted 18 September 2017

KEYWORDS
Auditory cohort; syntactic category; entropy; surprisal; word recognition

Introduction

Auditory word recognition

Debates concerning the role of context in word recognition have long animated research in auditory language processing, and many revolve around the timing of any potential contextual influence. Word recognition models typically recognise stages that include access of lexical candidates based on incoming sensory (“bottom-up”) input, selection among competing candidates once sufficient (bottom-up or “top-down”) information is available, and integration with existing structures, though implementations vary (see Frauenfelder and Tyler (1987), Dahan and Magnuson (2006), and McQueen (2007) for overviews).

For the purposes of this paper, we will assume that any lexical candidate consistent with current auditory input is activated and remains so unless invalidated by additional bottom-up information, as is broadly true in Marslen-Wilson’s Cohort models (Gaskell & Marslen-Wilson, 1997; Marslen-Wilson, 1980, 1987; Marslen-Wilson & Welsh, 1978) and McClelland and Elman’s (1986) TRACE model. This makes the phonetic/phonological overlap between the perceived input and the onsets of potential word-forms in the lexicon the primary criterion for membership in the “cohort,” a term which here and forthcoming we will use to refer simply to the set of activated candidates and not to any specific instantiation of Marslen-Wilson’s proposals. Acknowledging that there is empirical evidence both for and against the alternatives discussed in other frameworks (e.g. Shortlist (Norris, 1994; Norris & McQueen, 2008) and the Neighbourhood Activation Model (Luce, 1986; Luce & Pisoni, 1998; Luce, Pisoni, & Goldinger, 1990)), we will leave consideration of additional possibilities to future work in order to focus on the question at hand: whether (and if so, how and when) top-down contextual input in addition to bottom-up auditory input can be used to reduce the cohort. We will survey previous work addressing this issue, and then describe the new approach we employed to clarify the process of cohort restriction on the basis of grammatical category, as well as the representations that are involved.
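Under these assumptions, cohort membership at any point reduces to onset matching against the lexicon. A minimal sketch of this idea, using an invented toy lexicon with made-up pseudo-phonemic transcriptions and frequencies (all names and values are illustrative only, not the authors' materials):

```python
# Toy lexicon: word -> (pseudo-phonemic transcription, corpus frequency).
# Transcriptions and counts are invented for illustration.
LEXICON = {
    "clash": ("klaeS", 100),
    "clasp": ("klaesp", 25),
    "cat":   ("kaet", 400),
    "dog":   ("dOg", 300),
}

def cohort(heard):
    """Return all lexical candidates whose onsets are consistent with the
    phonemes heard so far, along with their frequencies."""
    return {word: freq for word, (phones, freq) in LEXICON.items()
            if phones.startswith(heard)}
```

After /k/ the cohort contains "clash", "clasp", and "cat"; after /kl/, "cat" is invalidated by the bottom-up input and drops out, while the remaining candidates stay active.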

Context effects

© 2017 Informa UK Limited, trading as Taylor & Francis Group

Supplemental data for this article can be accessed at doi:10.1080/23273798.2017.1395466.

CONTACT Phoebe Gaston [email protected]

*Present address: Department of Linguistics, University of Maryland, College Park, MD, USA.

Gating & cross-modal priming

Research using gating or cross-modal priming overwhelmingly shows that in the initial 100–200 ms after phonetic information becomes available, the set of lexical candidates under consideration is not constrained by sentential or pragmatic information. Early cross-modal priming studies focused largely on the potential for suppression of contextually inappropriate meanings of homophones. Tanenhaus, Leiman, and Seidenberg (1979) showed facilitation of both contextually appropriate and inappropriate meanings when a target word immediately followed a homophone, but only the appropriate meaning when probed two hundred milliseconds later. This lack of contextual influence was then confirmed with more carefully restrictive syntactic contexts (Tanenhaus & Donnenwerth-Nolan, 1984) and in the face of purely pragmatic constraints (Swinney, 1979).

Research bearing on the autonomy of bottom-up candidate activation is not, however, restricted to homophones, which may have special representational properties. Grosjean (1980) introduced the gating paradigm, in which increasingly longer word fragments are presented to listeners for identification, and showed that words could be first identified up to 200 ms more quickly in a semantically constraining context than in isolation. Tyler and Wessels (1983) later found an advantage of only roughly 30 ms in syntactically constraining context, and attribute this to the fact that even strong syntactic constraints can leave many possible competitors, while a strong semantic constraint can eliminate all but few or even one. Tyler (1984) analyses the responses produced in this same dataset and shows that contextually inappropriate candidates are indeed produced at initial gates, in alignment with the previously described findings for homophone meanings and in support of cohort generation on the basis of bottom-up information alone. The case for initial autonomy was demonstrated perhaps most famously by Zwitserlood (1989), who used cross-modal priming at a set of informative time-points determined by the gating paradigm, in semantically biasing and neutral contexts. Neither a semantic associate of the contextually appropriate prime nor an associate of a contextually inappropriate cohort competitor was found to be facilitated before auditory onset of the prime, refuting models that allow pre-activation in predictable contexts, while both were facilitated at the first two gates (mean 130 and 199 ms), regardless of context. Gating and cross-modal priming therefore seemed to converge in support of the autonomy of candidate generation during word recognition (as reviewed by Swinney and Love (2002)), granted that the timing of the earliest effects is not consistent. It is not clear whether this inconsistency is due to differences in the nature of the contextual constraints on the stimuli, differences in the paradigms used, or other factors.

Several subsequent studies, however, have used the same methodologies and produced evidence that diverges from the conclusion of autonomy of candidate generation, suggesting that immediate context effects are possible in some circumstances. Shillcock and Bard (1993), for example, use cross-modal priming to measure facilitation of homophone meanings in syntactic context when the homophonous pair is comprised of an open-class and a closed-class word (e.g. “would”/“wood”). Unlike in previous studies on homophones, they find facilitation only for the contextually appropriate meaning. McAllister (1988) presents a gating experiment with the addition of a no-context control condition, intended to question the conclusion of Tyler (1984) that the production of syntactically inappropriate responses at very early gates in context indicates unconstrained candidate generation. This work shows that, at the first gate of 50 ms, significantly fewer inappropriate responses are produced in a short context as compared to a no-context condition, and in a long context as compared to a short-context condition. Early stages of access are therefore argued to be not completely autonomous. Finally, Grosjean, Dommergues, Cornu, Guillelmon, and Besson (1994) test grammatical gender as a contextual constraint in gating, under the hypothesis that typically employed syntactic contexts are not, in fact, very strongly constraining. They show that candidates produced at early gates are never of the wrong gender; this is highly unexpected if context cannot constrain the word-initial cohort.

Eye-tracking

Dahan, Swingley, Tanenhaus, and Magnuson (2000) test the same phenomenon with eye-tracking in a visual-world paradigm, using gender-marked definite articles in French as their constraining context. They show that fixations to gender-mismatching cohort competitors are indistinguishable from fixations to distractors which are not cohort competitors, both of which diverge from fixations to the target roughly 250 ms after word onset. This is in line with the findings of Grosjean et al. (1994), indicating that a context of grammatical gender can prevent gender-mismatching would-be cohort competitors from competing in the first place.

Dahan and Tanenhaus (2004) then use Dutch to test a strong semantic constraint imposed by verbs on subject nouns that follow, and show, similarly, that cohort competitors do not compete with the target when presented in a biasing context that rules them out.

Finally, Magnuson, Tanenhaus, and Aslin (2008) present converging evidence from an artificial language designed in such a way that pragmatic constraints could create strong expectations for color-denoting adjectives or shape-denoting nouns. Like the previous studies, they found that phonological cohort competitors were fixated no more often than distractors when the visual context made them pragmatically unlikely.

Unresolved issues

Evidence from eye-tracking therefore points strongly not only toward a framework of continuous integration of contextual information but also toward constraint on initial access. These studies never find a preference for cohort competitors over distractors in constraining context, despite the prediction, on the basis of gating and cross-modal priming, that autonomous initial access should occur. To what extent do these bodies of work, taken together, contradict each other?

One problem in integrating these findings is that the type and strength of contextual constraint varies significantly; for some cases in which early contextual effects are not found, it may be that the context was simply not sufficiently constraining. A design property that the three paradigms have in common poses an additional challenge to interpreting the results: behaviour with respect to an extremely small set of items is taken to be indicative of the status of the cohort as a whole. Finally, there is uncertainty regarding the representational status of homophones in an auditory cohort, such that it is unclear whether the behaviour of a contextually inappropriate meaning of a homophone should have exactly the same implications as the behaviour of a contextually inappropriate cohort competitor. This would only be the case if there are redundant auditory word-forms in the cohort for homophones, which is certainly not out of the question: Gahl (2008), for example, has shown differential pronunciation times for homophone pairs like “time”/“thyme” in which one member is substantially more frequent. But if both meanings are associated with a single, shared auditory word-form, questions regarding context’s influence on initial access to that single word-form and the meanings associated with it are somewhat more complicated.

In the following sections we will describe an experimental approach that has the potential to mitigate some of these concerns, using tightly constrained syntactic contexts, directly addressing one aspect of the homophone question, and employing a measure that reflects the state of the cohort in its entirety, with high temporal resolution. The autonomy of candidate generation remains a complicated question, but this approach should provide a clearer and more complete picture of the manner and time-frame in which a cohort can be constrained.

Cohort measures

After making any necessary assumptions about what should be included in a cohort, it is possible to calculate a variety of measures, such as neighbourhood density (e.g. Magnuson, Dixon, Tanenhaus, and Aslin (2007)), that reflect its dynamics over the course of a word. Furthermore, we can use information about the relative frequencies of competitors to calculate, at each phoneme, measures like entropy (Moscoso del Prado Martín, Kostić, & Baayen, 2004; Shannon, 1948) and surprisal (Hale, 2001), which are related but distinct in the cognitive processes they have been hypothesised to index. Recent work applying these concepts at the phoneme level has shown them both to correlate with dependent measures like reaction time (Baayen, Wurm, & Aycock, 2007; Balling & Baayen, 2012; Kemps, Wurm, Ernestus, Schreuder, & Baayen, 2005; Wurm, Ernestus, Schreuder, & Baayen, 2006) and neural activity in auditory cortex (Ettinger, Linzen, & Marantz, 2014; Gagnepain, Henson, & Davis, 2012; Gwilliams & Marantz, 2015) during language processing.

Cohort entropy is a measure of uncertainty in the probability distribution of cohort competitors that are consistent with auditory input at any given point. A cohort comprised of two possibilities whose probabilities, based on frequency, are exactly equal would have a very high entropy, since these two cohort members are equally likely continuations. To calculate entropy, each competitor’s frequency is divided by the summed frequency of the total cohort, and this proportion is multiplied by its log. The cohort entropy is then the negative sum of these values. This measure has been shown to have both facilitatory (Baayen et al., 2007; Ettinger et al., 2014; Wurm et al., 2006) and inhibitory (Kemps et al., 2005; Wurm et al., 2006) effects, depending on the point in a word at which it is measured.
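The entropy calculation just described can be sketched directly; this is a generic Shannon-entropy computation over a frequency distribution, not the authors' analysis code:

```python
import math

def cohort_entropy(frequencies):
    """Cohort entropy in bits: each competitor's frequency is divided by
    the summed cohort frequency, that proportion is multiplied by its log,
    and the negative sum of these values is returned."""
    total = sum(frequencies)
    return -sum((f / total) * math.log2(f / total) for f in frequencies)
```

As the text notes, two competitors with exactly equal frequencies yield the maximal two-member entropy (1 bit), while a heavily skewed distribution yields a much lower value.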

Surprisal can be described as a measure of information gain resulting from candidates being ruled out by a new segment of input, or as a measure of the probability of that incoming segment given the predictions of the preceding input (Balling & Baayen, 2012; Ettinger et al., 2014; Gwilliams & Marantz, 2015; Hale, 2001; see also Gagnepain et al., 2012). Phoneme surprisal is calculated as the negative log of a phoneme’s conditional probability given the phonemes preceding it (where that conditional probability is the overall frequency of the cohort of the current phoneme divided by the overall frequency of the cohort of the immediately preceding phoneme). In the neural research that employs it as an independent measure (Ettinger et al., 2014; Gwilliams & Marantz, 2015), increased surprisal has been associated with increased neural activity (an inhibitory effect). Its effects have also been found both early and late in the word. Gagnepain et al.’s (2012) effect is not for phoneme surprisal, exactly, but for a closely related variable they call prediction error.
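The phoneme surprisal computation described above amounts to a negative log ratio of successive summed cohort frequencies. A minimal sketch, with invented counts:

```python
import math

def phoneme_surprisal(cohort_freq_current, cohort_freq_previous):
    """Phoneme surprisal: -log2 of the conditional probability of the
    incoming phoneme, where that probability is the summed frequency of
    the current cohort divided by the summed frequency of the cohort at
    the immediately preceding phoneme."""
    return -math.log2(cohort_freq_current / cohort_freq_previous)
```

For example, a phoneme that shrinks the cohort's summed frequency from 400 to 100 carries a surprisal of 2 bits; the more candidates a segment rules out, the higher its surprisal.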

Since these formulas require assumptions about what candidates should be included in the cohort and about the frequency of usage of any given item, we will demonstrate that the makeup of the designated cohort that is input to these formulas can be used to test hypotheses about the makeup of the auditory cohort under study in the human brain (specifically, whether it is constrained, and when). For example, Strand, Simenstad, Cooperman, and Rowe (2014) calculate a measure of neighbourhood density within a word’s grammatical class (“grammatical density”), and find that for targets obscured by noise at the end of a syntactically constraining sentence, lower grammatical density improves identification as well as lexical decision accuracy. This suggests the relevance of a potentially category-specific cohort in the recognition process, indexed with a category-specific density measure.

Grammatical category

Probing the nature of a constrained set of candidates in this way requires translating from properties of the context to demands on the cohort. What, specifically, should we expect different aspects of the context to rule in or out? This question is particularly difficult in the case of semantically constraining contexts, since they can only make a competitor more or less likely or surprising. Syntactically constraining contexts, on the other hand, can impose a restriction whose violation would be grammatically unacceptable. One such constraint would be an expectation for a specific grammatical category, which many of the previously mentioned experiments on syntactic context have investigated. Restrictions on the resulting character of the cohort are then very clear: infinitival “to” makes a strong prediction for a verb to follow, while determiner “the” makes a strong prediction for a noun. These predictions are not, of course, perfect, as an adverb or an adjective might intervene. But at the very least these contexts can rule out a grammatical category, since a noun cannot follow infinitival “to,” and a verb cannot follow determiner “the.” If one hopes to probe the mechanisms of contextual constraint, as we aim to do with continuous cohort variables, restrictions derived from grammatical category are the simplest place to start.

This type of investigation should be informed by research on the nature of category representation itself, since specific representational qualities will have bearing on the ways in which category can realistically be employed as a constraint. Recent evidence from fMRI supports the Combinatorial view that differences in the neural responses to nouns and verbs in context are related to their usage as such rather than to differences in fundamental lexical properties (Tyler, Randall, & Stamatakis, 2008). This is supported by evidence that semantic confounds can readily explain noun/verb processing differences that have previously been found in single-word presentation (Moseley & Pulvermüller, 2014). A review across methodologies and paradigms (Vigliocco, Vinson, Druks, Barber, & Cappa, 2011) similarly concludes that differences in the processing of nouns and verbs emerge only in syntactic context, counter to Lexicalist theories of grammatical class as a representational feature.

These studies point to a conception of category less along the lines of feature assignment and more akin to affixation (see also King, Linzen, & Marantz (to appear) on this idea), by which a single category-less root (e.g. “clash”) can be derived, depending on the context, to form a noun or a verb, rather than requiring a separate entry in the lexicon for each possible category. This is relevant for auditory word recognition because a single-root hypothesis for category homonymy in the lexicon implies a single phonological word-form for category homonymy in the auditory cohort. We will use the term “category homonymy” to refer to exactly this situation in which an auditory word-form associated with a single root semantic representation can be employed in more than one grammatical category. We distinguish this from other types of homonymy in which a single auditory word-form is associated with more than one semantic representation, and may or may not be employed in more than one grammatical category. Each area of potential redundancy in a lexical representation gives rise to its own set of questions with respect to encoding; in this paper, because we are investigating the effects of category restrictions imposed by context, we will be specifically concerned with whether multiple categories can be associated with a single word-form in the auditory cohort.

Under the single word-form view, /k/ as input would activate a single cohort member for a word like “clash,” whose frequency would be the overall frequency of that phonological form across the language (irrespective of its category-specific usages). A multiple word-forms hypothesis, in contrast, would require that the noun and verb forms of “clash” be activated independently, with each contributing separately to the probability distribution of the cohort. The single and multiple word-form possibilities make even more divergent predictions about the behaviour of the cohort if it is restricted by category requirements in syntactic context. If, for example, /k/ is preceded by “the,” restriction to forms with compatible category designations would remove the verb form of “clash” from contention in the multiple word-forms scenario. In the single word-form scenario, however, the single ambiguous form of “clash” in the cohort would not be affected by this kind of restriction, because “clash” does have a noun usage that makes its form compatible with the context. Another plausible step, after restriction of the forms that make up the cohort, would weight the probability of those forms based on their frequency of usage in the restrictive context. The probability distribution of the cohort would therefore shift because each form would contribute only category-specific frequencies. This step would have no detectable effect in the case of multiple word-forms, which already are associated with only the frequency of their specific category of usage.

Because the cohort entropy and phoneme surprisal of a given phoneme are derived from the makeup and probability distribution of the cohort, these variables stand to be altered significantly depending on whether there are single or multiple word-forms for homonyms in the cohort and whether contextual category information has the result of constraining forms, frequencies, or both. See Figure 1 for a visual representation of an example cohort, showing updating of member forms in response to an incoming phoneme, along with the additional restrictions in form or frequency afforded by category information. For simplicity, the three potential surprisal measures in this figure are shown with respect to a single preceding cohort, which is unrestricted.
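The contrast between the two representational hypotheses can be made concrete with a small sketch. All frequencies are invented, and the category assignments (e.g. treating "clang" as verb-only) are purely illustrative; the point is that category-based member restriction leaves an intermediate, overall-frequency state only under the single word-form view:

```python
import math

def entropy(freqs):
    """Cohort entropy: -sum(p * log2 p) over a frequency distribution."""
    total = sum(freqs.values())
    return -sum((f / total) * math.log2(f / total) for f in freqs.values())

# Hypothetical cohort after /k/. "clash" and "clasp" are noun/verb-
# ambiguous; "clang" is assumed verb-only here for illustration.
# Multiple word-forms view: one entry per (form, category) pair.
multiple = {("clash", "N"): 30, ("clash", "V"): 70,
            ("clasp", "N"): 20, ("clasp", "V"): 5,
            ("clang", "V"): 40}

# Single word-form view: one category-neutral entry per form, whose
# frequency is the sum across its category usages.
single = {}
for (form, cat), f in multiple.items():
    single[form] = single.get(form, 0) + f

# Restricting to noun-compatible members after "the": under the multiple
# word-forms view, removing verb entries directly yields a usage-weighted
# distribution ...
multiple_restricted = {k: f for k, f in multiple.items() if k[1] == "N"}
# ... but under the single word-form view, surviving forms keep their
# overall frequencies, an intermediate state with its own entropy.
single_restricted = {w: f for w, f in single.items()
                     if any(k == (w, "N") for k in multiple)}
```

The two restricted cohorts contain the same forms but assign them different probabilities, so their entropies (and the surprisal of subsequent phonemes) diverge, which is what lets the conditional measures discriminate the hypotheses.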

Current question

This experiment aims to shed light on both the nature of category ambiguity in auditory processing and the behaviour of the auditory linguistic cohort in context by measuring neural responses to category-ambiguous stems in minimal phrases (e.g. “to ache badly,” “the clash persisted”), using magnetoencephalography (MEG). For the purposes of this experiment, we restrict our consideration to noun/verb ambiguity, specifically. We use the same category-neutral cohort entropy and phoneme surprisal variables found by Ettinger et al. (2014) to correlate with neural activity, along with new conditional (category-specific) versions meant to reflect the different possibilities for the makeup of a restricted cohort. The aim is to investigate whether the pattern of correlations between neural activity and these cohort measures supports a single or multiple word-forms hypothesis for category ambiguity, and to clarify when and how cohort restriction takes place in syntactic context.

The first set of conditional measures, called form-conditional for these purposes, was calculated by restricting the cohort from which entropy and surprisal are derived to forms tagged in the corpus (SUBTLEX-US; Brysbaert & New, 2009) with the correct category for the context, regardless of whether that category was the sole or primary usage for that form. This reflects the first possibility for the usage of context information, which would simply restrict the cohort to forms whose category could be consistent with the category required by the context. Effectively this means that only words that could never appear in the given category are excluded from consideration. In this cohort, the frequency associated with each word-form was the overall frequency of that form across all category usages.

The second or usage-conditional set of measures was calculated by both restricting the cohort to words whose possible categories were consistent with the category required by the context, and restricting the frequency to that of the category of usage. Therefore in the phrase “the joke,” for example, the frequency contributing to the overall probability distribution of the cohort of “joke” is the frequency of “joke” as a noun. For both conditional measures of surprisal, the cohort of the immediately preceding phoneme was also restricted, unlike in the simplified depiction in Figure 1; we leave investigations of alternative formulations to future work.
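The two conditional restrictions can be sketched side by side. The cohort below is hypothetical, with invented SUBTLEX-style counts and illustrative category assignments ("journey" treated as noun-only, "juggle" as verb-only); it is not the authors' actual materials:

```python
# Hypothetical cohort for "the j...": each form lists its corpus
# frequency broken down by category of usage (invented counts).
COHORT = {
    "joke":    {"noun": 60, "verb": 15},  # category-ambiguous
    "journey": {"noun": 40},              # assumed noun-only here
    "juggle":  {"verb": 25},              # assumed verb-only here
}

def form_conditional(cohort, category):
    """Restrict to forms that can occur in the required category, each
    keeping its overall (cross-category) frequency."""
    return {w: sum(c.values()) for w, c in cohort.items() if category in c}

def usage_conditional(cohort, category):
    """Restrict to the same forms, but count only each form's frequency
    of usage in the required category."""
    return {w: c[category] for w, c in cohort.items() if category in c}
```

In a noun context, both restrictions drop "juggle", but the form-conditional cohort credits "joke" with its full frequency (75) while the usage-conditional cohort credits only its noun usage (60), shifting the probability distribution from which entropy and surprisal are computed.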

We note that strictly speaking it would be possible to use unambiguous words rather than homonyms to test for effects of conditional variables. The overall and category-specific frequencies of a category-unambiguous word are necessarily identical, while diverging significantly for most category-ambiguous words. Because the calculation of entropy and surprisal for any given phoneme takes into account the individual frequencies of all cohort members, conditional entropy or surprisal variables for unambiguous words would actually differ slightly from their standard versions, though much less so than would conditional variables for ambiguous words. The choice to use category-ambiguous words therefore serves to differentiate standard and conditional variables as much as possible. The use of category-ambiguous stems also makes it uniquely possible to compare the effects of two different types of context (noun and verb) on the same word-form. This may allow us to generalise beyond the effects of a single category, or to identify differences between the effects of noun and verb contexts. Similar reasoning applies to using homonyms that we presume to share a single semantic representation rather than those that have two unrelated meanings: this allows for the cleanest possible comparison between our noun and verb conditions.

The pattern of effects of these two conditional measures of cohort entropy and phoneme surprisal should reflect (1) whether the cohort contains single or multiple auditory word-forms in cases of category homonymy and (2) how and when syntactic context affects that cohort. In the next section we will describe the patterns predicted by the strongest competing hypotheses regarding these possibilities.

Predictions

Assuming that restriction of the cohort entails at least constraining its members on the basis of the category required by syntactic context, we predict, under the single word-form view, significant effects of form-conditional cohort entropy and phoneme surprisal as forms with incompatible category designations are removed from the cohort. Significant effects of usage-conditional cohort entropy and phoneme surprisal are predicted to follow form-conditional effects, when member frequencies have subsequently been constrained to the category of usage.

In the multiple word-forms view, however, restriction of members and restriction of frequencies have indistinguishable results. This is because the members of such a cohort already have frequencies that reflect a single category of usage. Restricting to members with a compatible category therefore results directly in a frequency-constrained, category-compatible cohort, as is reflected by usage-conditional measures. A multiple word-forms view, therefore, predicts significant effects of usage-conditional cohort entropy and phoneme surprisal, but, crucially, no form-conditional effects. This is because the form-conditional measures as we have calculated them reflect an intermediate state of the cohort (restricted by members, but not by frequency) that is impossible under the multiple word-forms view.

Figure 1. Visual representation of cohort restriction in syntactic context under single word-form and multiple word-forms hypotheses for category homonymy. Red arrows mark the transition from the first to the second phoneme in an example cohort, in which member forms no longer consistent with the input are excluded. Alternative states of the cohort given category-based restriction on member forms and frequencies are depicted in the next two columns. Changes in total frequency, cohort entropy, and phoneme surprisal are indicated for each of the three restriction scenarios under the two hypotheses, and color-coded by result. For simplicity, the three potential surprisal measures are shown with respect to a single preceding cohort, which is unrestricted.

In either case, significant effects of standard (or unconditioned, unconstrained) cohort entropy and phoneme surprisal may precede conditional effects, though their relative timing depends on the speed with which integration of context information occurs. It should be noted that standard calculations of cohort entropy and phoneme surprisal, as used in previous studies cited here, do assume a single word-form view of the cohort in cases of more than one meaning as well as in cases of more than one category. This is because of the manner in which the source corpora are organised, with a single entry for each word-form. Versions reflecting a multiple word-forms theory for different categories, like our usage-conditional measures, have therefore not previously been tested, and even these make an implicit assumption that single word-forms can be associated with multiple semantic representations.

We present each word in both noun and verb context and include this as a factor in the analysis because of the possibility of influence of the specific category on the process of cohort restriction. These effects are expected to occur in left superior and transverse temporal gyrus, the location of all of the previously cited neural responses sensitive to entropy and surprisal in auditorily presented words. Predictions regarding their directionality and absolute timing depend on the role of entropy and surprisal in processing models. High entropy has been associated with both inhibitory and facilitatory effects on processing, as measured behaviourally, and so far only with a facilitatory effect on neural activity. Because the set of results that stand as precedent is extremely small, and because our hypothesis does not rest on directionality, we do not make a strong commitment in this regard. Entropy effects preceding the second phoneme have not yet been reported.

While surprisal is generally considered to have inhibitory effects on processing, the expected timing of those effects depends on whether or not surprisal is conceived of as a signal of mismatch with acoustic predictions made in auditory cortex, which should begin roughly 100 milliseconds (ms) after stimulus onset (Gagnepain et al., 2012). This is in opposition to the information-theory perspective that lower conditional probability (causing higher surprisal) input drives increased neural activity because of the information gain from allowing a greater number of possibilities to be ruled out. This latter view would predict the occurrence of surprisal effects later than 100 ms and perhaps further from primary auditory cortex. Though it is not entirely clear whether entropy and surprisal effects can be found at the first phoneme and therefore whether these measures can shed light on what occurs at word onset, they should still be able to provide insights into the effect of context on the cohort that would be impossible to glean from other measures.

Methods

Stimuli

The set of stimuli was selected by first taking the full English Lexicon Project (ELP) database (Balota et al., 2007) and restricting it to monomorphemic and monosyllabic words, so as to minimise contributions of morphological or syllabic structure that were unrelated to the question at hand. This list was restricted to words tagged in the SUBTLEX-US database (Brysbaert & New, 2009) as both nouns and verbs but not as any other part of speech. SUBTLEX-US was used for this purpose rather than ELP because of its more sophisticated part-of-speech tagging algorithm (Brysbaert, New, & Keuleers, 2012). Then, within this subset, we additionally restricted our stimulus list to words that also appeared in the CELEX corpus (Baayen, Piepenbrock, & Gulikers, 1995), in order to make use of its inflectional and derivational relationship information.

For each word, the number of senses on WordNet (http://wordnet.princeton.edu) was calculated. Words with more than two WordNet senses were eliminated. Next, words with unrelated homophones were removed from the candidate list. Finally, words were inspected manually and removed if any noun and verb senses were judged to be unrelated (e.g. "fling" (noun) and "fling" (verb), "mug" (noun) and "mug" (verb)). This was done to ensure that polysemy was only present to the extent of allowing noun and verb usages for a single semantic concept, with no other variation.
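The selection procedure above can be sketched as a sequence of filters over lexicon entries. The sketch below is illustrative only: all field names (`n_morphemes`, `wordnet_senses`, etc.) are hypothetical stand-ins for the actual ELP, SUBTLEX-US, CELEX, and WordNet lookups, and the manual noun/verb relatedness judgment is reduced to a precomputed flag.

```python
# Illustrative sketch of the stimulus-selection filters. Field names are
# hypothetical stand-ins for ELP/SUBTLEX-US/CELEX/WordNet lookups, and the
# manual relatedness inspection is reduced to a precomputed flag.
def select_stems(lexicon):
    keep = []
    for entry in lexicon:
        if entry["n_morphemes"] != 1 or entry["n_syllables"] != 1:
            continue  # monomorphemic and monosyllabic only
        if set(entry["pos_tags"]) != {"NN", "VB"}:
            continue  # tagged as both noun and verb, and nothing else
        if not entry["in_celex"]:
            continue  # CELEX entry needed for relational information
        if entry["wordnet_senses"] > 2:
            continue  # at most two WordNet senses
        if entry["has_unrelated_homophone"] or entry["unrelated_nv_senses"]:
            continue  # drop unrelated homophones and unrelated N/V senses
        keep.append(entry["word"])
    return keep
```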

This procedure resulted in 97 monomorphemic, monosyllabic noun/verb-ambiguous stems with no homophony or polysemy beyond that inherent in category homonymy. These words were recorded in a sound-proof room by a male native speaker of English, who read the stimuli from a list that did not indicate category. Recordings were normalised at 75 dB and resampled to 11.025 kHz using Praat software (Boersma & Weenink, 2014). Phoneme boundaries were annotated using the Penn Forced Aligner (Yuan & Liberman, 2008), whose output was adjusted manually based on visual and auditory inspection.


Task

Stems were presented in three-word phrases, such as "to gleam brightly" or "the clash persisted," which were spliced together from individually recorded tokens and comprised three conditions: noun context, verb context, and ambiguous context. In the noun and verb contexts, the first word in the phrase was "the" or "to," such that the context word and the stem together formed a noun phrase (NP) or an infinitival verb phrase (VP). The first two words always formed an acceptable NP or VP, but the third word might or might not form an acceptable phrase in combination with the NP or VP, as in examples (1)–(4) below, where * indicates an unacceptable phrase.

(1) the clash persisted
(2) *the frown darkly
(3) to gleam brightly
(4) *to bribe amount

The experimental task was to respond as to whether the third word formed an acceptable phrase in combination with the first two words. Participants were told that their response should reflect the acceptability of the phrase as a self-contained unit. In example (2) above, "the frown darkly" could plausibly continue in a sentence like "The frown darkly portended events to come," but is unacceptable in our task because it is not an acceptable phrase by itself. In the ambiguous condition, the first word of the phrase was instead the nonword "juh." Participants were instructed to ignore the nonword when it occurred and therefore consider simply the combination of the second word and the third word in their phrase judgment, as in the examples below:

(5) juh ache prone
(6) *juh gulp young

The context words ("to," "the," and "juh") were constructed from a single vowel token cross-spliced in Praat (Boersma & Weenink, 2014) with the appropriate onset, so that the auditory contexts preceding each target would be maximally informative in terms of category but minimally different in terms of acoustics. The "to" token was therefore pronounced "tuh," as in speeded pronunciation of verb phrases (see Tanenhaus and Donnenwerth-Nolan (1984), who employed the same strategy so that infinitival "to" would not be homophonous with "two"). Because the resulting three tokens were extremely similar and not easy to distinguish without paying close attention, the third word of each phrase was selected so that a correct acceptability response from a participant meant that only the intended context could possibly have been interpreted. For example, in the phrase "to ache chart," correctly indicating that the phrase is unacceptable would mean that only an interpretation of "to ache" + "chart" could have occurred, because an interpretation of "the ache" + "chart" or "juh ache" + "chart" would lead the participant to accept the phrase.

This stipulation required that in the noun and verb conditions all target stems were embedded in phrases that were unacceptable, and all stems in the ambiguous condition were embedded in acceptable phrases. The asymmetry is in principle irrelevant because the response to the third word is not analysed, and the stem occurs before the acceptable or unacceptable third word. However, for a completely balanced design, and so that the task could not be completed without fully parsing the phrases, we also included filler phrases in the noun and verb conditions that were acceptable and phrases in the ambiguous condition that were unacceptable. These were not included for analysis because the participant response would not allow confirmation that the correct interpretation had taken place. This is because, for example, "X ache badly" would be acceptable whether "X" was perceived as "to" or as "juh."

Furthermore, we included a variety of other fillers to ensure balance across a variety of other features, including category ambiguity, syllable structure, and morpheme structure. Therefore in each condition we included category-unambiguous but monomorphemic and monosyllabic stems, and both category-ambiguous and category-unambiguous monomorphemic, bisyllabic stems. In addition, both category-ambiguous and category-unambiguous, and both monosyllabic and bisyllabic, stems were presented followed by an affix, such that participants could not assume that all targets were monomorphemic. This was done for the noun and ambiguous conditions, in both acceptable and unacceptable phrases, but bimorphemic words could not be presented felicitously in infinitival verb phrases.

Each target stem was presented four times: in the noun, verb, and ambiguous conditions, and in one of the various filler conditions. Fillers were either part of an acceptable phrase for the noun and verb conditions or an unacceptable phrase for the ambiguous condition, or were followed by an affix in the bimorphemic filler condition.

To avoid effects of order of presentation, a Latin Square design was employed for the lists. Stimuli were divided into four blocks, and each word appeared only once in each block. The order within the blocks was randomised. A unique list was then created for each possible order of the four blocks (24 in total). Each subject saw a different list, but all items were seen by all subjects.
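The list construction can be sketched as follows. This is a minimal illustration, assuming four blocks whose internal order is randomised once before the 24 block-order permutations are generated (the description above is compatible with other shuffling schemes as well):

```python
from itertools import permutations
import random

def make_presentation_lists(blocks, seed=0):
    """Shuffle within each block once, then build one presentation list per
    possible block order (4! = 24 lists for four blocks)."""
    rng = random.Random(seed)
    shuffled = []
    for block in blocks:
        block = list(block)
        rng.shuffle(block)  # randomise order within the block
        shuffled.append(block)
    lists = []
    for order in permutations(range(len(shuffled))):
        lists.append([item for i in order for item in shuffled[i]])
    return lists
```

Each of the resulting lists contains every item, so assigning one list per subject means all items are seen by all subjects while block order is counterbalanced.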


Continuous variables

Phoneme surprisal and cohort entropy variables were calculated following the procedure described in Ettinger et al. (2014) (see also Figure 1). The conditional phoneme surprisal and cohort entropy variables were created by manipulating the cohort used in these equations. For the form-conditional measures, the cohort from which the probability distributions were calculated (the SUBTLEX-US database (Brysbaert & New, 2009), via the English Lexicon Project (Balota et al., 2007)) was restricted to entries whose list of possible categories (per SUBTLEX-US) included the category of presentation. The frequency values for these entries were the overall corpus frequency value, across all category usages. For the usage-conditional measures, the cohort was restricted as described above, but the frequency was also restricted so that the frequency used for the probability distribution was the frequency of usage within the category of presentation.
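As a concrete illustration of these three cohort variants, the sketch below computes cohort entropy and phoneme surprisal over a toy frequency lexicon. The data layout (`cats`, `freq`, `freq_by_cat`) is a hypothetical stand-in for the actual SUBTLEX-US/ELP format, and word forms are written as phoneme strings for simplicity.

```python
import math

def restrict(entries, category=None, usage_conditional=False):
    """Build a {word form: frequency} cohort source.
    category=None                  -> standard (full lexicon, total frequencies)
    category set, usage_cond=False -> form-conditional (category-compatible
                                      forms, but still total frequencies)
    category set, usage_cond=True  -> usage-conditional (category-compatible
                                      forms, category-specific frequencies)"""
    out = {}
    for word, e in entries.items():
        if category is not None and category not in e["cats"]:
            continue
        out[word] = e["freq_by_cat"][category] if usage_conditional else e["freq"]
    return out

def cohort(freqs, prefix):
    # All word forms still consistent with the phoneme input so far.
    return {w: f for w, f in freqs.items() if w.startswith(prefix)}

def cohort_entropy(freqs, prefix):
    c = cohort(freqs, prefix)
    total = sum(c.values())
    return -sum(f / total * math.log2(f / total) for f in c.values())

def phoneme_surprisal(freqs, prefix, next_phoneme):
    # -log2 of the conditional probability of the next phoneme, i.e. the
    # frequency mass surviving the transition divided by the mass before it.
    before = sum(cohort(freqs, prefix).values())
    after = sum(cohort(freqs, prefix + next_phoneme).values())
    return -math.log2(after / before)
```

For example, restricting a toy lexicon to verb-compatible entries with `restrict(entries, "VB")` keeps total frequencies (form-conditional), while `restrict(entries, "VB", usage_conditional=True)` swaps in verb-usage frequencies (usage-conditional).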

Standard, form-conditional, and usage-conditional cohort entropy and phoneme surprisal values were calculated for each phoneme in each word. Regression tests time-locked to a given phoneme boundary, as described in the Analysis section of this paper, therefore used the predictor value relevant to that specific phoneme. Correlation tables for the relevant variables at each phoneme are included in the supplementary materials.

Participants

Twenty-four right-handed (assessed using the Edinburgh Handedness Inventory; Oldfield, 1971) native English speakers with normal vision and hearing were recruited from the New York University (NYU) Abu Dhabi community, and gave informed consent before participating. Seven datasets were discarded because of strong visual responses in the baseline period, attributed to a tendency to blink immediately following the fixation cross that initiated each trial. Because baseline correction would have been problematic for these datasets, an additional seven participants (total N = 31) were recruited in order to complete the Latin Square list design. Of the 24 participants used in the analysis, 13 were female, the age range was 18–51 years, and the median age was 20.5 years. Only one participant's age exceeded 40.

Procedure

The experiment took place in the Neuroscience of Language Lab at NYU Abu Dhabi. Before beginning the experiment participants completed a practice session of 38 items and were given an opportunity to ask questions about the task. Participant head shapes were digitised using a Polhemus FastSCAN laser scanner (Polhemus, VT, USA) for later use in coregistration with the Freesurfer (http://surfer.nmr.mgh.harvard.edu/) average brain. Locations of fiducial points (nasion, left and right tragus), three marker points on the forehead, and two pre-auricular marker points were also recorded by FastSCAN, and once the participant was situated in the magnetically shielded room (MSR), marker coils were attached at the pre-auricular and forehead points so that the location and orientation of the head relative to the MEG sensors could be recorded pre- and post-experiment.

Neural data were recorded continuously using a 208-channel axial gradiometer MEG system (Kanazawa Institute of Technology, Kanazawa, Japan) with a 1000 Hz sampling rate and an on-line low-pass filter of 200 Hz.

Stimuli were presented binaurally through tube earphones using PsychToolBox software (Brainard, 1997; Pelli, 1997). In each trial, a fixation cross appeared for 300 ms, followed by a 300 ms blank screen before the first word in the phrase ("to," "the," or "juh") began to play. The rest of the phrase immediately followed. After a 150 ms pause, a question mark appeared on the screen to cue the participant to indicate the acceptability of the phrase (pressing a button with the left middle finger to answer "no" and the left index finger to answer "yes"), and was present until a response was received. This was followed by a 300 ms offset and then a 500 ms fixation cross for feedback: presented in white if the response was correct and red if the response was incorrect. The inter-trial interval was 600 ms. There were a total of 524 trials, divided into four blocks of 131 items, with a break between each block. The length of the break was determined by the participant.

Data processing

MEG160 software (Yokogawa Electric Corporation and Eagle Technology Corporation, Tokyo, Japan) was used to noise-reduce the raw data with the Continuously Adjusted Least Squares Method (CALM; Adachi, Shimogawara, Higuchi, Haruta, & Ochiai, 2001), based on information from eight magnetometer reference channels. Eelbrain (Brodbeck, 2017), a Python package for statistical analysis of MEG and EEG (electroencephalography) data, was then used to process and analyse the data. Eelbrain includes a Python class, MneExperiment, which manages data processing and analysis with functions from MNE-Python (Gramfort et al., 2013, 2014) MEG/EEG data processing software.1 The Freesurfer average brain was first scaled to fit the participant head shape, and then co-registered with the head shape and marker measurements by aligning fiducial points, minimising distance between the scalp and head shape with an Iterative Closest Points algorithm, and making manual adjustments.

A low-pass filter of 40 Hz was applied to the continuous data. Epochs were then time-locked from the onset of the target stems to 1000 ms post-target-onset, and the baseline period was defined as the period 507 to 407 ms pre-target-onset to allow for baseline correction using the period before the first word in the phrase. Trials were excluded if they exceeded an absolute threshold of ±2.0 picotesla over this −507 to 1000 ms time window. Trials contaminated by blinks and other motion artifacts were removed manually. Trials for which the acceptability response was incorrect were also removed from analysis, as well as a small number of trials for which the onset of the target word following the context word was delayed because of an error in the stimulus presentation software. In total, the average number of trials retained for analysis (out of 97 possible for each condition) was 65.8 for the noun condition, 68 for the verb condition, and 59.7 for the ambiguous condition. Pairwise t-tests showed that the number of trials in the noun and verb conditions did not significantly differ (t(23) = −0.943, p = .356), but the number of trials in the ambiguous condition was significantly different from the noun (t(23) = 2.33, p = .029) and verb (t(23) = 3.59, p = .002) conditions.
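The amplitude-based rejection step amounts to dropping any epoch whose absolute peak exceeds the threshold. The sketch below is a simplified stand-in for the actual Eelbrain epoch-rejection tooling; the array layout is an assumption:

```python
import numpy as np

def reject_by_threshold(epochs, threshold=2.0e-12):
    """epochs: (n_trials, n_channels, n_times) array in tesla.
    Drop any trial whose absolute peak exceeds the threshold (here
    2.0 picotesla) anywhere in the epoch window."""
    keep = np.abs(epochs).max(axis=(1, 2)) <= threshold
    return epochs[keep], keep
```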

A noise covariance matrix was estimated using the baseline period, and then regularised. MNE (Gramfort et al., 2014) software's cortically-constrained minimum norm estimation was used to create an ico-4 source space containing 5124 sources. The Boundary Element Model (BEM) was used to compute the forward solution at each source, and this estimated activation was converted to normalised Dynamic Statistical Parametric Map (dSPM; Dale et al., 2000) units with a signal-to-noise ratio of two. From the forward solution, noise covariance matrix, and a grand average of all trials, the inverse solution was created using a fixed orientation of the dipole current, and applied to each trial for each source and time point. Using a fixed orientation constraint for source estimation means that the dipole is cortically constrained, and the current estimates can therefore be positive (directed outward) or negative (directed inward). See Gwilliams, Lewis, and Marantz (2016) for further discussion of the relative merits of using a fixed orientation constraint.

Analysis

Anatomical regions from the Freesurfer Desikan-Killiany Atlas (Desikan et al., 2006) were used to define regions for analysis. The left superior temporal gyrus (STG) and transverse temporal gyrus (TTG) were chosen per the findings of Gagnepain et al. (2012) and Ettinger et al. (2014).

Phoneme boundary timing information extracted by the Penn Forced Aligner (Yuan & Liberman, 2008) was imported and used to mark the onsets of the second, third, and final phoneme for each word, so that response epochs for analysis could be time-locked to phoneme boundaries despite each word having a different length. The mean boundary for the second phoneme was at 136.94 ms (SD 50.34 ms), for the third phoneme was at 257.96 ms (SD 75.24 ms), and for the final phoneme was at 347.95 ms (SD 79.17 ms). For slightly less than half of the target stems, the vowel was not the second phoneme.

Analyses were conducted in time windows following the first, second, third, and final phonemes, independently, using the relevant predictor values for that specific phoneme. This was done because cohort entropy and phoneme surprisal values fluctuate over the course of a word, and the single word-form hypothesis allows for the possibility of temporally distinct, incremental effects of the conditional variables. This analysis required a choice between time-locking from important, especially informative constituents, like the onset, vowel, and offset, or considering a single response from each individual time point, at a lag, as was done by Ettinger et al. (2014). We consider the approach described here (time-locking from four different phoneme boundaries) to be a compromise, in that it tracks the temporal progression of our continuous measures as the word unfolds but also allows for the importance of and variation among phonemic constituents in a set of stimuli whose phoneme boundaries naturally vary.
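Given the 1000 Hz sampling rate, extracting a phoneme-locked analysis window from word-onset-locked data amounts to simple slicing. The function below is an illustrative sketch, not the actual Eelbrain code; the array layout is an assumption:

```python
import numpy as np

def phoneme_locked_windows(trial, phoneme_onsets_ms, start_ms=150, end_ms=450,
                           sfreq=1000.0):
    """trial: (n_sources, n_samples) source estimate time-locked to word onset.
    phoneme_onsets_ms: {label: onset in ms} from the forced alignment.
    Returns the start_ms-end_ms window following each phoneme onset."""
    windows = {}
    for label, onset in phoneme_onsets_ms.items():
        a = int(round((onset + start_ms) * sfreq / 1000.0))
        b = int(round((onset + end_ms) * sfreq / 1000.0))
        windows[label] = trial[:, a:b]
    return windows
```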

Ettinger et al. (2014) calculated entropy and surprisal values corresponding to the current segment at each millisecond, and then, for each millisecond's entropy or surprisal variable, tested for correlation with neural activity at a point 200 ms later. The earliest effect of cohort entropy found was for entropy calculated at 135 ms after word onset (which corresponds to the mean onset of the second phoneme in our stimuli), tested for correlation with neural activity 200 ms later, or 335 ms after word onset. Phoneme surprisal effects were not found until the end of the word. We therefore use time windows 150–450 ms following the onset of each phoneme: the span from the first through the final phoneme window therefore safely encompasses Ettinger et al.'s (2014) lag of 200 ms, and allows for the possibility of effects stretching past word offset.

We used a two-step analysis process for each phoneme boundary, involving, first, ordinary least squares (OLS) regression on neural activity from all trials at each source/time point, and then permutation cluster testing on the partial regression coefficients. Thus, for every subject, a beta value was computed for each predictor in a model at each time/source point of neural activity.

Noun and verb targets were assessed as a single group, and at each phoneme the regression model included the (centred) standard, form-conditional, and usage-conditional versions of the relevant variable (cohort entropy or phoneme surprisal), as well as an effects-coded binary context factor (noun or verb) and its interactions with the three continuous variables. The cohort entropy and phoneme surprisal variables were first evaluated in two separate models at each phoneme due to computational limitations at the time of the analysis; as described in the Follow-up analyses section, unified models containing both the entropy and the surprisal variables were evaluated when this became possible. Ambiguous targets were assessed separately from noun and verb targets, with regression models containing only the standard variables (since the conditional variables were not expected to be relevant).
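A minimal sketch of this per-point regression, with centred continuous predictors, an effects-coded context factor, and their interactions, might look as follows. The array layout is assumed; the actual analysis was run through Eelbrain.

```python
import numpy as np

def pointwise_betas(neural, std, form, usage, context):
    """neural: (n_trials, n_points) activity at a set of source/time points;
    std, form, usage: (n_trials,) continuous predictors;
    context: (n_trials,) effects-coded factor (e.g. +1 noun, -1 verb).
    Returns (8, n_points) OLS coefficients: intercept, three main effects of
    the continuous variables, context, and the three context interactions."""
    std = std - std.mean()      # centre the continuous predictors
    form = form - form.mean()
    usage = usage - usage.mean()
    X = np.column_stack([
        np.ones_like(std), std, form, usage, context,
        std * context, form * context, usage * context,
    ])
    betas, *_ = np.linalg.lstsq(X, neural, rcond=None)
    return betas
```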

For each predictor, the beta values computed for each source and time point in each trial were then used as the raw data for spatiotemporal cluster tests, conducted as follows.

In windows time-locked 150–450 ms from the onset of the phoneme, in a single, merged region of interest consisting of left STG and TTG, one-sample, repeated measures t-tests were performed on the beta values for a given predictor, using permutation cluster testing to find spatiotemporal clusters of significance (see Holmes and Friston (1998) on accounting for subject as a random effect in this manner). More specifically, for each cluster of source/time points with t-values equivalent to uncorrected p < .05, summed t-values were evaluated against a distribution of the summed t-values of the largest clusters found in each of 10,000 permutations of the original data. In this way we corrected for multiple comparisons across space and time (Maris & Oostenveld, 2007).
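The cluster-based permutation logic can be illustrated with a simplified, temporal-only version of the test: one source, sign-flip permutations of the per-subject betas, and a small permutation count for speed. This is a sketch of the general technique, not the Eelbrain spatiotemporal implementation (which clusters over sources and time and used 10,000 permutations).

```python
import numpy as np
from scipy import stats

def cluster_permutation_test(betas, n_permutations=1000, p_thresh=0.05, seed=0):
    """betas: (n_subjects, n_times) per-subject coefficients at one source.
    Returns (cluster_sum, corrected_p) for each observed temporal cluster."""
    n = betas.shape[0]
    t_crit = stats.t.ppf(1 - p_thresh / 2, n - 1)  # uncorrected p < .05

    def cluster_sums(x):
        # Sum t-values over contiguous, same-sign suprathreshold runs.
        t = stats.ttest_1samp(x, 0.0, axis=0).statistic
        sums, cur, sign = [], 0.0, 0
        for v in t:
            s = 1 if v > t_crit else (-1 if v < -t_crit else 0)
            if s != 0 and s == sign:
                cur += v                 # extend the current cluster
            else:
                if sign != 0:
                    sums.append(cur)     # close the previous cluster
                cur, sign = (v, s) if s != 0 else (0.0, 0)
        if sign != 0:
            sums.append(cur)
        return sums

    observed = cluster_sums(betas)
    rng = np.random.default_rng(seed)
    null_max = np.empty(n_permutations)
    for i in range(n_permutations):
        flips = rng.choice([-1.0, 1.0], size=n)[:, None]  # sign-flip subjects
        null_max[i] = max((abs(c) for c in cluster_sums(betas * flips)),
                          default=0.0)
    # Compare each observed cluster mass against the max-cluster null.
    return [(c, float(np.mean(null_max >= abs(c)))) for c in observed]
```

Evaluating each observed cluster mass against the distribution of maximum cluster masses across permutations is what provides the correction for multiple comparisons.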

Form vs usage-conditional effects

To address the concern that this analysis technique can indicate only the existence of form or usage-conditional effects, and not that one is stronger than the other at any given phoneme (as is important for our hypothesis), we also performed follow-up t-tests on the averaged beta values for form and usage-conditional variables within the time and source points of clusters that were found for either type of variable. For all of these tests, when a cluster for a form or usage-conditional effect was found, the null hypothesis that there was no difference between the beta values for the form and usage-conditional measures within the source and time points of that cluster could be rejected with p < .01. The results of these tests are indicated in the last columns of Tables 1–5, when applicable.
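This follow-up comparison can be sketched as a paired t-test on cluster-averaged betas. The array layout (per-subject coefficient maps plus a boolean cluster mask) is an assumption for illustration:

```python
import numpy as np
from scipy import stats

def compare_betas_in_cluster(beta_form, beta_usage, cluster_mask):
    """beta_form, beta_usage: (n_subjects, n_sources, n_times) coefficient
    maps; cluster_mask: boolean (n_sources, n_times) marking the cluster.
    Average each subject's betas over the cluster's source/time points, then
    run a paired t-test between the form- and usage-conditional averages."""
    form_avg = beta_form[:, cluster_mask].mean(axis=1)
    usage_avg = beta_usage[:, cluster_mask].mean(axis=1)
    return stats.ttest_rel(form_avg, usage_avg)
```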

Table 1. Phoneme surprisal effects at second phoneme.

Variable         | Time range | Location | Direction of neural activity | Direction of correlation   | Comparison with usage-conditional
form-conditional | 215–450 ms | STG      | negative                     | positive (N), negative (V) | t(23) = −5.25, p < 0.001
standard         | 235–450 ms | STG      | negative                     | negative                   |
form-conditional | 360–450 ms | TTG/STG  | negative                     | positive (N), negative (V) | t(23) = −3.53, p = 0.002

Notes: Summary of significant clusters (p < 0.05) from phoneme surprisal model. All locations noted are in the left hemisphere. Main effects are indicated with a single direction of correlation; interactions with context indicate the direction of correlation in the noun context condition (N) and verb context condition (V) separately.

Table 2. Phoneme surprisal effects at third phoneme.

Variable         | Time range | Location | Direction of neural activity | Direction of correlation   | Comparison with usage-conditional
form-conditional | 195–385 ms | TTG/STG  | negative                     | negative (N), positive (V) | t(23) = 3.82, p = 0.001
standard         | 205–385 ms | TTG/STG  | negative                     | positive (N), negative (V) |
form-conditional | 320–450 ms | STG      | negative                     | negative (N), positive (V) | t(23) = 3.66, p = 0.001

Notes: Summary of significant clusters (p < 0.05) from phoneme surprisal model. All locations noted are in the left hemisphere. Main effects are indicated with a single direction of correlation; interactions with context indicate the direction of correlation in the noun context condition (N) and verb context condition (V) separately.


Results

Behavioural data

Mean accuracy, across all conditions and subjects, was 85.05%. Accuracy was 88.06% in the noun phrase condition, 87.8% in the verb phrase condition, and 79.3% in the ambiguous phrase condition. Mean reaction time was 1.477 seconds (1.463 for the noun phrase condition, 1.499 for the verb phrase condition, and 1.471 for the ambiguous phrase condition). Pairwise t-tests revealed that accuracy rates in the noun and verb conditions were not significantly different from each other (t(23) = .108, p = .915), but that accuracy in the ambiguous condition differed significantly from both the noun (t(23) = −3.26, p = .003) and the verb (t(23) = −3.43, p = .002) conditions. Reaction time did not differ significantly between conditions.

We attribute the difference in accuracy rates between the noun/verb and ambiguous conditions to the difficulty of interpreting the "juh" token, but do not take this to have any implication for the analysis because the ambiguous and noun/verb conditions are never directly compared. Because the task was used only to ensure attention, we did not analyse the behavioural data in any further depth.

Neural data

No effects of phoneme surprisal or cohort entropy were found in the ambiguous context condition. The rest of this section describes results from the noun and verb context conditions. For each significant cluster found we include: (1) a plot of the time course of beta values for the given predictor (calculated separately for each condition in the case of interactions), averaged over all sources in the cluster, (2) maps of summed t-values at each source point in the cluster, indicating the spatial extent of the cluster and the strength of the contribution from each source, and (3) a plot of the time course of neural activity, averaged over all sources in the cluster, and separated by condition. Time course plots are shaded in the time range of the cluster. Figure 2 shows phoneme surprisal clusters following the second phoneme onset, Figure 3 shows phoneme surprisal clusters following the third phoneme onset, and Figure 4 shows phoneme surprisal clusters following the final phoneme onset. Tables 1–3 provide summaries and additional details about these results.

Table 3. Phoneme surprisal effects at final phoneme.

Variable          | Time range | Location | Direction of neural activity | Direction of correlation   | Comparison with form-conditional
usage-conditional | 170–335 ms | TTG/STG  | negative                     | positive (N), negative (V) | t(23) = 2.91, p = 0.008
usage-conditional | 180–400 ms | TTG/STG  | positive                     | negative                   | t(23) = 2.89, p = 0.008
usage-conditional | 180–450 ms | STG      | negative                     | positive                   | t(23) = −3.82, p = 0.001

Notes: Summary of significant clusters (p < 0.05) from phoneme surprisal model. All locations noted are in the left hemisphere. Main effects are indicated with a single direction of correlation; interactions with context indicate the direction of correlation in the noun context condition (N) and verb context condition (V) separately.

Table 4. Phoneme surprisal effects from unified model.

Phoneme | Variable          | Time range | Location | Direction of neural activity | Direction of correlation   | Comparison with usage-conditional
second  | form-conditional  | 340–450 ms | TTG/STG  | negative                     | positive (N), negative (V) | t(23) = −3.22, p = 0.004
third   | form-conditional  | 150–450 ms | TTG/STG  | negative                     | negative (N), positive (V) | t(23) = 5.24, p < 0.001
third   | standard          | 320–450 ms | TTG/STG  | negative                     | positive (N), negative (V) |
final   | usage-conditional | 190–370 ms | STG      | negative                     | positive                   | t(23) = −3.04, p = 0.006

Notes: Summary of significant clusters (p < 0.05) from unified surprisal and entropy models. No entropy effects were found, so all effects listed are for phoneme surprisal. All locations noted are in the left hemisphere. Main effects are indicated with a single direction of correlation; interactions with context indicate the direction of correlation in the noun context condition (N) and verb context condition (V) separately.

Table 5. Phoneme surprisal effects from unified model in broader language areas.

Phoneme | Variable         | Time range | Location | Direction of neural activity | Direction of correlation   | Comparison with usage-conditional
second  | form-conditional | 150–370 ms | LATL     | positive                     | negative (N), positive (V) | t(23) = 3.70, p = 0.001
second  | form-conditional | 230–325 ms | IFG      | positive                     | negative (N), positive (V) | t(23) = 2.85, p = 0.009
second  | standard         | 275–450 ms | IFG      | positive                     | positive                   |
second  | form-conditional | 340–450 ms | IFG      | positive                     | negative (N), positive (V) | t(23) = 2.87, p = 0.009
third   | form-conditional | 150–450 ms | TTG/STG  | negative                     | negative (N), positive (V) | t(23) = 5.24, p < 0.001
third   | form-conditional | 190–450 ms | STG      | positive                     | positive (N), negative (V) | t(23) = −5.52, p < 0.001
third   | form-conditional | 270–450 ms | IFG      | positive                     | positive (N), negative (V) | t(23) = −3.61, p = 0.001
third   | standard         | 320–450 ms | TTG/STG  | negative                     | positive (N), negative (V) |

Notes: Summary of significant (p < 0.05) clusters from unified surprisal and entropy models, for tests conducted in broader set of language-related areas. No entropy effects were found, so all effects listed are for phoneme surprisal. Italicised entries indicate clusters that did not survive correction over both labels. All locations noted are in the left hemisphere. Main effects are indicated with a single direction of correlation; interactions with context indicate the direction of correlation in the noun context condition (N) and verb context condition (V) separately.

First phoneme

No significant clusters of correlation were found between neural activity and any of the cohort entropy or phoneme surprisal variables in the 150–450 ms window following the first phoneme onset.

Second phoneme

Two significant clusters indicating an interaction between form-conditional phoneme surprisal and context were found for noun/verb targets in a time window 215–450 ms after the onset of the second phoneme in STG (p = .007) and 360–450 ms after the onset of the second phoneme in TTG/STG (p = .048) (Figure 2). Nouns showed a positive correlation of form-conditional phoneme surprisal with negative neural activity (a facilitatory effect), while verbs showed a weakly negative correlation of form-conditional phoneme surprisal with negative neural activity (an inhibitory effect).

A significant (p = .028) cluster indicating a main effect of standard phoneme surprisal was found for noun/verb targets in a time window 235–450 ms after the onset of the second phoneme in STG (Figure 2). The cluster showed a negative correlation of standard phoneme surprisal with negative neural activity (an inhibitory effect).

We note that, in addition, a significant (p = .012) cluster indicating an interaction between form-conditional cohort entropy and context was found for noun/verb targets in a time window 335–450 ms after the onset of the second phoneme in STG (not pictured). Nouns showed a positive correlation of form-conditional cohort entropy with negative neural activity (a facilitatory effect), while verbs showed a weakly negative correlation of form-conditional cohort entropy with negative neural activity (an inhibitory effect). This was the only cohort entropy effect that appeared in our data, and it was not significant when the entropy and surprisal variables were evaluated in a unified model (see Follow-up analyses).

Figure 2. Phoneme surprisal effects at second phoneme. Significant spatiotemporal clusters for phoneme surprisal from permutation cluster tests on partial regression coefficients computed at each source/time point in a merged left STG/TTG ROI, in the period 150–450 ms after phoneme onset. Time course plots are shaded in the time range of the cluster. Correlation time course plots (left) are time-locked to the phoneme boundary and show beta values averaged over all sources in the cluster. In the case of interaction effects, the legend indicates the context (noun or verb phrase) in which the word appeared and plotted beta values are calculated for the two conditions individually, rather than for the interaction. For main effects, the two conditions are plotted together. Brain plots (middle) show summed t-values at each cluster source point. Source time course plots (right) show neural activity (in Dynamic Statistical Parametric Map units (dSPM)) averaged over all sources in the cluster, separated by condition.

Third phoneme

Two spatially distinct significant (p = .011, p = .037) clusters were found indicating interactions between form-conditional phoneme surprisal and context for noun/verb targets in a time window 195–385 ms after the onset of the third phoneme in TTG/STG, and 320–450 ms after the onset of the third phoneme in STG (Figure 3). In both clusters, nouns showed a negative correlation of form-conditional phoneme surprisal with negative neural activity (an inhibitory effect), while verbs showed a weakly positive correlation of form-conditional phoneme surprisal with negative neural activity (a facilitatory effect).

A significant (p = .033) cluster indicating an interaction between standard phoneme surprisal and context for noun/verb targets in a time window 205–385 ms after the onset of the third phoneme was found in TTG/STG (Figure 3). Nouns showed a positive correlation of standard phoneme surprisal with negative neural activity (a facilitatory effect), while verbs showed a weakly negative correlation of standard phoneme surprisal with negative neural activity (an inhibitory effect).

Figure 3. Phoneme surprisal effects at third phoneme. Significant spatiotemporal clusters for phoneme surprisal from permutation cluster tests on partial regression coefficients computed at each source/time point in a merged left STG/TTG ROI, in the period 150–450 ms after phoneme onset. Time course plots are shaded in the time range of the cluster. Correlation time course plots (left) are time-locked to the phoneme boundary and show beta values averaged over all sources in the cluster. As all of these clusters are for interaction effects, the legend indicates the context (noun or verb phrase) in which the word appeared and plotted beta values are calculated for the two conditions individually, rather than for the interaction. Brain plots (middle) show summed t-values at each cluster source point. Source time course plots (right) show neural activity (in Dynamic Statistical Parametric Map units (dSPM)) averaged over all sources in the cluster, separated by condition.

14 P. GASTON AND A. MARANTZ

Final phoneme

Two spatially distinct significant (p = .028, p = .004) clusters indicating main effects of usage-conditional phoneme surprisal were found for noun/verb targets in a time window 180–400 ms after the onset of the final phoneme in TTG/STG and 180–450 ms after the onset of the final phoneme in STG (Figure 4). The cluster in STG showed a positive correlation of usage-conditional phoneme surprisal with negative neural activity (a facilitatory effect). The cluster in TTG/STG showed a negative correlation of usage-conditional phoneme surprisal with positive neural activity (also a facilitatory effect).

An additional significant (p = .045) cluster indicating an interaction between usage-conditional phoneme surprisal and context was found for noun/verb targets in a time window 170–355 ms after the onset of the final phoneme in TTG/STG (Figure 4). Nouns showed a positive correlation of usage-conditional phoneme surprisal with negative neural activity (a facilitatory effect), while verbs showed a weakly negative correlation of usage-conditional phoneme surprisal with negative neural activity (an inhibitory effect).

Follow-up analyses

Unified model for entropy and surprisal

It eventually became possible, after our primary analysis was completed, to evaluate a unified model containing all of the cohort entropy and phoneme surprisal variables for a given phoneme, rather than testing entropy and surprisal variables in separate models. Using these unified models, we found no effects of cohort entropy or phoneme surprisal at the first phoneme. At the second phoneme, we found a significant cluster (p = 0.039) indicating an interaction between form-conditional phoneme surprisal and context in a window 340–450 ms after phoneme onset in TTG/STG. Nouns showed a positive correlation of form-conditional phoneme surprisal with negative neural activity (a facilitatory effect), while verbs showed a weakly negative correlation of form-conditional phoneme surprisal with negative neural activity (an inhibitory effect).

Figure 4. Phoneme surprisal effects at final phoneme. Significant spatiotemporal clusters for phoneme surprisal from permutation cluster tests on partial regression coefficients computed at each source/time point in a merged left STG/TTG ROI, in the period 150–450 ms after phoneme onset. Time course plots are shaded in the time range of the cluster. Correlation time course plots (left) are time-locked to the phoneme boundary and show beta values averaged over all sources in the cluster. Brain plots (middle) show summed t-values at each cluster source point. Source time course plots (right) show neural activity (in Dynamic Statistical Parametric Map units (dSPM)) averaged over all sources in the cluster, separated by the condition (noun or verb phrase) in which the word appeared.


At the third phoneme, spanning STG and TTG, we found another significant cluster (p < 0.001) indicating an interaction between form-conditional phoneme surprisal and context throughout the full analysis window (150–450 ms). Nouns showed a negative correlation of form-conditional phoneme surprisal with negative neural activity (an inhibitory effect), while verbs showed a weakly positive correlation of form-conditional phoneme surprisal with negative neural activity (a facilitatory effect). We also found a significant cluster (p = 0.012) indicating an interaction between standard phoneme surprisal and context in a window 320–450 ms after the onset of the third phoneme. Nouns showed a positive correlation of standard phoneme surprisal with negative neural activity (a facilitatory effect), while verbs showed a weakly negative correlation of standard phoneme surprisal with negative neural activity (an inhibitory effect).

Finally, at the final phoneme, we found a significant cluster (p = 0.029) indicating a main effect of usage-conditional phoneme surprisal in a window 190–370 ms after phoneme onset in STG. Both nouns and verbs showed a positive correlation of usage-conditional phoneme surprisal with negative neural activity (a facilitatory effect). These results are listed in Table 4 and illustrated in Figure 5.

Figure 5. Unified model. Significant spatiotemporal clusters from permutation cluster tests on partial regression coefficients of phoneme surprisal and cohort entropy variables, computed at each source/time point in a merged left STG/TTG ROI, in the period 150–450 ms after each phoneme onset. Correlation time course plots (left) are time-locked to the phoneme boundary and show beta values averaged over all sources in the cluster. In the case of interaction effects, the legend indicates the context (noun or verb phrase) in which the word appeared and plotted beta values are calculated for the two conditions individually, rather than for the interaction. For main effects, the two conditions are plotted together. Brain plots (middle) show summed t-values at each cluster source point. Source time course plots (right) show neural activity (in Dynamic Statistical Parametric Map units (dSPM)) averaged over all sources in the cluster, separated by the condition (noun or verb phrase) in which the word appeared.


Left-hemisphere language areas

Finally, we conducted an exploratory analysis of a broader set of left-hemisphere, language-related brain areas to determine whether the observed effects extended beyond STG and TTG, using the unified model including both entropy and surprisal variables. The extended ROI included one label for left inferior frontal gyrus (IFG) and a second label for the left temporal and inferior parietal lobe. The results of this analysis are listed in Table 5 and illustrated in Figure 6. No significant clusters were found at the first or final phoneme.

At the second phoneme, a cluster indicating an interaction of form-conditional surprisal with context was significant (p = 0.0283, corrected over both labels) in the window 150–370 ms after phoneme onset, in left anterior temporal lobe (LATL). An additional three clusters, two indicating interactions with form-conditional surprisal (230–325 ms; 340–450 ms) and one indicating a main effect of standard surprisal (275–450 ms), were significant within IFG (p = 0.045; p = 0.041; p = 0.043) but did not survive correction over both labels.

At the third phoneme, two significant clusters indicating interactions with form-conditional surprisal were found in TTG/STG and STG, in the windows 150–450 ms (p < 0.001, corrected over both labels) and 190–450 ms (p = 0.0183, corrected over both labels) after phoneme onset. A third cluster indicating an interaction of standard phoneme surprisal with context was also significant (p = 0.0457, corrected over both labels) in TTG/STG, 320–450 ms after phoneme onset. As occurred at the second phoneme, a cluster indicating an interaction with form-conditional surprisal (270–450 ms) was significant within IFG (p = 0.007) but did not survive correction over both labels.

Figure 6. Unified model in broader language areas. Significant spatiotemporal clusters from permutation cluster tests on partial regression coefficients of phoneme surprisal and cohort entropy variables, computed at each source/time point across left-hemisphere language-related ROIs (IFG and temporal/inferior parietal lobe), in the period 150–450 ms after each phoneme onset. Italicised entries indicate clusters that did not survive correction over both labels. Plots show summed t-values at each cluster source point.

Discussion

This experiment aimed to clarify the nature and time course of cohort restriction in such a way as to also distinguish between hypotheses of single and multiple word-forms for category homonymy in the auditory cohort. We computed form-conditional and usage-conditional measures of cohort entropy and phoneme surprisal that reflected restriction of forms included in the cohort or restriction of usage probability on the basis of category-specific frequency, respectively. Cohorts with single as opposed to multiple word-forms for category homonyms like “ache” were expected to take shape differently in response to these two restriction mechanisms. For each phoneme, we tested for correlation between neural activity and standard, form-conditional, and usage-conditional entropy and surprisal variables and their interactions with the phrasal context. Entropy and surprisal variables were first assessed in separate models, and then later, when it became computationally feasible, in a single unified model at each phoneme. The results from the unified models constituted essentially a subset of the results from the separate models but yielded the same overall conclusion with respect to our hypotheses; we will focus here on that smaller set of results from the unified models.

Primary findings

Our data showed significant interactions of context with form-conditional phoneme surprisal in TTG and STG, starting 340 ms after the second phoneme onset and 150 ms after the third phoneme onset. This provides clear evidence against a multiple word-forms hypothesis for category homonymy: form-conditional measures reflect a cohort whose membership has been restricted by grammatical category but whose member frequencies are for general rather than category-specific usage. A cohort with multiple word-forms for category homonyms can by definition only be composed of members whose frequencies are for usage within the grammatical category, since each category usage has its own word-form. Form-conditional measures therefore reflect a probability distribution that is not possible for a cohort representing category homonyms with separate word-forms, and should not be significantly correlated with neural activity during auditory processing if the cohort is in fact organised in this way.

Our results subsequently show an effect of usage-conditional phoneme surprisal starting 190 ms after the onset of the final phoneme, in STG. They therefore provide support for a cohort that is constrained in two steps: first restricted to word-forms that could possibly appear in the category required by the context, and then weighted by usage-specific probabilities of those word-forms.
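To make the three restriction regimes concrete, the following sketch computes standard, form-conditional, and usage-conditional phoneme surprisal over a toy three-word cohort. The cohort, the phoneme transcriptions, and all frequency counts are invented for illustration only; the measures in this study were computed over the full lexicon with corpus-derived category-specific frequencies.

```python
from math import log2

# Toy cohort of word-forms consistent with a one-phoneme prefix /eI/.
# Each entry: (form, phoneme list, {category: category-specific frequency}).
# All values are hypothetical stand-ins, not the study's actual counts.
cohort = [
    ("ache",  ["eI", "k"], {"noun": 40, "verb": 10}),  # category homonym
    ("aid",   ["eI", "d"], {"noun": 30, "verb": 20}),
    ("eight", ["eI", "t"], {"num": 50}),
]

def phoneme_surprisal(next_ph, pos, context_cat=None, usage=False):
    """-log2 P(next phoneme | cohort), under three restriction regimes.

    context_cat=None             -> standard: whole cohort, total frequencies
    context_cat set, usage=False -> form-conditional: membership restricted to
                                    forms usable in the category, but weighted
                                    by total (cross-category) frequency
    context_cat set, usage=True  -> usage-conditional: additionally weighted
                                    by category-specific frequency

    Assumes the heard phoneme has nonzero mass in the restricted cohort.
    """
    def weight(freqs):
        if context_cat is None:
            return sum(freqs.values())
        if context_cat not in freqs:
            return 0  # form excluded from the restricted cohort
        return freqs[context_cat] if usage else sum(freqs.values())

    total = sum(weight(f) for _, phs, f in cohort)
    match = sum(weight(f) for _, phs, f in cohort if phs[pos] == next_ph)
    return -log2(match / total)
```

On this toy cohort, hearing /k/ as the second phoneme in a noun context yields progressively lower surprisal as the cohort is restricted: the form-conditional measure drops “eight” from the denominator, and the usage-conditional measure further down-weights “ache” and “aid” to their noun-specific frequencies.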

We had expected any effects of standard phoneme surprisal or cohort entropy to precede those of conditional measures, but in fact we found the only standard phoneme surprisal effect in our data (again, an interaction) starting 320 ms after the onset of the third phoneme, following form-conditional effects for the second and third phonemes. This suggests that our first hypothesised step of cohort constraint, in which forms under consideration are restricted to those with context-compatible category designations, is more accurately characterised as a heightened sensitivity to a subset of the cohort, in contrast to the removal of possibilities that do not fit the context. Boosting good fits, rather than excluding bad fits, could mean that measures reflecting whole-cohort distributions do not yet cease to be relevant (because in some sense the whole cohort is still intact).

Evidence from Strand et al. (2014) that a category-specific cohort plays a role in word recognition is further substantiated here by effects of both form-conditional and usage-conditional variables. Evidence from eye-tracking that constraints on grammatical class immediately exclude competitors that do not fit the context (most relevantly, from Dahan et al. (2000) and Magnuson et al. (2008)) does not find clear support here, though neither is there necessarily a conflict. The first conditional effect in our data from the unified models occurs 340 ms after the onset of the second phoneme (roughly 490 ms into the word) in TTG/STG, our original ROI, and 150 ms after the onset of the second phoneme (roughly 300 ms into the word) in LATL. Cross-modal priming and gating tend to show that competitors are not constrained by context (or at least not completely so) before 200 ms. We do not, however, find effects of any cohort variables, constrained or not, at the first phoneme, and our first unconstrained effect follows rather than precedes the first conditional effect. It is therefore not possible to say whether our results indicate the lower bound on the timing of contextual constraints in this paradigm, or simply the lower bound on the timing of measurable effects of surprisal. Further research using such measures will be necessary to strengthen the expectations we can have regarding when such effects should occur, and how they complement the results of other methodologies.

Secondary considerations

A variety of additional considerations also arise from our data. The first is the lack of entropy effects when phoneme surprisal and cohort entropy are analysed in a single model. Our hypothesis did not rest on whether form-conditional effects manifested via surprisal or entropy, and the experiment as a whole was not intended to distinguish whether cohort entropy or phoneme surprisal better characterises the response to incoming phonemes. Nevertheless, our finding here that it is overwhelmingly phoneme surprisal, not cohort entropy, that significantly correlates with neural activity is an interesting pattern for future research.
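The distinction between the two measures can be stated in a few lines: cohort entropy is the uncertainty over the whole distribution of remaining cohort members, computed before the next phoneme arrives, while phoneme surprisal is the improbability of the specific phoneme actually heard. The cohort and counts below are invented purely to illustrate the contrast:

```python
from math import log2

# Hypothetical cohort after an initial phoneme; frequencies are invented.
freqs   = {"ache": 60, "aid": 30, "eight": 10}
next_ph = {"ache": "k", "aid": "d", "eight": "t"}  # each form's next phoneme

total = sum(freqs.values())
probs = {w: f / total for w, f in freqs.items()}

# Cohort entropy: expected surprisal over the whole cohort distribution.
entropy = -sum(p * log2(p) for p in probs.values())

# Phoneme surprisal: -log2 of the probability mass of cohort members
# consistent with the phoneme actually heard (here, /k/).
p_k = sum(probs[w] for w in probs if next_ph[w] == "k")
surprisal_k = -log2(p_k)
```

Because entropy depends on the full distribution and surprisal only on the mass consistent with the observed phoneme, the two measures can dissociate, which is what makes the pattern reported here interpretable.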


The majority of the effects observed were interactions with context, and a clear generalisation is that, while correlations exhibited by verbs tended to be in the opposite direction of those exhibited by nouns, they were also much weaker. One question this might raise is whether the way in which we calculated our conditional measures was somehow less appropriate for the verb condition than for the noun condition, which would be informative for our understanding of the specific mechanisms of cohort restriction. This could make sense in light of the fact that the cue we used for verbal context, infinitival “to,” should engender more specific phrasal restrictions (beyond those of category) than the cue we used for the noun context, “the,” although it is true that in both cases we would expect that the context conveys more than simply category information. In ignoring this, our method for calculating conditional measures was, of course, a heuristic, which was potentially closer to the truth for nouns than for verbs. Infinitival “to” not only cues the listener to expect a verb (which is the prediction accounted for in the way we calculated our conditional measures) but also makes a strong prediction about the morphological complexity of what will follow: in most cases, a suffix after the root is impossible. This additional potential restriction of the cohort is not taken into account in our verb-specific form- or usage-conditional measures. It should be noted that “to” could in general also introduce a prepositional phrase, though participants knew this was not possible in the current experiment. It is not clear to what extent this type of knowledge can be expected to influence the cohort.

We used infinitival “to” in this experiment because it could be made acoustically quite similar to “the” and therefore allowed us to construct maximally similar auditory phrases across conditions. However, our result suggests that investigations of either a less restrictive verbal context (so that the noun and verb cues would be more similar in their requirements) or conditional measures more specifically tailored to the cue used would be a useful follow-up. Further investigation could clarify whether the pattern of interactions found here is due to a difference between noun and verb contexts in general or reflects differences in the fit of our conditional measures because of the specific, differing predictions allowed by “the” and “to.” As one step toward clarifying this, we extracted the bigram frequencies of each of our “to + X” and “the + X” phrases from the Corpus of Contemporary American English (Davies, 2008), along with the overall frequencies of “to” and “the.” From these frequency values we calculated lexical surprisal for each target word in each phrase, and compared lexical surprisal between the “to” and “the” conditions, but there was no significant difference between them. This makes it unlikely that our interactions with context are due to differences in co-occurrence probability of the words.
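A minimal sketch of that lexical surprisal computation, with invented stand-in counts (the study used the real COCA frequencies, and the word and counts below are hypothetical):

```python
from math import log2

# Hypothetical stand-in counts; the study extracted real values from COCA.
bigram  = {("the", "clash"): 1200, ("to", "clash"): 150}
unigram = {"the": 22_000_000, "to": 16_000_000}

def lexical_surprisal(context, word):
    """-log2 P(word | context), estimated from raw corpus counts."""
    return -log2(bigram[(context, word)] / unigram[context])

# The comparison described in the text asks whether values like these
# differ systematically between the "the" and "to" conditions.
s_noun = lexical_surprisal("the", "clash")
s_verb = lexical_surprisal("to", "clash")
```

With these made-up counts the verb phrase would be the more surprising one, but the point of the check in the study was that, across the real item set, the two conditions did not differ significantly.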

Additional evidence in this vein comes from the fact that the effect that emerged at the end of the word was not an interaction. At 190 ms after the onset of the final phoneme (∼540 ms after the onset of the word), a significant cluster indicated a facilitatory main effect of usage-conditional phoneme surprisal in STG. The fact that we did not find interactions with context for usage-conditional phoneme surprisal complicates a conclusion that the interactions at the second and third phonemes are due solely to a failure of our conditional measures to account for restrictions imposed by “to.” If this were the case, we would expect weaker correlations for the verb conditions across the board.

As for the directionality of the correlations for which significant clusters were found, form-conditional phoneme surprisal was facilitatory at the second phoneme and inhibitory at the third, while standard surprisal and usage-conditional phoneme surprisal were both facilitatory. This is not predicted by any aspect of the models described in the Introduction.

The fact that we found no significant effects of standard cohort entropy or phoneme surprisal for the ambiguous targets deserves acknowledgment as well. This condition most closely approximates the designs of previous studies on whose conclusions this study builds, though the results for this condition do not have bearing on our primary hypothesis. Given the significantly lower accuracy rates that we found for this condition, we attribute the lack of expected response in part to the unusual nature of the phrases (e.g. “juh ache prone”), which may have prevented the ambiguous targets from being processed in a naturalistic way. Future investigations should consider alternative ways to create acoustically parallel ambiguous and contextualised conditions.

Finally, we consider the results from our exploratory analysis of a broader set of language-related areas in the left hemisphere. In addition to the clusters in TTG and STG first identified in our ROI analysis, this broader test indicated correlation with form-conditional surprisal in LATL, and with standard and form-conditional surprisal in IFG, though the latter effects did not survive correction over both labels. Our hypothesis, along with the majority of preceding work, has focused on auditory cortex as the locus of phoneme surprisal and cohort entropy effects. We therefore refrain from speculation as to the potential implications of LATL or IFG involvement, but, again, note it for future work.


Conclusion

Evidence for contextual restriction of the auditory cohort in the manner we have demonstrated is informative for our understanding of the representations that comprise it, the mechanisms by which it is modulable, and the time course on which different types of information can be integrated and utilised. We found significant influences of form-conditional phoneme surprisal, which is sufficient evidence for a single rather than multiple word-forms theory of category homonymy. A parallel effect of unconstrained surprisal also suggests that this cohort “restriction” may entail sensitivity to a subset of cohort members rather than the complete removal from consideration of possibilities that are no longer consistent with category information. Finally, the relevance of usage-conditional phoneme surprisal at the end of the word indicates a second step of restriction, in which word-form probabilities are weighted to take into account category-specific frequency.

These findings strengthen the precedent for the relevance of phoneme surprisal in word processing, and demonstrate that manipulations of this variable can be used to test hypotheses about the structure of the cohort. They also suggest a number of directions for future work. Different types of context, as employed in the many prior experiments we have described here, should be probed with conditional cohort variables. This will require developing more specific hypotheses about the ways in which complex syntactic, morphosyntactic, and semantic restrictions can be translated into inclusion (or exclusion) criteria for auditory word-forms in the cohort, especially when these cues do not fit the traditional profile of a lexical feature.

Note

1. During the course of analysis we discovered that MNE-Python versions 0.9 (used originally) and 0.13 (current at the time of this writing) produced very slight discrepancies. All results reported in this paper were obtained with MNE-Python version 0.13 and eelbrain version 0.25.

Acknowledgements

We are grateful to Laura Gwilliams for data collection and Christian Brodbeck for advice and assistance in implementing the spatiotemporal regression analysis described here. We also thank Tal Linzen for helpful discussion of our conditional measures.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This work was supported by the NYU Abu Dhabi Institute under [grant number G1001].

ORCID

Phoebe Gaston http://orcid.org/0000-0002-0367-7397

References

Adachi, Y., Shimogawara, M., Higuchi, M., Haruta, Y., & Ochiai, M. (2001). Reduction of non-periodic environmental magnetic noise in MEG measurement by continuously adjusted least squares method. IEEE Transactions on Applied Superconductivity, 11(1), 669–672.

Baayen, H., Piepenbrock, R., & Gulikers, L. (1995). CELEX2 LDC96L14 [Web download]. Philadelphia, PA: Linguistic Data Consortium. Retrieved from https://catalog.ldc.upenn.edu/LDC96L14

Baayen, H., Wurm, L. H., & Aycock, J. (2007). Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities. The Mental Lexicon, 2(3), 419–463.

Balling, L. W., & Baayen, H. (2012). Probability and surprisal in auditory comprehension of morphologically complex words. Cognition, 125(1), 80–106.

Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., … Treiman, R. (2007). The English lexicon project. Behavior Research Methods, 39(3), 445–459.

Boersma, P., & Weenink, D. (2014). Praat: Doing phonetics by computer [Computer software]. Retrieved from http://www.praat.org/

Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10(4), 433–436.

Brodbeck, C. (2017). Eelbrain (0.25). doi:10.5281/zenodo.438193

Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41(4), 977–990.

Brysbaert, M., New, B., & Keuleers, E. (2012). Adding part-of-speech information to the SUBTLEX-US word frequencies. Behavior Research Methods, 44(4), 991–997.

Dahan, D., & Magnuson, J. S. (2006). Spoken word recognition. In M. J. Traxler & M. A. Gernsbacher (Eds.), Handbook of psycholinguistics (2nd ed., pp. 249–283). Boston, MA: Elsevier. doi:10.1016/B978-012369374-7/50009-2

Dahan, D., Swingley, D., Tanenhaus, M. K., & Magnuson, J. S. (2000). Linguistic gender and spoken-word recognition in French. Journal of Memory and Language, 42(4), 465–480.

Dahan, D., & Tanenhaus, M. K. (2004). Continuous mapping from sound to meaning in spoken-language comprehension: Immediate effects of verb-based thematic constraints. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(2), 498–513.

Dale, A. M., Liu, A. K., Fischl, B. R., Buckner, R. L., Belliveau, J. W., Lewine, J. D., & Halgren, E. (2000). Dynamic statistical parametric mapping. Neuron, 26(1), 55–67.

Davies, M. (2008). The corpus of contemporary American English (COCA): 520 million words, 1990–present. Retrieved from https://corpus.byu.edu/coca/


Desikan, R. S., Ségonne, F., Fischl, B., Quinn, B. T., Dickerson, B. C.,Blacker, D.,… Killiany, R. J. (2006). An automated labelingsystem for subdividing the human cerebral cortex on MRIscans into gyral based regions of interest. NeuroImage, 31(3), 968–980.

Ettinger, A., Linzen, T., & Marantz, A. (2014). The role of mor-phology in phoneme prediction: Evidence from MEG. Brainand Language, 129, 14–23.

Frauenfelder, U. H., & Tyler, L. K. (1987). The process of spokenword recognition: An introduction. Cognition, 25(1–2), 1–20.

Gagnepain, P., Henson, R. N., & Davis, M. H. (2012). Temporalpredictive codes for spoken words in auditory cortex.Current Biology, 22(7), 615–621.

Gahl, S. (2008). Time and thyme are not homophones: Theeffect of Lemma frequency on word durations in spon-taneous speech. Language, 84(3), 474–496.

Gaskell, M. G., & Marslen-Wilson, W. D. (1997). Integrating formand meaning: A distributed model of speech perception.Language and Cognitive Processes, 12(5–6), 613–656.

Gramfort, A., Luessi, M., Larson, E., Engemann, D. A., Strohmeier,D., Brodbeck, C.,… Hämäläinen, M. (2013). MEG and EEGdata analysis with MNE-Python. Frontiers in Neuroscience, 7.doi:10.3389/fnins.2013.00267

Gramfort, A., Luessi, M., Larson, E., Engemann, D. A., Strohmeier,D., Brodbeck, C.,… Hämäläinen, M. S. (2014). MNE softwarefor processing MEG and EEG data. NeuroImage, 86(Suppl.C), 446–460.

Grosjean, F. (1980). Spoken word recognition processes andthe gating paradigm. Perception & Psychophysics, 28(4),267–283.

Grosjean, F., Dommergues, J.-Y., Cornu, E., Guillelmon, D., &Besson, C. (1994). The gender-marking effect in spokenword recognition. Perception & Psychophysics, 56(5), 590–598.

Gwilliams, L., Lewis, G. A., & Marantz, A. (2016). Functionalcharacterisation of letter-specific responses in time, spaceand current polarity using magnetoencephalography.NeuroImage, 132(Suppl. C), 320–333.

Gwilliams, L., & Marantz, A. (2015). Non-linear processing of alinear speech stream: The influence of morphological struc-ture on the recognition of spoken Arabic words. Brain andLanguage, 147(Suppl. C), 1–13.

Hale, J. (2001). A probabilistic earley parser as a psycholinguisticmodel. Proceedings of the second meeting of the NorthAmerican Chapter of the Association for ComputationalLinguistics on Language technologies (pp. 1–8). Associationfor Computational Linguistics. doi:10.3115/1073336.1073357

Holmes, A. P., & Friston, K. J. (1998). Generalisability, randomeffects & population inference. NeuroImage, 7, S754–S700.

Kemps, R. J. J. K., Wurm, L. H., Ernestus, M., Schreuder, R., &Baayen, H. (2005). Prosodic cues for morphological complex-ity in Dutch and English. Language and Cognitive Processes,20(1–2), 43–73.

King, J., Linzen, T., & Marantz, A. (to appear). Syntactic cat-egories as lexical features or syntactic heads: An MEGapproach. Linguistic Inquiry. Retrieved from http://ling.auf.net/lingbuzz/002477

Luce, P. A. (1986). A computational analysis of uniquenesspoints in auditory word recognition. Perception &Psychophysics, 39(3), 155–158.

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: Theneighborhood activation model. Ear and Hearing, 19(1), 1–36.

Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1990). Similarityneighborhoods of spoken words. In G. Altmann (Ed.),Cognitive models of speech perception: Psycholinguistic andcomputational perspectives (pp. 122–147). Cambridge,MA:MIT Press.

Magnuson, J. S., Dixon, J. A., Tanenhaus, M. K., & Aslin, R. N.(2007). The dynamics of lexical competition during spokenword recognition. Cognitive Science, 31(1), 133–156.

Magnuson, J. S., Tanenhaus, M. K., & Aslin, R. N. (2008).Immediate effects of form-class constraints on spokenword recognition. Cognition, 108(3), 866–873.

Maris, E., & Oostenveld, R. (2007). Nonparametric statisticaltesting of EEG- and MEG-data. Journal of NeuroscienceMethods, 164(1), 177–190.

Marslen-Wilson, W. D. (1980). Speech understanding as apsychological process. In J. C. Simon (Ed.), Spoken languagegeneration and understanding (pp. 39–67). Dordrecht:Springer. doi:10.1007/978-94-009-9091-3_2

Marslen-Wilson, W. D. (1987). Functional parallelism in spokenword-recognition. Cognition, 25(1), 71–102.

Marslen-Wilson, W. D., & Welsh, A. (1978). Processing inter-actions and lexical access during word recognition in con-tinuous speech. Cognitive Psychology, 10(1), 29–63.

McAllister, J. M. (1988). The use of context in auditory word rec-ognition. Perception & Psychophysics, 44(1), 94–97.

McClelland, J. L., & Elman, J. L. (1986). The TRACE model ofspeech perception. Cognitive Psychology, 18(1), 1–86.

McQueen, J. M. (2007). Eight questions about spoken-word rec-ognition. In G. M. Gaskell (Ed.), The Oxford handbook of psy-cholinguistics (pp. 37–53). Oxford: Oxford University Press.doi:10.1093/oxfordhb/9780198568971.013.0003

Moscoso del Prado Martín, F., Kostić, A., & Baayen, H. (2004). Putting the bits together: An information theoretical perspective on morphological processing. Cognition, 94(1), 1–18. doi:10.1016/j.cognition.2003.10.015

Moseley, R. L., & Pulvermüller, F. (2014). Nouns, verbs, objects, actions, and abstractions: Local fMRI activity indexes semantics, not lexical categories. Brain and Language, 132, 28–42. doi:10.1016/j.bandl.2014.03.001

Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52(3), 189–234.

Norris, D., & McQueen, J. M. (2008). Shortlist B: A Bayesian model of continuous speech recognition. Psychological Review, 115(2), 357–395.

Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9(1), 97–113.

Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10(4), 437–442.

Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423.

Shillcock, R. C., & Bard, E. G. (1993). Modularity and the processing of closed-class words. In G. Altmann & R. C. Shillcock (Eds.), Cognitive models of speech processing: The second Sperlonga meeting (pp. 163–185). East Sussex: Lawrence Erlbaum Associates.

Strand, J., Simenstad, A., Cooperman, A., & Rowe, J. (2014). Grammatical context constrains lexical competition in spoken word recognition. Memory & Cognition, 42(4), 676–687.

LANGUAGE, COGNITION AND NEUROSCIENCE 21


Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18(6), 645–659.

Swinney, D. A., & Love, T. (2002). Context effects on lexical processing during auditory sentence comprehension. In E. Witruk, A. D. Friederici, & T. Lachmann (Eds.), Basic functions of language, reading and reading disability (pp. 25–40). Boston, MA: Springer. doi:10.1007/978-1-4615-1011-6_3

Tanenhaus, M. K., & Donnenwerth-Nolan, S. (1984). Syntactic context and lexical access. The Quarterly Journal of Experimental Psychology Section A, 36(4), 649–661.

Tanenhaus, M. K., Leiman, J. M., & Seidenberg, M. S. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18(4), 427–440.

Tyler, L. K. (1984). The structure of the initial cohort: Evidence from gating. Perception & Psychophysics, 36(5), 417–427.

Tyler, L. K., Randall, B., & Stamatakis, E. A. (2008). Cortical differentiation for nouns and verbs depends on grammatical markers. Journal of Cognitive Neuroscience, 20(8), 1381–1389.

Tyler, L. K., & Wessels, J. (1983). Quantifying contextual contributions to word-recognition processes. Perception & Psychophysics, 34(5), 409–420.

Vigliocco, G., Vinson, D. P., Druks, J., Barber, H., & Cappa, S. F. (2011). Nouns and verbs in the brain: A review of behavioural, electrophysiological, neuropsychological and imaging studies. Neuroscience & Biobehavioral Reviews, 35(3), 407–426.

Wurm, L. H., Ernestus, M. T. C., Schreuder, R., & Baayen, H. (2006). Dynamics of the auditory comprehension of prefixed words: Cohort entropies and conditional root uniqueness points. The Mental Lexicon, 1(1), 125–146.

Yuan, J., & Liberman, M. (2008). Speaker identification on the SCOTUS corpus. The Journal of the Acoustical Society of America, 123(5), 3878–3878.

Zwitserlood, P. (1989). The locus of the effects of sentential-semantic context in spoken-word processing. Cognition, 32(1), 25–64.

22 P. GASTON AND A. MARANTZ
