17
CHAPTER Speech Perception in Infancy: A Foundation for Language Acquisition 1 Janet F. Werker and Judit Gervain Key Points 1. Newborns have sophisticated speech perception abilities. ey prefer language over complex speech analogs, discriminate rhythmically dierent languages, prefer to listen to their native language, detect word boundaries, and discriminate most phonemes of the world’s languages. ese perceptual abilities lay the foundations for subsequent learning. 2. Perceptual biases present at birth prepare the infant to learn the sound system of any of the world’s languages, and perceptual learning, using the statistical properties of the input speech, helps establish the repertoire of native speech sound categories. 3. Infants start to segment the continuous speech stream into constituent words during the second half of the rst year. Initially, they seem to rely primarily on statistical information such as the co-occurrence patterns of phonemes and syllables, 31 Abstract We discuss the development of speech perception and its contribution to the acquisition of the native language(s) during the first year of life, reviewing recent empirical evidence as well as current theoretical debates. We situate the discussion in an epigenetic framework in an attempt to transcend the traditional nature/nurture controversy. As we illustrate, some perceptual and learning mechanisms are best described as experience-expectant processes, embedded in our biology and awaiting minimal environmental input, while others are experience-dependent, emerging as a function of sufficient exposure and learning. We argue for a cascading model of development, whereby the initial biases guide learning and constrain the influence of the environmental input. To illustrate this, we first review the perceptual abilities of newborn infants, then discuss how these broad-based abilities are attuned to the native language at different levels (phonology, syntax, lexicon etc.). Key Words: phonological/prosodic bootstrapping; experience-expectant and experience-dependent mechanisms; neonates; phoneme perception; word learning; linguistic rhythm; perceptual attunement; rule learning; word order; epigenetics which they can extract using general-purpose, universal learning mechanisms. 4. Toward the end of the rst year, as they gain more experience with the native language, infants increasingly rely on language-specic cues to word segmentation, such as stress, phonotactics, or phoneme allophony. 5. As they extract an increasing number of potential word forms from the input, infants start to associate these with concrete, perceptually available objects in the environment. is initial associative labeling is the rst step toward word learning and the lexicon. Soon after the rst birthday, the main cognitive focus of the infant changes from phoneme to word learning. is functional reorganization allows infants to use phoneme-level representations, ignoring other acoustic details that are not relevant for lexical distinctions. 6. During this process, the two levels (i.e., the presentations of word forms and of objects) CHAPTER OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN 31_Zelazo-V1_Ch31.indd 909 31_Zelazo-V1_Ch31.indd 909 9/12/2012 8:42:54 PM 9/12/2012 8:42:54 PM

C H A P T E R 31 Speech Perception in Infancyinfantstudies-psych.sites.olt.ubc.ca/files/2015/07/Werker_Gervain_2013.pdf(1987), we use the label “experience-expectant” to characterize

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

  • C H A P T E R

    909

    Speech Perception in Infancy: A Foundation for Language Acquisition 1

    Janet F. Werker and Judit Gervain

    Key Points 1. Newborns have sophisticated speech

    perception abilities. ! ey prefer language over complex speech analogs, discriminate rhythmically di" erent languages, prefer to listen to their native language, detect word boundaries, and discriminate most phonemes of the world’s languages. ! ese perceptual abilities lay the foundations for subsequent learning.

    2. Perceptual biases present at birth prepare the infant to learn the sound system of any of the world’s languages, and perceptual learning, using the statistical properties of the input speech, helps establish the repertoire of native speech sound categories.

    3. Infants start to segment the continuous speech stream into constituent words during the second half of the # rst year. Initially, they seem to rely primarily on statistical information such as the co-occurrence patterns of phonemes and syllables,

    31

    Abstract

    We discuss the development of speech perception and its contribution to the acquisition of the native language(s) during the first year of life, reviewing recent empirical evidence as well as current theoretical debates. We situate the discussion in an epigenetic framework in an attempt to transcend the traditional nature/nurture controversy. As we illustrate, some perceptual and learning mechanisms are best described as experience-expectant processes, embedded in our biology and awaiting minimal environmental input, while others are experience-dependent, emerging as a function of sufficient exposure and learning. We argue for a cascading model of development, whereby the initial biases guide learning and constrain the influence of the environmental input. To illustrate this, we first review the perceptual abilities of newborn infants, then discuss how these broad-based abilities are attuned to the native language at different levels (phonology, syntax, lexicon etc.).

    Key Words: phonological/prosodic bootstrapping; experience-expectant and experience-dependent mechanisms; neonates; phoneme perception; word learning; linguistic rhythm; perceptual attunement; rule learning; word order; epigenetics

    which they can extract using general-purpose, universal learning mechanisms.

    4. Toward the end of the # rst year, as they gain more experience with the native language, infants increasingly rely on language-speci# c cues to word segmentation, such as stress, phonotactics, or phoneme allophony.

    5. As they extract an increasing number of potential word forms from the input, infants start to associate these with concrete, perceptually available objects in the environment. ! is initial associative labeling is the # rst step toward word learning and the lexicon. Soon after the # rst birthday, the main cognitive focus of the infant changes from phoneme to word learning. ! is functional reorganization allows infants to use phoneme-level representations, ignoring other acoustic details that are not relevant for lexical distinctions.

    6. During this process, the two levels (i.e., the presentations of word forms and of objects)

    C H A P T E R

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 90931_Zelazo-V1_Ch31.indd 909 9/12/2012 8:42:54 PM9/12/2012 8:42:54 PM

  • 910 Speech Perception in Infancy

    interact. Naming objects acts as an invitation to form conceptual categories for these objects, while evidence of a meaning contrast can strengthen the establishment of language-speci# c phoneme categories.

    7. Sensitivity to language structure appears early on. At birth, infants are able to detect identity relations, and they show evidence of rule learning and structural generalizations in the second half of the # rst year.

    8. ! e perception of the acoustic and phonological properties of speech also plays a role in acquiring language structure, as certain morphosyntactic constructions have phonological and prosodic correlates. Infants are able to exploit these correlations to bootstrap abstract grammatical structure from perceptually available cues.

    9. As an example of prosodic bootstrapping, infants are able to use cues such as intensity, pitch, and duration to establish an initial, rudimentary representation of the basic word order of their native language(s), which has well-de# ned prosodic correlates.

    10. Overall, early language acquisition can be characterized by an epigenetically determined set of initial biases, on which speci# c language experience can build.

    Introduction Acquiring language is one of the most profound

    feats of human childhood. Although the potential for language acquisition is deeply embedded in our biology, it is the native language or languages that the child will acquire. Long before they understand or utter their # rst word, children are listening to and watching speakers of their native language. In this chapter we review evidence showing how initial biases and perceptual tuning work together to help launch language acquisition. We situate the evidence within an epigenetic perspective (Gottlieb, 1976). We use this term in an attempt to progress beyond the nature/nurture controversy: genetic underpin-nings constrain the range of capabilities that can be manifest, yet the environment is equally important via selection and timing of activation and inactiva-tion of genes that code ultimately for behavior. Using terms introduced by Greenough, Black, and Wallace (1987), we use the label “experience-expectant” to characterize those capabilities that are simply wait-ing for almost guaranteed environmental input to be set (e.g., some aspects of language development or

    the maturation of vision over the # rst few months of life), and we use the label “experience-dependent” to characterize those capabilities that emerge only as a function of learning (e.g., the ability to write or read or the acquisition of the vocabulary of a spe-ci# c language).

    To explain the link between initial biases, experience, and the emergence of an abstract, rule-governed language system, we argue for a cas-cading model of development. ! e biases present at the beginning or any subsequent juncture in devel-opment direct attention and processing and con-strain the in3 uence environmental input can have. In this way, changes in perceptual sensitivities that occur across time play a central role in the emer-gence of the language-speci# c representations and structures used in the native language.

    We begin our characterization with an examina-tion of the perceptual biases evident in the newborn infant for attending to speech, for discriminating acoustic/phonetic detail, and for apprehending structure. Because the peripheral auditory system is fully developed by 24 to 28 weeks of gestation and some of the properties of language can be processed in utero , by birth the infant has already had several weeks of experience with the native language. We review the ways in which this prenatal listening experience has changed perception, and consider it as an example of “experience-dependent” processes. Core aspects of language processing remain con-stant, however, even across wide varieties of language input. ! ese will also be described. To capture the biological fact that even these tightly conserved ini-tial sensitivities re3 ect the joint action of genes and gene activation/inactivation, we will use the term “experience-expectant” to characterize them, but it should be noted that this set of capabilities is largely akin to that referred to as “innate” by others.

    From these beginnings, we review the litera-ture on infant language perception across the # rst months of life, focusing on three broad aspects of language: phonetics/phonology, word learning, and the bootstrapping e" ects of speech perception on morphosyntactic structure.

    Newborns’ Speech Perception Abilities: Experience-Expectant and Experience-Dependent

    At birth, human infants show remarkably e4 -cient language-learning propensities that likely involve a combination of both experience-expectant and experience-dependent processes. Newborns prefer to listen to speech over similarly complex

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91031_Zelazo-V1_Ch31.indd 910 9/12/2012 8:42:54 PM9/12/2012 8:42:54 PM

  • 911Werker, Gervain

    nonspeech sounds (Vouloumanos & Werker, 2004, 2007a). If the speech is # ltered to render it more like that which is heard in utero , newborns no lon-ger show a preference (Vouloumanos & Werker, 2007b). Moreover, they do show an equal prefer-ence for rhesus monkey calls—a signal they have never before heard—as they do for human speech (Vouloumanos, Hauser, Werker, & Martin, 2010), and prefer both over nonspeech. Although these two sets of # ndings cannot rule out a role for listen-ing experience in utero , they do provide compelling evidence that the neonatal preference for speech is not merely based on familiarity, raising the possibil-ity that the human infant is broadly prepared for sounds that are speechlike. By 3 months, human infants show a robust preference for human speech over monkey calls (Vouloumanos et al., 2010), revealing experience-based learning after birth.

    Neonates’ brains also appear to be specialized for language, as they show an increased response in the left auditory areas to normal, forward-going speech, but not to backward speech (Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002; Pe ñ a et al., 2003). ! is lateralization is seen in both struc-tural and functional measures and is similar to the brain organization found in most (right-handed) adults, whose language areas are predominantly located in the left hemisphere (Dehaene-Lambertz, Hertz-Pannier, Dubois, & Dehaene, 2008). ! is initial predisposition for human language sets the stage for learning by allowing infants to maximize their exposure to speech.

    Importantly, the evidence for structural dif-ferences in the posterior part of the left superior temporal cortex so early in life (Dehaene-Lambertz et al., 2008) has been tied to di" erences in gene expression in the perisylvian areas of the brain (Sun, Collura, Ruvolo, & Walsh, 2006). ! is may, in concert with other developmental processes, lead to structural di" erences that support the early devel-opment of cortical specialization. ! e fact that these di" erences in gene expression predate even prenatal experience with languages helps elucidate what we mean by an epigenetic approach. We are not claim-ing that speci! c experience with language is required to set up the perceptual biases seen. Rather, we are emphasizing that even genes whose expression is tightly constrained nevertheless require enabling conditions—in this case likely endogenous not only to the organism but perhaps also to the cell—to be expressed. We believe this point is important to make because it underscores how nature and nur-ture must be simultaneously considered to describe

    how biological systems work, thus illustrating how counterproductive it is to pit nature and nurture against one another.

    Newborns also show evidence of prenatal learn-ing. ! ey show a speci# c preference for their native language(s)—that is, for the language(s) their moth-ers spoke during pregnancy (Mehler et al., 1988; Moon, Cooper, & Fifer, 1993). And they show a preference for their mother’s voice (DeCasper & Fifer, 1980). But yet again, experience-expectant processes play a role as well. Newborns go beyond simple familiarity with a language, as they can dis-criminate languages they never heard before if those are rhythmically di" erent (Mehler et al., 1988; Nazzi, Bertoncini, & Mehler, 1998). A baby born into a French family can discriminate English from Japanese, for instance. ! is discrimination ability relies on the di" erent rhythmic properties (pro-portion of vowels, variability of consonant cluster duration, etc.) of the languages, and it is preserved even if phonological cues other than rhythm (e.g., segment identity) are removed from the stimuli (Ramus, 2002; Ramus, Hauser, Miller, Morris, & Mehler, 2000; Ramus & Mehler, 1999). Recent evi-dence reveals that bilingual newborns do not show a preference for listening to one of their languages over the other even if the languages are from dif-ferent rhythmical classes (Byers-Heinlein, Burns, & Werker, 2010). ! ey do, nonetheless, retain the ability to discriminate them (Byers-Heinlein et al., 2010). ! e experience-dependent listening prefer-ence ensures attention to each language, while the remaining experience-expectant ability to discrimi-nate the two languages may help bilingual infants separate the two systems from the start (Ramus, 2002; Ramus & Mehler, 1999; Ramus et al., 2000). Bilinguals growing up with two rhythmically similar languages might use cues other than rhythm to tell their languages apart, as Catalan–Spanish bilinguals have been found to discriminate the two languages at about 4 months of age (Bosch & Sebasti á n-Gall é s, 1997, 2001).

    Recent work shows the same evidence of experience-dependent processes complement-ing experience-expectant ones in neural organiza-tion. In a recent study, we extended the work by Dehaene-Lambertz and colleagues (2002) and Pe ñ a and colleagues (2003) reviewed three paragraphs above showing a specialized response for forward versus backward speech in the newborn brain, but found it to be particularly strong for the native language—that is, the language heard in utero or immediately following birth (May, Byers-Heinlein,

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91131_Zelazo-V1_Ch31.indd 911 9/12/2012 8:42:54 PM9/12/2012 8:42:54 PM

  • 912 Speech Perception in Infancy

    Gervain, & Werker, 2011). In our initial work, infants were tested on English versus Tagalog, a Filipino language. Moreover, # ltered speech was used, raising the possibility that the signal was not su4 ciently veridical to fully engage language pro-cessing. However, we are # nding the same pattern in an ongoing study with English and Spanish (May, Gervain, Carreiras, & Werker, 2011).

    Neonates are also sensitive to a number of more subtle phonological cues within a language. ! ey can detect the acoustic correlates of word bound-aries (Christophe, Dupoux, Bertoncini, & Mehler, 1994), as they respond di" erentially to phoneme sequences that are di" erent only in the presence or absence of a word boundary (e.g., the phoneme sequence /mati/ from the single French word math é- maticien [“mathematician”] or spanning two words in panorama typique [“typical view”]). ! ey are also sensitive to the position of stress within a word and can discriminate words with di" erent stress pat-terns (Sansavini, Bertoncini, & Giovanelli, 1997). Furthermore, they can categorize function words and content words on the basis of their di" erent acoustic properties (Shi, Werker, & Morgan, 1999), since functors are phonologically minimal (short, bear no stress, have reduced vowels, etc.; Morgan, Shi, & Allopenna, 1996).

    Adult perception organizes speech into hier-archical units: phonological phrases consisting of phonological words, which in turn comprise pho-nemes, etc. We do not completely understand what units infants use to represent speech. ! e existing evidence suggests that they are able to represent syl-lables and weigh this unit more strongly than the phoneme when representing larger units (Mehler, Dommergues, Frauenfelder, & Segui, 1981; Segui, Dupoux, & Mehler, 1992). In these experiments, infants distinguished words that consisted of two and three syllables, but an equal number of pho-nemes, yet failed to show discrimination when pre-sented with words consisting of an equal number of syllables, but a di" erent number of phonemes. ! is, of course, does not mean that newborns and young infants are unable to discriminate individual pho-nemes. In fact, the contrary is true. As the next two sections will discuss, newborns and young infants have remarkable phoneme-discrimination abilities. However, they seem to be using di" erent units for di" erent purposes.

    As recent brain imaging data suggest (Gervain, Macagno, Cogoi, Pe ñ a, & Mehler, 2008), new-borns are also sensitive to the structural regularities in speech. ! ey exhibit increased brain activity in

    response to simple patterns, such as immediate and identical repetitions of the same syllables within words (e.g., mubaba, penana , etc.), but they can-not detect distant repetitions (e.g., bamuba , napena , etc.). (For a more detailed discussion, see the section entitled “Learning Abstract Regularities: Rules and Perceptual Mechanisms” and Fig. 31.2.)

    ! ese perceptual abilities allow very young infants to start cracking the linguistic code by pro-viding e4 cient mechanisms to keep track of their target language(s) and obtain a # rst parse of the continuous input into constituent units such as syl-lables, words, utterances, etc. ! is lays the founda-tions for subsequent learning.

    From Broad-Based Acoustic/Phonetic Perception to Language-Specifi c Phonetic Perception

    Infants begin life with broad-based sensitivities to phonetic contrasts. For example, infants discrim-inate similar-sounding vowels (e.g., /i/ and /I/ as in the words “beat” and “bit”) and similar-sounding consonants (e.g., the /b/ vs. /d/ distinction in “big” and “dig” and the /b/ vs. /p/ distinction as in “bat” and “pat”). More importantly, the perceptual sensi-tivities shown by young infants map onto the pho-netic di" erences that are used in adult languages. At 1 to 4 months infants discriminate the same-sized acoustic di" erence more easily if the two stimuli on which they are being tested are from either side of a phonetic category boundary used in an adult language (e.g., /b/ vs. /p/) than they do if the two stimuli are from within the same phonetic category (e.g., two physically distinct instances of /p/; Eimas, Siqueland, Jusczyk, & Vigorito, 1971). Moreover, there is organization to the phonetic space of young infants. Young infants will treat as equivalent di" er-ent pronunciations of the same consonant or vowel when spoken by di" erent voices (Dehaene-Lambertz & Gliga, 2004; Kuhl, 1979; 1980), paying more attention to the phonetic category boundary than to the change in speaker.

    ! e sensitivities shown by young infants map onto an “experience-expectant” pattern, with an organi-zation that supports the learning of the phonetic categories of any particular language. During the # rst several months of life, “experience-dependent” listening functions to attune these sensitivities in a number of ways. ! e most commonly described pattern is one of maintenance/decline. ! is was # rst illustrated by Werker and Tees (1984), who showed that English infants aged 6 to 8 months can discriminate a di" erence between two Hindi “d”

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91231_Zelazo-V1_Ch31.indd 912 9/12/2012 8:42:54 PM9/12/2012 8:42:54 PM

  • 913Werker, Gervain

    sounds that adult English speakers can no longer discriminate. By 10 to 12 months of age the English infants no longer maintained sensitivity to the dis-tinction, although Hindi-learning infants did. ! is pattern has now been replicated many times with a variety of other consonant and vowel distinctions (Gervain & Werker, 2008), using both behav-ioral and event-related potential (ERP) tasks, with the reorganization in vowel perception seen at a slightly younger age than it is for consonants (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Polka & Werker, 1994). Moreover, the pattern has even been shown to extend to sign language (Baker, Golinko" , & Petitto, 2006; Palmer, Fais, Golinko" , & Werker, 2012; Wilbourn & Casasola, 2007). Of interest, bilingual infants ultimately maintain sen-sitivity to the phonetic distinctions used in each of their native languages (Bosch & Sebasti á n-Gall é s, 2003; Burns, Yoshida, Hill, & Werker, 2007; Sebasti á n-Gall é s & Bosch, 2009; Sundara, Polka, & Moln á r, 2008). ! ose consonant and/or vowel distinctions that are used in the native language are not only maintained but are also sharpened, with listening experience improving (Kuhl et al., 2006; Sundara, Polka, & Genesee, 2006) and realigning (Burns et al., 2007) discrimination.

    Finally, recent research suggests that acoustic salience may also be important. In illustration, although infants divide the nasal consonant space without experience and can discriminate /ma/ from /na/ at a very young age, listening experience may be required to induce discrimination of the acousti-cally similar and fairly rare (i.e., nonsalient) distinc-tion between /n/ and /ng/ in initial position. ! is distinction is used in Filipino. At 4 and 6 months, neither English nor Filipino infants can discrimi-nate /na/ from /nga/. By 10 months discrimination is evident in Filipino infants but not in English infants (Narayan, Werker, & Beddor, 2010).

    How might infants learn the phonetic categories of their native language? It has been argued that just as in other types of perceptual learning, infants use a form of similarity matching (Best & McRoberts, 2003; Kuhl, 2004). Basically, if Language A has a single phonetic category in an acoustic space where another language, say Language B, has two categories, infants learning Language A may begin to collapse the two categories into the single inter-mediate category, whereas infants in Language B will continue to maintain the distinction. To empirically test this type of explanation, Maye, Werker, and Gerken (2002) conducted an arti# -cial language-learning study in which infants were

    familiarized to either a unimodal (mimicking Language A above) or bimodal (like Language B) set of stimuli from along an acoustic continuum and then tested on their ability to discriminate the endpoints. Following only 2.4 minutes of exposure, infants in the bimodal but not the unimodal condi-tion continued to discriminate the two endpoints, indicating that only the infants in the bimodal condition continued to treat the input as two cat-egories. ! is pattern has now been replicated with a new continuum and has been shown to help infants generalize to a di" erent phonetic distinction, and to help induce discrimination of an otherwise di4 cult contrast (Maye, Weiss, & Aslin, 2008). Of impor-tance, although distributional learning is e" ective at 6 to 8 months of age, and continues to be so across the lifespan (Maye & Gerken, 2001), distributional learning is measurably less e" ective by 10 months of age and beyond, after infants have already begun to establish native phonetic categories (Yoshida, Pons, Maye, & Werker, 2010a). At this time social inter-action appears to facilitate learning (Kuhl, Tsao, & Liu, 2003), perhaps due to its impact on attention (Yoshida et al., 2010a).

    In summary, the research on phonetic percep-tion in infancy indicates a substantial level of orga-nization prior to signi# cant postnatal listening experience. ! e perceptual biases that are evident prepare the infant to learn the sound system of any of the world’s languages, and perceptual learning, using the statistical properties of the input speech, helps establish the repertoire of native speech sound categories. In the section entitled “! e Beginnings of Meaning: Associating Word Forms with Objects” we review the literature on how perceptual tuning to the phonetic categories of the native language might support recognizing and learning the words of the native language. But before infants can recog-nize familiar words and learn new words, they have to be able to extract words from the ongoing stream of speech. Hence we turn to a discussion of word segmentation in the next section.

    Segmenting, Representing, and Recalling Word Forms

    One of the most challenging tasks a young infant faces is to break the continuous speech stream into its constituent units. Speech does not have the sys-tematic and reliable cues to word boundaries that we # nd in written language: not every word bound-ary is signaled by pauses, and words rarely appear in isolation. ! erefore, a growing body of research has been directed to understanding how babies might

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91331_Zelazo-V1_Ch31.indd 913 9/12/2012 8:42:54 PM9/12/2012 8:42:54 PM

  • 914 Speech Perception in Infancy

    # nd the word forms in ongoing speech. Two types of cues have been proposed. Universal cues rely on the statistical properties of speech, which are believed to work in all natural languages in approximately the same way, so these cues can guide infants in seg-menting speech from early on. Language-speci# c cues require some learning about the phonologi-cal properties of the target language, and are thus assumed to be operational at a later stage of speech segmentation and word learning.

    ! e most widely recognized statistical cue to segmentation is conditional probability—that is, the probability of a unit (syllable, phoneme, etc.) to appear before or after another unit. Phonemes or syllables within the same word predict each other with a higher conditional probability than two units spanning a word boundary (Harris, 1955; Shannon, 1948). For instance, in the sequence pretty baby , the forward-going transitional probability 1 , a type of conditional probability measure, is higher between the syllables pre and tty than between tty and ba , since the word pretty can be followed by a large number of words, making it hard to predict the # rst syllable of the next word. Building on a seminal study by Hayes and Clark (1970), where adults were shown to be able to segment a continuous stream of arti# cial speech analogs on the basis of conditional probabilities, Sa" ran, Aslin, and Newport (1996) found that 8-month-old infants have the same abil-ity, successfully segmenting a continuous arti# cial language into its constituent words using statistical cues after only a few minutes of exposure. ! is dis-covery gave rise to a large number of subsequent studies that aimed to explore the nature of this sta-tistical segmentation mechanism (Bonatti, Pe ñ a, Nespor, & Mehler, 2005; Endress & Bonatti, 2007; Fiser & Aslin, 2002, 2005; Hauser, Newport, & Aslin, 2001; Nazzi, Paterson, & Karmilo" -Smith, 2003; Newport & Aslin, 2004; Newport, Hauser, Spaepen, & Aslin, 2004; Pe ñ a, Bonatti, Nespor, & Mehler, 2002; Sa" ran & Wilson, 2003; ! iessen & Sa" ran, 2003; Toro & Trobalon, 2005; Yang, 2004).

    Interestingly, this statistical learning ability is triggered by speci# c aspects of the signal such as its continuous, nonsegmented nature. In a study with adults Pe ñ a and colleagues (2002) showed that adults readily compute statistics over adjacent and nonadjacent syllables, using this information to # nd words in the input, if the speech stream is continu-ous, but extract and generalize structural regulari-ties if the speech stream is segmented by subliminal pauses. Recently, Marchetto and Bonatti (under

    revision) have shown that 12-month-old (but not 7-month-old) infants have the same sensitivity to the properties of the input stream, # nding words in continuous streams, but extracting structural regu-larities when the input is presegmented. ! is might re3 ect a division of labor between mechanisms used to learn words and those used to acquire the mor-phological structure of a language, since these are intimately related, especially in morphologically rich languages, where learning a word might require a prior morphological decomposition into constitu-ent morphemes.

    As infants gain increasing knowledge about their native language, they start using some of the phonological cues correlated with word boundaries within the target language. At least three such cues have been identi# ed: stress, phonotactics, and allo-phony. In English, for instance, content words typi-cally have a strong–weak (trochaic) stress pattern (e.g., ‘doc tor ). Consequently, infants might use a stressed syllable at the beginning of a bisyllabic unit as a cue to segmentation. Indeed, infants have been shown to develop sensitivity to the stress patterns of their native language between 6 and 9 months (Jusczyk & Aslin, 1993; Morgan, 1996; Morgan & Sa" ran, 1995). ! e Metrical Segmentation Strategy (Cutler, 1994; Cutler & Carter, 1987), which relies on stressed-based cues, has been shown to under-lie 7.5-month-old English-learning infants’ rec-ognition of familiar words. Jusczyk and colleagues (Jusczyk, Houston, & Newsome, 1999) have shown that when familiarized with trochaic English words (e.g., ‘doc tor , ‘can dle ), 7.5-month-olds prefer pas-sages containing these words over passages that do not contain them. ! is preference is speci# c to the trochaic word form. Passages containing the # rst strong syllables of the words (e.g., dock , can ) did not induce any preference. Infants are also able to apply these cues to an ongoing speech stream to extract words. When presented with a continuous stream of consonant vowel (CV) syllables where every third syllable was stressed, 7- and 9-month-olds treated as familiar only those trisyllabic sequences that had initial stress (strong weak weak). Infants showed no recognition of trisyllabic sequences that were not trochaic (WSW or WSS; Curtin, Mintz, & Byrd, 2001). Another prediction of the metric segmentation strategy is that weak–strong (i.e., iam-bic) words (e.g., gui ’tar ) might initially be misseg-mented. Jusczyk, Houston, and colleagues (1999) found exactly this pattern of results.

    Phonotactics (i.e., the regularities guiding pos-sible and impossible phoneme combinations in a

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91431_Zelazo-V1_Ch31.indd 914 9/12/2012 8:42:55 PM9/12/2012 8:42:55 PM

    judit

  • 915Werker, Gervain

    language) provide another language-speci# c cue to segmentation. Knowing that the sequence /br/ is frequent in the initial positions of English words, while /nt/ is common in # nal positions, can help infants detect word boundaries. Sa" ran and ! iessen (2003) tested the acquisition of phonotactic con-straints using segmentation as the experimental task. During the familiarization phase, they taught infants a rule concerning permissible syllable forms (words consisted of only CV or only CVC syllables; e.g., boda and bikrub , respectively) and a rule relat-ing to permissible consonantal distributions (voiced stop consonants in syllable onsets, unvoiced stops in codas; e.g., dakdot , or vice versa). ! ey found suc-cessful segmentation in the test phase in both cases. Using a di" erent approach, Mattys, Jusczyk, Luce, and Morgan (1999) explored how 9-month-old English-learning infants’ knowledge of English pho-notactics helps them posit word boundaries. ! ey familiarized English-speaking infants with nonsense CVCCVC words, in which the CC cluster is either frequent word-internally but infrequent across word boundaries (e.g., / [ k/), or vice versa (e.g., / [ t/). Infants segmented the nonsense words into two monosyllables when the CC cluster was infrequent word-internally and frequent across word boundar-ies. No segmentation was observed for the other type of CC clusters. Taken together, these results suggest that 9-month-old infants can use their knowledge of the native phonotactics to assist them in word segmentation (Mattys & Jusczyk, 2001a, 2001b).

    A third language-speci# c segmentation cue comes from the distributions of allophones in di" er-ent positions within words. In English, for instance, aspirated stop consonants appear in the initial posi-tions of stressed syllables (Church, 1987), and their unaspirated allophones appear elsewhere. ! is dis-tribution makes aspirated stops good cues to word beginnings. It is not implausible that infants might use such cues for segmentation, because infants as young as 2 months are able to discriminate the dif-ferent allophones of a phoneme (Hohne & Jusczyk, 1994). Indeed, Jusczyk, Hohne, and Bauman (1999) have shown that at 9 months, infants are able to posit word boundaries (e.g., night rates vs. nitrates ) based on allophonic and distributional cues, and at 10.5 months, allophonic cues alone were found to be su4 cient for successful segmentation.

    ! e Beginnings of Meaning: Associating Word Forms with Objects

    In addition to extracting word forms from continuous speech, infants also face the (possibly

    related) challenge of associating these word forms with meanings to build a full vocabulary. To achieve this, infants need to be able to ultimately form stable, detailed, and permanent representations of the word forms they segmented out from speech, and they need to have the ability to link these forms to entities in the world (i.e., to treat word forms as having a referent). ! is requires sophisticated con-ceptual abilities, such as categorization, understand-ing referentiality, solving the induction problem for meaning (Quine, 1960), etc. ! ese are issues that we are not discussing here. Rather, we will concentrate on the simplest and developmentally earliest case when a word gets reliably associated with a percep-tually available, concrete object. Importantly, while learning the word forms and selecting the right meanings have always been regarded as challenging for learners, associating already-learned words with already-existing meanings/concepts was for a long time regarded as fairly straightforward, and espe-cially not involving too much interaction or read-justment between the two levels. In the past decade, however, new evidence emerged showing both that labeling can induce object categorization that doesn’t happen in the absence of labels (Waxman, 2001; Waxman & Lidz, 2006) and that the presence or absence of objects in3 uences the amount of pho-netic detail that infants use to represent word forms (Stager & Werker, 1997; Werker, Fennell, Corcoran, & Stager, 2002; Yeung & Werker, 2009).

    One of the # rst studies to suggest that simply learning language-speci# c phonetic representations might not be su4 cient for directing (associative) word learning was Stager and Werker (1997). ! e authors used a minimal phonetic contrast, bih and dih , and tested whether infants detected this con-trast in a labeling and a discrimination situation. Infants at 14 months were able to discriminate the two words when those were presented with a non-nameable (non-object-like) visual display (a check-erboard), so the context required discrimination, whereas they failed to distinguish the two words in a labeling context, when each was # rst associ-ated with a di" erent object, and then the labels were switched around. In fact, 14-month-olds, but not 8-month-olds (Stager & Werker, 1997) or 17-month-olds (Werker et al., 2002), failed to notice this minimal switch, but even 14-month-olds succeeded when the two labels were more distinct (i.e., lif and neem ).

    ! e authors interpreted these results as a sign of functional reorganization, as the main cognitive focus of the infant changes from speech perception

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91531_Zelazo-V1_Ch31.indd 915 9/12/2012 8:42:55 PM9/12/2012 8:42:55 PM

    judit

    judit

  • 916 Speech Perception in Infancy

    to word learning. During the developmental pro-cess, language-speci# c representations are employed di" erently in di" erent contexts. In an associative word-learning task the infant needs to pay attention to the details of both the word form and of the object and also establish an association between them. But at this early stage of word learning, although able to discriminate language-speci# c phonetic di" er-ences, infants might not yet give these di" erences any more weight than they give to other proper-ties of the words, such as the a" ect in the voice, the gender of the speaker, etc. And indeed, there is evidence that in both word recognition and early word-learning tasks, infants can fail to treat as equiv-alent two instances of the same word if they are spo-ken in a di" erent voice, di" erent a" ective tone, etc. (Newman, 2008; Singh, Morgan, & White, 2004; White & Morgan, 2008). Perhaps one reason, then, that minimal pair associative word learning takes up so much of infants’ cognitive resources is that they encode all the properties of the word and try to hold them all in memory. ! e result is a U-shaped devel-opmental curve, with 8-month-olds, who are not yet in the word-learning phase, succeeding in the task by performing a more simple associative discrimi-nation task, 14-month-olds, in transition toward robust word learning, failing, and 17-month-olds, experienced word learners, succeeding again.

    Further con# rmation for the cognitive load hypothesis comes from several recent studies show-ing that if the cognitive load is decreased, infants are more likely to succeed even at 14 months in associating di" erent objects with minimally dif-ferent labels. As Fennell and Werker (2003) have shown, 14-month-olds have no problem noticing a switch from the known word ball to the known word doll . ! is con# rms the suggestion that it is the acquisition of object-label associations that under-lies 14-month-olds’ di4 culty with # ne phonologi-cal information, not a general inability to represent such information.

    Indeed, if the infants are prefamiliarized with the objects at home for several weeks and hence do not need to learn both the words and the objects in the experimental situation, they succeed at using the phonetic details in the words in the switch task (Fennell & Werker, 2004). Infants also succeed if they are o" ered a visual choice at test, hence reducing the memory demands (Yoshida, Fennell, Swingley, & Werker, 2009). And # nally, infants succeed if the phonetic di" erences are more acoustically salient (Curtin, 2010; Curtin, Fennell, & Escudero, 2009), or if they have been taught the words in a task that

    highlights which sound di" erences are important (! iessen, 2007).

    Still unexplained from the cognitive resource lim-itation hypothesis is why infants are able to succeed at 17 months and beyond. We (Curtin & Werker, 2007; Werker & Curtin, 2005) and others (Nazzi & Bertoncini, 2003; Newman, 2008) have suggested that by 17 to 18 months, language-speci# c phonetic sensitivities have become organized into more sta-ble, “phonological” representations that have begun to act much like the phoneme categories adults use to guide word learning and word recognition. With such a stable representation, infants’ attention is weighted toward the phonetic, allowing them to more easily summarize across irrelevant information such as gender and a" ect in the voice, and attend instead to the criteria phonemic di" erence.

    ! ere is increasing research to support this hypothesis. One recent study (Dietrich, Swingley, & Werker, 2007) compared English and Dutch toddlers on their ability to learn the minimal pair words “tam” and “taam” that di" er in only a vowel length distinction. Of importance, vowel length is phonemic (used to contrast meaning between two words) in Dutch but not in English. Although not phonemic, the di" erence between a long and short vowel is so acoustically salient that even English-learning infants and toddlers can discriminate the di" erence in perceptual discrimination tasks that do not involve meaning (Mugitani et al., 2009). However, if stable phonological representations rather than phonetic di" erences guide word learn-ing at 18 months, only Dutch infants should use the vowel length di" erence in a word-learning task. And indeed this is what was found using the switch task (Fig. 31.1). Dutch infants succeeded but English infants failed. As an important control, it was shown that English infants aged 18 months nonetheless succeeded when a vowel di" erence that is phonemic in English was used (“tam” vs. “tem”). In a more recent study it was shown that by the same age, 18 months, English infants will gener-alize across di" erent accents to treat di" erent pro-nunciations of the same vowel as equivalent (Best, Tyler, Gooding, Orlando, & Quann, 2009). ! us, by 18 months, infants use phonological representa-tions, perhaps abstract phoneme categories that are part of the grammar of the native language, rather than simple perceptual sensitivities, to guide word learning.

    How do infants develop these stable phonemic representations? We suggest it is with the establish-ment of a sizeable enough lexicon linking phonetic

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91631_Zelazo-V1_Ch31.indd 916 9/12/2012 8:42:55 PM9/12/2012 8:42:55 PM

  • 917Werker, Gervain

    categories to word forms that the infant is able to pull out—either through emergent statistical learn-ing or through a more abstract generalization—the knowledge that certain sound di" erences matter more than others (Curtin & Werker, 2007; Werker et al., 2002; Werker & Curtin, 2005).

    ! e above results suggest that the ability to map between labels and objects undergoes considerable reorganization at around 14 months of age. By that time, infants’ native phonetic categories are in place, so labeling seems independent of the estab-lishment of the basic phonemic repertoire of the native language. Indeed, 8-month-olds appear unaf-fected by the presence of objects, and we reviewed earlier how simple statistical learning might be suf-# cient for establishing language-speci# c phonetic categories.

    However, Yeung and Werker (2009) have recently shown that labeling might have a role to play in the establishment of the language-speci# c categories themselves. ! e idea that meaning con-trasts might signal phoneme status (i.e., /p/ and /b/ are distinct phonemes in English, because words like pin and bin contrast in meaning) has been broadly accepted in the linguistic literature. However, this principle was believed to be an unlikely learning mechanism for phoneme con-trasts because infants develop their native phonetic category repertoire before they have a sizeable lexicon. While this observation remains, in gen-eral, true, Yeung and Werker (2009) have shown that the presence of a nameable object can indeed induce the establishment of native-language pho-netic categories in 9-month-old infants, an age at which the native repertoire is already taking shape but universal discrimination hasn’t been lost yet. ! e authors have shown that when a nonnative

    phonetic contrast is presented with labels in a con-sistent manner (i.e., one phone always associated with one object, the other phone with the other object), 9-month-olds were able to discriminate the two. ! ey failed, however, when labeling was inconsistent (i.e., when both phones appeared with both objects), suggesting that the di" erence between the sounds did not indicate a contrast in meaning. ! ese results indicate that labeling itself might in3 uence the kind of speech sound repre-sentations that are being established, implying that speech perception and word learning might be more intimately related, and from the begin-nings of the establishment of the native phonetic repertoire, than previously believed.

    ! e connection between objects and speech also works in the opposite direction. As research by Waxman and Markow (1995) suggests, speech might act as an “invitation to form categories.” As a large body of research has shown (e.g., Fulkerson & Waxman, 2007; Waxman & Markow, 1995), infants treat linguistic labels, but not nonlinguistic sounds, as referring to categories of objects rather than just to the individual object the label was pre-sented with. Infants as young as 6 months of age were able to form the noun category “dinosaur” when presented with visually di" erent instances of dinosaurs paired with naming phrases ( Look at the toma! Do you see the toma? ). If, however, the dino-saurs were paired with tone sequences, no categori-zation occurred.

    ! ese results suggest a strong relation between speech perception and conceptual organization within the domain of word learning, especially during the initial stages when concrete nouns are learned.

    Learning Abstract Regularities: Rules and Perceptual Mechanisms

    A question that has gained signi# cant theoretical importance in the past decade is how speech per-ception contributes to the acquisition of language structure. Above, we reviewed research suggesting that phonological categories likely emerge only in the second year of life, yet these categories do have a foundation in the experience-expectant cat-egories present earlier in perceptual development. When experimental research on language acquisi-tion began about 50 years ago, the accepted view in generative linguistics, as well as other approaches to language acquisition, held that the grammar of a language is largely independent of its phonetics, phonology, and prosody, so speech perception was

    “tam” “tam”

    Same SwitchTest PhaseHabituation Phase

    seco

    nds l

    ooki

    ng“taam”

    10.08.06.04.02.00.0

    familiar switched

    “taam”

    Figure 31.1. Phonological categories guide word learning. Reprinted from Dietrich et al. (2007) with permission.

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91731_Zelazo-V1_Ch31.indd 917 9/12/2012 8:42:55 PM9/12/2012 8:42:55 PM

  • 918 Speech Perception in Infancy

    not believed to play an important role. ! is view changed in the 1990s, when it was discovered that infants might use the existing correlations between the phonological aspects of language and its struc-ture, well known from language typology, to solve what Pinker (1984) termed the “linking problem.” ! e linking problem means that even if infants are born with experience-expectant mental representa-tions to facilitate language learning (e.g., a universal grammar), these abstract characterizations of lan-guage structure use rules and discrete, abstract cate-gories such as noun, verb, subject, or predicate, and cannot be linked directly to the continuous, physi-cal speech (or sign) input. If, however, infants can somehow exploit the sound–structure correspon-dences that exist in language, they might be able to use them to bootstrap abstract language structure.

    One of the # rst studies that directly addressed whether infants were able to extract underlying regularities from an arti# cial grammar was Gomez and Gerken’s (1999) work. In their study, Gomez and Gerken (1999) tested infants in a version of Reber’s classical arti# cial grammar-learning task (Reber, 1967, 1969), asking whether infants were able to extract regularities and whether they were able to generalize them (i.e., transfer them to a new vocabulary). ! e authors found that 1-year-old infants successfully discriminated grammatical from ungrammatical strings in several di" erent experi-mental conditions, including those that required generalization by implementing the regularities in a new vocabulary at test.

    Marcus, Vijayan, Rao, and Vishton (1999) used a di" erent paradigm to show that 7-month-old infants were able to learn and generalize abstract rules containing identity relations (e.g., “wo fe fe,” [ABB] or “wo fe wo” [ABA]), taking this as evidence for infants’ ability to represent abstract variables and algebraic rules.

    Since the auditory system is mature from about 24 to 28 weeks of gestation, sensitivity to identity relations and repetitions might be operational early on. Indeed, a brain imaging study with newborn infants (Gervain et al., 2008) found that the left temporal and frontal areas responded di" erentially to syllable sequences containing adjacent repetitions (“mubaba,” “penana”: ABB) as opposed to random sequences (“mubage,” “penaku”: ABC) from the very # rst trials of the study, suggesting that an automatic repetition-detecting mechanism might be at work (Fig. 31.2). ! is di" erential response increased over the time course of the experiment, especially in the prefrontal brain areas, indicating that some form

    of learning or memory-trace formation might take place. Interestingly, no such di" erential response was found when nonadjacent repetitions (ABA) were pitted against random ABC sequences.

    ! ese results suggest that newborns have expe-rience-expectant perceptual processing mechanisms that allow them to detect and represent structural regularities in speech from the very beginning of linguistic development.

    Learning Language-Specifi c Grammatical Structures: ! e Case of Word Order

    Another perceptual mechanism that has been proposed to play a role in the acquisition of language structure is the auditory grouping principle known as the Iambic/Trochaic Law, originally proposed for music and nonlinguistic sounds. According to this law, sound sequences in which the elements contrast in intensity and/or pitch are perceived as forming units with initial prominence (trochees), while ele-ments contrasting in duration form units with # nal prominence (iambs) (Hayes, 1995). ! is grouping principle has been claimed to help infants boot-strap word order (Nespor, Guasti, & Christophe, 1996; Nespor & Vogel, 1986), as there is a correla-tion between the type of grouping a language uses and its word order. Object–Verb (OV) languages, such as Japanese, Turkish, Basque, etc., have initial prominence in their phonological phrases, and it is realized as increased pitch and intensity, whereas VO languages like English, French, etc., imple-ment phrase-# nal prominence using lengthening. ! is has been shown (Nespor et al., 2008) to hold true both across languages (e.g., French vs. Turkish) and within a language that has phrases with mixed word orders (e.g., German and Dutch, where the verb phrase [VP] shows OV or VO order depend-ing on the syntactic context). ! e reason why this grouping principle is believed to be a particularly useful bootstrapping cue is because it is carried by universal acoustic cues (intensity/pitch vs. duration) automatically processed by the auditory system. Consequently, it might be operational from early on (Nespor et al., 1996).

    Current research on the Iambic/Trochaic Law has focused on two issues: (1) whether the law oper-ates at the level of the word, the phrase, or both, and (2) whether the law is inherent to the auditory system or whether it emerges as a consequence of experience with the native language(s).

    ! e original formulation of the principle claimed that it is independent of language and applies to speech and nonspeech sounds alike (for summary,

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91831_Zelazo-V1_Ch31.indd 918 9/12/2012 8:42:55 PM9/12/2012 8:42:55 PM

  • 919Werker, Gervain

    see Hayes, 1995). Indeed, a study by Hay and Diehl (2007) found no di" erence in grouping preference between English- and French-speaking adults despite the fact that English has a predominantly trochaic word-level stress pattern, whereas French is iambic. In this experiment, the authors exposed participants to sequences of syllables or sine wave segments (i.e., speech or nonspeech) that alternated either in inten-sity or duration, and asked participants to indicate whether they heard a strong–weak (trochaic) or a weak–strong (iambic) pattern. Both English and French participants heard a strong–weak pattern when intensity was manipulated and a weak–strong pattern when the contrast was durational. ! e authors concluded that the Iambic/Trochaic Law

    was a general auditory principle. More recently, however, Iversen, Patel, and Ohgushi (2008) tested English and Japanese participants and found that while English speakers had the predicted grouping preferences for both the intensity contrast (trochaic) and the durational contrast (iambic), Japanese par-ticipants only showed a predicted trochaic contrast for the intensity-manipulated tone sequences but had no preference when the manipulation involved duration. Further analyzing their data, the authors showed that this null preference is due to a bimodal pattern of responses in the Japanese group. One sub-set of the Japanese participants had an unpredicted long–short preference. ! e authors concluded that this re3 ects a language-speci# c bias, indicating that

    ABB 1(A)

    (C)

    (B)

    ABB15 s 0.5 s 1.5 s 0.5 s 0.5 s 1.5 s 1.5 s 1.5 s 1.5 s 0.5 s

    CDD EFF GHH IJJ KLL MNN OPP QRR

    RHLH

    superior

    anterior

    superior

    anterior

    Channel 2

    Channel 4

    Channel 7

    Channel 9

    Channel 10

    Channel 12

    Channel 11 Channel 24

    Channel 21

    Channel 23

    Channel 20

    Channel 18

    Channel 16

    Channel 19 Channel 14

    Channel 17Channel 22

    ABB Oxy ABB Deoxy

    0.10.05

    0–0.05–0.1

    10 20 30 40

    Channel 13

    Channel 15

    Channel 6Channel 1

    Channel 3 Channel 8

    STT

    ABC 1 ABB 2 ABB 3silence silence silence

    [10 sentences]

    silence

    18 s 35 s 18 s 18 s 18 s25 s 25 s 35 s

    [2 x 14 blocks]

    [22 min]

    0.10.05

    –0.05–0.1

    0

    10 20

    ABC OxyLH RHABC Deoxy30 40

    Channel 5

    Figure 31.2. Newborns can discriminate adjacent repetitions (ABB) from random speech sequences (ABC). A . Experimental pro-cedure. B . Probe placement. C . Hemodynamic responses obtained for the ABB and ABC stimuli (plotting follows probe placement illustrated in B ). Reproduced from Gervain et al. (2008) with permission.

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 91931_Zelazo-V1_Ch31.indd 919 9/12/2012 8:42:56 PM9/12/2012 8:42:56 PM

  • 920 Speech Perception in Infancy

    the Iambic/Trochaic Law emerges as a result of lan-guage experience.

    ! e above-discussed studies seem to point in di" erent directions, leaving the question about the origins of the grouping principle unanswered. However, two issues need to be considered. First, the two language pairs di" er in the prosodic unit involved in the comparison. English and French have di" erent rhythmic patterns at the word level, but both have VO (i.e., iambic rhythm) at the phrasal level. Japanese and English, by contrast, di" er at the phrasal level, since Japanese has an OV-type (i.e., trochaic) prosody. Second, Iversen and colleagues (2008) found a bimodal response pattern, which means that one subset of their participants had an unpredicted preference for the durational contrast, but the other, though smaller, subset showed the expected short–long grouping preference. Taken together, the two adult experiments, as well as the above considerations, might be best understood as indicating that just as the iambic/trochaic law pre-dicts, there is a universal and language-independent grouping preference assigning initial prominence to intensity contrasts and a # nal prominence to durational contrasts, as suggested by Iversen and colleagues’ (2008) English participants and the smaller subset of the Japanese participants. In addi-tion to this language-independent perceptual bias, there might exist another, language-based grouping principle that mirrors the rhythmic patterns found in the native language, as Iversen and colleagues’ (2008) larger Japanese subgroup suggests. In this context, Hay and Diehl’s (2007) # ndings might further con# rm the existence of a universal bias, or alternatively, they can be taken as evidence that if cross-linguistic grouping preferences di" er, the rel-evant level of analysis is that of the phonological phrase, where English and French are similar, not that of the word, where these two languages di" er.

    ! e above hypothesis can be tested developmen-tally, as the inherent and the emergent views on the Iambic/Trochaic Law make di" erent develop-mental predictions: universal, experience-expectant perceptual principles can be expected to appear earlier in development than experience-dependent language-speci# c biases. To date, only a handful of studies have been conducted with infants, and the results are not conclusive. ! e position of phrasal prominence as a cue was # rst tested with infants in the context of language acquisition by Christophe, Nespor, Guasti, and Van Ooyen (2003), who showed that 6- to 12-week-old infants were able to discriminate two languages that have opposite word

    orders and opposite prosodic patterns (French and Turkish), solely on the basis of the di" erent loca-tions of prosodic prominence in the two languages. ! is was achieved by resynthesizing natural utter-ances in the two languages, suppressing segmental identity and all phonological properties other than the location of the prosodic prominence. ! is study demonstrates that even very young infants can dis-criminate the relevant prosodic properties, although it doesn’t address the question of the origin of the principle directly.

    ! is issue was recently investigated by a num-ber of studies that arrived at con3 icting results. Yoshida and colleagues (2010b) exposed English and Japanese infants to alternating sequences of short and long tones. At age 5 to 6 months, neither group showed a preference, but by 7 to 8 months, the English babies exhibited longer looking times to long–short test sequences as opposed to short–long ones, which can be interpreted as a novelty preference. Crucially, the Japanese infants showed the opposite pattern, although their preference was weaker than that of the English-speaking babies. Intensity was not tested in this study, but the results suggest that at least an iambic grouping preference for duration might emerge as a result of experience with language.

    Hay and Sa" ran (2008) found similar results with regard to duration in a series of experiments testing how statistical learning (Sa" ran et al., 1996) and the Iambic/Trochaic Law might interact. ! ey observed that 6-month-old English infants showed no preference when durational cues were pitted against statistical cues, whereas 9-month-olds relied more heavily on durational cues than statistical cues, irrespectively of whether syllables or tones were used as stimuli. A very di" erent developmental trajectory emerged, however, for intensity as a cue. Both 6- and 9-month-old English babies were able to use intensity as a cue to word onsets in the segmenta-tion task.

    Similarly, Bion and colleagues (2011) also found that pitch contrasts resulted in the predicted (high–low) grouping of simple syllables in 7-month-old infants, while durational contrasts did not induce any grouping. Unlike infants, the Italian-speaking adults in this study showed both the predicted (high–low) grouping when syllables of alternating pitch were used, as well as the predicted (short-long) grouping when syllables alternated in duration.

    Surprisingly, other infant studies found the opposite pattern of results. In Trainor and Adams’s (2000) study, 8-month-olds as well as adults treated

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 92031_Zelazo-V1_Ch31.indd 920 9/12/2012 8:42:57 PM9/12/2012 8:42:57 PM

  • 921Werker, Gervain

    lengthening as a cue to the end of groups of tones, whereas intensity did not result in systematic group-ing at either age.

    Interestingly, in music perception longer duration and lower pitch have been shown to reliably indi-cate the # nal boundaries of tonal groups in infants as young as 4.5 months (Jusczyk & Krumhansl, 1993). ! is was true even when other factors, such as the presence or absence of a large durational or pitch discontinuity across boundaries, were con-trolled for. In this study, however, pitch and dura-tion were used conjointly as cues to boundaries.

    In an experiment that directly addressed the role of the grouping bias as a cue to word order, Gervain and Werker (2009; under revision) found evidence that 7-month-old bilingual infants exposed to one VO (English) and one OV (Japanese, Hindi, Punjabi, Farsi, or Korean) language were able to use pitch and duration as cues to word order. When exposed to an arti# cial grammar with VO (iambic) prosody, realized as a durational contrast, bilingual infants preferred test items with VO word order over test items with the opposite word-order pat-tern. When pitch (i.e., OV prosody) was used dur-ing familiarization, the opposite pattern was found: infants showed a preference for the OV order.

    Taken together, these studies suggest that at around 7 to 9 months, language-speci# c grouping biases emerge, as suggested by our hypothesis above. However, results are relatively weak, and studies are inconsistent as to which of the two cues is more reli-able. It seems clear, though, that at a younger age, roughly between 5 and 7 months, grouping prefer-ences seems to be absent. ! is appears to contra-dict the # rst tenet of our interpretation of the adult data, namely that there exists an initial and univer-sal grouping principle. However, the youngest ages from 0 to 4 months have not been tested in any of the studies. ! us, it is not impossible that the iam-bic/trochaic grouping principle follows a U-shaped developmental trajectory, with an initial universal bias that disappears before 5 months of age, and a new grouping principle that reemerges, partly as a result of language experience. Further research into newborns’ rhythmic grouping abilities is needed to clarify this issue.

    Conclusion Language acquisition begins long before the # rst

    word. In this chapter we have provided an overview of recent research on infant speech perception—initial biases, mechanisms of change, and the rela-tion between perception and language use—as a

    means to better understand the ontogeny of human language. We have tried throughout to distinguish between those initial experience-expectant percep-tual sensitivities and representations that are com-mon to all human infants, irrespective of the precise prenatal or early experience encountered, and those experience-dependent sensitivities and representa-tions that re3 ect the operation of perceptual tuning or learning. We argue that all initial biases, whether common to the species or in3 uenced by prenatal listening experience, re3 ect an epigenetic process wherein endogenous (factors within the organism, such as trophic factors, the intercellular matrix, etc.) or exogenous (factors in the world that a" ect the developing child through ingestion, perception, etc.) variables in3 uence the expression of genetic potential.

    Our review makes clear that infants begin life with an epigenetically determined set of linguistic biases that form a foundation on which speci# c lan-guage experience can build. Within this constrained set, some language-speci# c tuning appears to require only limited exposure during a narrow window in development, and is then relatively permanently set (experience-expectant). Other patterns of change appear to require more exposure in order to change, and appear more open to continued learning even beyond the infancy period (experience-dependent). Enormous progress has been made on characteriz-ing each of these, thus advancing considerably our understanding of the initial foundations that are in place to serve language acquisition.

    What remains more controversial is just how changing perceptual sensitivities and early represen-tations bootstrap language acquisition. Are the ini-tial perceptual primitives the # rst bits of structure, which, when tuned by experience with the native language, constitute actual linguistic representations that contribute to an abstract linguistic system? Or are the initial sensitivities more broadly shared across perceptual systems, and only later usurped by the linguistic system for speci# c uses? ! e literature reviewed provides support for each. For example, in the realm of phonetic perception there seem to be initial sensitivities that set the range of phonetic distinctions that can be learned. According to one strand of data, distributional (statistical) learning leads to the emergence of language-speci# c percep-tual categories that only after linking to meaning lead to a second-order generalization of phoneme categories that can then be used to direct word learning. Yet, according to another strand of data, learning about native-language phonetic categories

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 92131_Zelazo-V1_Ch31.indd 921 9/12/2012 8:42:57 PM9/12/2012 8:42:57 PM

  • 922 Speech Perception in Infancy

    unfolds via acquired distinctiveness, wherein the pairing of sounds and objects, even before word learning, helps establish the category structure of each. Similarly, in the case of the iambic/trochaic law, there are di" erent results from di" erent studies. ! is suggests either a very general perceptual sensi-tivity to pitch or loudness in stress-initial groupings and to duration in stress-# nal position, which must then be speci# ed for language use. Alternatively, it might suggest a di" erent unfolding for these sensi-tivities within and outside the domain of linguis-tic stimuli. ! e challenge of future work is to tease apart these apparent di" erences, carefully investi-gating the generality, speci# city, and utilization of initial and emergent representations. Only in this way can we fully understand not only when, but how, the changing perceptual system supports and directs the acquisition of language.

    Questions for Future Research 1. How speci# c are initial perceptual biases

    and abilities to language? Are they part of the language faculty or are they shared across di" erent perceptual and cognitive abilities?

    2. How do these perceptual and learning processes interact with experience in multilingual environments, where the input contains several languages?

    3. How is language organized in the newborn and infant brain? Does the attunement process from broad to narrower speech-perception abilities accompanied by functional reorganization in the infant brain?

    4. How does the conceptual system interact with language acquisition (e.g., in word learning or early sentence processing)?

    5. Is the prosodic bootstrapping mechanism contributing to the acquisition of word order a language-independent general auditory bias or is it the result of experience with the native language?

    Notes 1 . ! is work was supported by an ANR Jeunes Chercheurs

    Jeunes Chercheuses grant (nr 21373) to JG. 2 . TP(A->B) = P(AB)/P(A), where TP(A->B) is the tran-

    sition probability between adjacent units A and B, P(X) is the probability of occurrence of unit X.

    References Baker, S. A., Golinko" , R. M., & Petitto, L. A. (2006). New

    insights into old puzzles from infants’ categorical discrimi-nation of silent phonetic units. Language Learning and Development, 2 (3), 147–162.

    Best, C. T., & McRoberts, G. W. (2003). Infant perception of non-native consonant contrasts that adults assimilate in dif-ferent ways. Language and Speech, 46 (2), 183–216.

    Best, C. T., Tyler, M. D., Gooding, T. N., Orlando, C. B., & Quann, C. A. (2009). Development of phonological con-stancy: Toddlers’ perception of native- and Jamaican-accented words. Psychological Science, 20 (5), 539–542.

    Bion, R., Benavides, S., & Nespor, M. (2011). Acoustic mark-ers of prominence in3 uence infants’ and adults’ memory for speech sequences. Language and Speech, 54, 123–140.

    Bonatti, L. L., Pe ñ a, M., Nespor, M., & Mehler, J. (2005). Linguistic constraints on statistical computations: ! e role of consonants and vowels in continuous speech processing. Psychological Science, 16 (6), 451–459.

    Bosch, L., & Sebasti á n Gall é s, N. (2001). Early language di" eren-tiation in bilingual infants. In J. Cenoz & F. Genesee (Eds.), Trends in bilingual acquisition (pp. 71–93). Amsterdam, Netherlands: Benjamins.

    Bosch, L., & Sebasti á n-Gall é s, N. (1997). Native-language rec-ognition abilities in 4-month-old infants from monolingual and bilingual environments. Cognition, 65 (1), 33–69.

    Bosch, L., & Sebasti á n-Gall é s, N. (2003). Simultaneous bilin-gualism and the perception of a language-speci# c vowel contrast in the # rst year of life. Language and Speech, 46 (2), 217–243.

    Burns, T. C., Yoshida, K. A., Hill, K., & Werker, J. F. (2007). ! e development of phonetic representation in bilingual and monolingual infants. Applied Psycholinguistics, 28 (3), 455–474.

    Byers-Heinlein, K., Burns, T. C., & Werker, J. F. (2010). ! e roots of bilingualism in newborns. Psychological Science , 21 (3), 343–348.

    Christophe, A., Dupoux, E., Bertoncini, J., & Mehler, J. (1994). Do infants perceive word boundaries? an empirical study of the bootstrapping of lexical acquisition. Journal of the Acoustical Society of America, 95 (3), 1570–1580.

    Christophe, A., Nespor, M., Guasti, M. T., & Van Ooyen, B. (2003). Prosodic structure and syntactic acquisition: ! e case of the head-direction parameter. Developmental Science, 6 (2), 211–220.

    Church, K. W. (1987). Phonological parsing and lexical retrieval. Cognition, 25 , 53–69.

    Curtin, S. (2010). Twelve-month-olds learn novel word-object pairings di" ering only in stress pattern. Journal of Child Language, 105, 376–385.

    Curtin, S., & Werker, J. F. (2007). ! e perceptual foundation of phonological development. In G. Gaskell (Ed.), " e Oxford handbook of psycholinguistics (pp. 579–599). Oxford: Oxford University Press.

    Curtin, S., Fennell, C. T., & Escudero, P. (2009). Weighting of acoustic cues explains patterns of word-object associative learning. Developmental Science, 12, 725–731.

    Curtin, S., Mintz, T. H., & Byrd, D. (2001). Coarticulatory cues enhance infants’ recognition of syllable sequences in speech. In A. H. J. Do, L. Dominguez & A. Johansen (Eds.), Proceedings of the 25th annual Boston University Conference on Language Development (pp. 190–201). Somerville, MA: Cascadilla Press.

    Cutler, A. (1994). Segmentation problems, rhythmic solutions. Lingua, 92 , 81–104.

    Cutler, A., & Carter, D. M. (1987). ! e predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2 , 133–142.

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 92231_Zelazo-V1_Ch31.indd 922 9/12/2012 8:42:57 PM9/12/2012 8:42:57 PM

  • 923Werker, Gervain

    DeCasper, A. J., & Fifer, W. P. (1980). Of human bonding: Newborns prefer their mothers’ voices. Science, 208 (4448), 1174–1176.

    Dehaene-Lambertz, G., & Gliga, T. (2004). Common neural basis for phoneme processing in infants and adults. Journal of Cognitive Neuroscience, 16 (8), 1375–1387.

    Dehaene-Lambertz, G., Dehaene, S., & Hertz-Pannier, L. (2002). Functional neuroimaging of speech perception in infants. Science, 298 (5600), 2013–2015.

    Dehaene-Lambertz, G., Hertz-Pannier, L., Dubois, J., & Dehaene, S. (2008). How does early brain organization promote language acquisition in humans? European Review, 16 (4), 399–411.

    Dietrich, C., Swingley, D., & Werker, J. F. (2007). Native lan-guage governs interpretation of salient speech sound di" er-ences at 18 months. Proceedings of the National Academy of Sciences USA, 104 (41), 16027–16031.

    Eimas, P. D., Siqueland, E. R., Jusczyk, P. W., & Vigorito, J. (1971). Speech perception in infants. Science, 171 (968), 303–306.

    Endress, A. D., & Bonatti, L. L. (2007). Rapid learning of syl-lable classes from a perceptually continuous speech stream. Cognition, 105 (2), 247–299.

    Fennell, C. T., & Werker, J. F. (2003). Early word learners’ ability to access phonetic detail in well-known words. Language and Speech, 46 (2), 245–264.

    Fennell, C. T., & Werker, J. F. (2004). Infant attention to pho-netic detail: Knowledge and familiarity e" ects. Proceedings of the 28th Annual Boston University Conference on Language Development (pp. 165–176).

    Fiser, J., & Aslin, R. N. (2002). Statistical learning of new visual feature combinations by infants. Proceedings of the National Academy of Sciences USA, 99 (24), 15822–15826.

    Fiser, J., & Aslin, R. N. (2005). Encoding multielement scenes: Statistical learning of visual feature hierarchies. Journal of Experimental Psychology, General, 134 (4), 521–537.

    Fulkerson, A. L., & Waxman, S. R. (2007). Words (but not tones) facilitate object categorization: Evidence from 6- and 12-month-olds. Cognition, 105 (1), 218–228.

    Gervain, J., & Werker, J. (2009). Frequency and prosody bootstrap word order: A cross-linguistic study with 7-month-old infants . Poster presented at SRCD 2009 Biennial Meeting, April 2–4, 2009, Denver.

    Gervain, J., & Werker, J. F. (under revision). Prosody cues word order in 7-month-old bilingual infants.

    Gervain, J., & Werker, J. F. (2008). How infant speech per-ception contributes to language acquisition. Language and Linguistics Compass, 2 , 1149–1170.

    Gervain, J., Macagno, F., Cogoi, S., Pe ñ a, M., & Mehler, J. (2008). ! e neonate brain detects speech structure. Proceedings of the National Academy of Sciences USA, 105 (37), 14222–14227.

    Gomez, R. L., & Gerken, L. (1999). Arti# cial grammar learn-ing by 1-year-olds leads to speci# c and abstract knowledge. Cognition, 70 (2), 109–135.

    Gottlieb, G. (1976). ! e roles of experience in the development of behavior and the nervous system. In G. Gottlieb (Ed.), Neural and behavioral speci! city (pp. 25–53). New York: Academic Press.

    Greenough, W. T., Black, J. E., & Wallace, C. S. (1987). Experience and brain development. Child Development, 58 (3), 539–559.

    Harris, Z. (1955). From phoneme to morpheme. Language, 31 , 190–222.

    Hauser, M. D., Newport, E. L., & Aslin, R. N. (2001). Segmentation of the speech stream in a non-human primate: Statistical learning in cotton-top tamarins. Cognition, 78 (3), B53–B64.

    Hay, J. S., & Diehl, R. L. (2007). Perception of rhythmic grouping: Testing the iambic/trochaic law. Perception & Psychophysics, 69 (1), 113–122.

    Hay, J., & Sa" ran, J. (2008). Perceptual constraints on segmen-tation of speech and non-speech . 33rd BUCLD, October 31–November 2, 2008, Boston, MA.

    Hayes, B. (1995). Metrical stress theory: Principles and case studies . Chicago: University of Chicago Press.

    Hayes, J. R., & Clark, H. H. (1970). Experiments in the seg-mentation of an arti# cial speech analogue. In J. R. Hayes (Ed.), Cognition and the development of language (). New York: Wiley.

    Hohne, E. A., & Jusczyk, P. W. (1994). Two-month-old infants’ sensitivity to allophonic di" erences. Perception and Psychophysics, 56 , 613–623.

    Iversen, J. R., Patel, A. D., & Ohgushi, K. (2008). Perception of rhythmic grouping depends on auditory experi-ence. Journal of the Acoustical Society of America, 124 (4), 2263–2271.

    Jusczyk, P. W., & Aslin, R. N. (1993). Recognition of familiar patterns in 3 uent speech by 7–1/2 month old infants (pp. 1–11).

    Jusczyk, P. W., & Krumhansl, C. L. (1993). Pitch and rhythmic patterns a" ecting infants’ sensitivity to musical phrase struc-ture. Journal of Experimental Psychology, Human Perception and Performance, 19 (3), 627–640.

    Jusczyk, P. W., Hohne, E. A., & Bauman, A. (1999). Infants’ sen-sitivity to allophonic cues to word segmentation. Perception & Psychophysics, 61 , 1465–1476.

    Jusczyk, P. W., Houston, D. M., & Newsome, M. R. (1999). ! e beginnings of word segmentation in English-learning infants. Cognitive Psychology, 39 (3), 159–207.

    Kuhl, P. K. (1979). Speech perception in early infancy: Perceptual constancy for spectrally dissimilar vowel categories. Journal of the Acoustical Society of America, 66 (6), 1668–1679.

    Kuhl, P. K. (1980). Perceptual constancy for speech-sound categories in early infancy. In G. H. Yeni-Komshian, J. F. Kavanagh, & C. A. Ferguson (Eds.), Child phonology: Perception (pp. 41–66). New York: Academic Press.

    Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5 (11), 831–843.

    Kuhl, P. K., Stevens, E., Hayashi, A., Deguchi, T., Kiritani, S., & Iverson, P. (2006). Infants show a facilitation e" ect for native language phonetic perception between 6 and 12 months. Developmental Science, 9 (2), F13–f21.

    Kuhl, P. K., Tsao, F., & Liu, H. (2003). Foreign-language expe-rience in infancy: E" ects of short-term exposure and social interaction on phonetic learning. Proceedings of the National Academy of Sciences USA, 100 (15), 9096–9101.

    Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255 (5044), 606–608.

    Marchetto, E., & Bonatti, L. L. (under revision). Finding words and rules in a speech stream at 7 and 12 months.

    Marcus, G. F., Vijayan, S., Rao, S. B., & Vishton, P. M. (1999). Rule learning by seven-month-old infants. Science, 283 (5398), 77–80.

    AQ: Please provide

    page range for the

    reference

    AQ: Please provide

    complete details for the

    reference

    AQ: Please update the reference if it has been published.

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 92331_Zelazo-V1_Ch31.indd 923 9/12/2012 8:42:57 PM9/12/2012 8:42:57 PM

    judit

  • 924 Speech Perception in Infancy

    Mattys, S. L., & Jusczyk, P. W. (2001a). Do infants seg-ment words or recurring contiguous patterns? Journal of Experimental Psychology: Human Perception and Performance, 27 (3), 644–655.

    Mattys, S. L., & Jusczyk, P. W. (2001b). Phonotactic cues for segmentation of 3 uent speech by infants. Cognition, 78 (2), 91–121.

    Mattys, S. L., Jusczyk, P. W., Luce, P. A., & Morgan, J. L. (1999). Phonotactic and prosodic e" ects on word segmentation in infants. Cognitive Psychology, 38 (4), 465–494.

    May, L., Byers-Heinlein, K., Gervain, J., & Werker, J. F. (2011). Language and the newborn brain: does prenatal language experience shape the neonate neural response to speech? Frontiers in Psychology , 222, 22.

    May, L., Gervain, J., Carreiras, M., & Werker, J. F. (2011). Neural specialization for speech at birth: Comparing native and non-native language. Poster presented at the Neurobiology of Language Conference, Baltimore, Maryland, November 2011.

    Maye, J., & Gerken, L. (2001). Learning phonemes: How far can the input take us? Proceedings of the Annual Boston University Conference on Language Development, 25 (2), 480–490.

    Maye, J., Weiss, D., & Aslin, R. N. (2008). Statistical phonetic learning in infants: Facilitation and feature generalization. Developmental Science, 11 , 122–134.

    Maye, J., Werker, J. F., & Gerken, L. (2002). Infant sensitivity to distributional information can a" ect phonetic discrimina-tion. Cognition, 82 (3), B101–B111.

    Mehler, J., Dommergues, J., Frauenfelder, U. H., & Segui., J. (1981). ! e syllable’s role in speech segmentation. Cognitive Psychology, 18 , 1–86.

    Mehler, J., Jusczyk, P. W., Lambertz, G., Halsted, N., Bertoncini, J., & Amiel-Tison, C. (1988). A precursor of language acquisi-tion in young infants. Cognition, 29 , 143–178.

    Moon, C., Cooper, R. P., & Fifer, W. P. (1993). Two-day-olds prefer their native language. Infant Behavior and Development, 16 (4), 495–500.

    Morgan, J. L. (1996). A rhythmic bias in preverbal speech segmentation. Journal of Memory and Language, 35 (5), 666–688.

    Morgan, J. L., & Sa" ran, J. R. (1995). Emerging integration of sequential and suprasegmental information in preverbal speech segmentation. Child Development, 66 , 911–936.

    Morgan, J. L., Shi, R., & Allopenna, P. (1996). Perceptual bases of rudimentary grammatical categories: Toward a broader conceptualization of bootstrapping. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax (pp. 263–283). Hillsdale, NJ: Lawrence Erlbaum Associates.

    Mugitani, R., Pons, F., Fais, L., Dietrich, C., Werker, J. F., & Amano, S. (2009). Perception of vowel length by Japanese- and English-learning infants. Developmental Psychology, 45 (1), 236–247.

    Narayan, C. R., Werker, J. F., & Beddor, P. S. (2010). Acoustic salience a" ects speech perception in infancy: Evidence from nasal place discrimination. Developmental Science, 13 (3), 407–420.

    Nazzi, T., & Bertoncini, J. (2003). Before and after the vocabu-lary spurt: Two modes of word acquisition? Developmental Science, 6 (2), 136–142.

    Nazzi, T., Bertoncini, J., & Mehler, J. (1998). Language dis-crimination by newborns: Toward an understanding of the role of rhythm. Journal of Experimental Psychology: Human Perception and Performance, 24 (3), 756–766.

    Nazzi, T., Paterson, S. J., & Karmilo" -Smith, A. (2003). Word segmentation by infants with Williams syndrome: Use of prosodic and statistical cues. Infancy, 4 (2), 251–271.

    Nespor, M., & Vogel, I. (1986). Prosodic phonology . Dordrecht, Holland; Riverton, NJ: Foris.

    Nespor, M., Guasti, M. T., & Christophe, A. (1996). Selecting word order: ! e rhythmic activation principle. In U. Kleinhenz (Ed.), Interfaces in phonology (pp. 1–26) Akademie Verlag.

    Nespor, M., Shukla, M., van de Vijver, R., Avesani, C., Schraudolf, H., & Donati, C. (2008). Di" erent phrasal prominence realization in VO and OV languages. Lingue e Linguaggio, 7 (2), 1–28.

    Newman, R. S. (2008). ! e level of detail in infants’ word learning. Current Directions in Psychological Science, 17 (3), 229–232.

    Newport, E. L., & Aslin, R. N. (2004). Learning at a distance, I. Statistical learning of non-adjacent dependencies. Cognitive Psychology, 48 (2), 127–162.

    Newport, E. L., Hauser, M. D., Spaepen, G., & Aslin, R. N. (2004). Learning at a distance, II. Statistical learning of non-adjacent dependencies in a non-human primate. Cognitive Psychology, 49 (2), 85–117.

    Palmer, S. B, Fais, L., Golinko" , R. M., & Werker, J. F. (2012). Perceptual narrowing of linguistic sign occurs in the # rst year of life. Child Development .

    Pe ñ a, M., Bonatti, L. L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298 (5593), 604–607.

    Pe ñ a, M., Maki, A., Kovacic, D., Dehaene-Lambertz, G., Koizumi, H., Bouquet, F., et al. (2003). Sounds and silence: An optical topography study of language recognition at birth. Proceedings of the National Academy of Sciences USA, 100 (20), 11702–11705.

    Pinker, S. (1984). Language learnability and language develop-ment . Cambridge, MA: Harvard University Press.

    Polka, L., & Werker, J. F. (1994). Developmental changes in per-ception of nonnative vowel contrasts. Journal of Experimental Psychology: Human Perception and Performance, 20 (2), 421–435.

    Quine, W. V. (1960). Word and object . Cambridge, MA: MIT Press.

    Ramus, F. (2002). Language discrimination by newborns: Teasing apart phonotactic, rhythmic, and intonational cues. Annual Review of Language Acquisition, 2 , 85–115.

    Ramus, F., & Mehler, J. (1999). Language identi# cation with suprasegmental cues: A study based on speech resynthe-sis. Journal of the Acoustical Society of America, 105 (1), 512–521.

    Ramus, F., Hauser, M. D., Miller, C., Morris, D., & Mehler, J. (2000). Language discrimination by human newborns and by cotton-top tamarin monkeys. Science, 288 (5464), 349–351.

    Reber, A. S. (1967). Implicit learning of arti# cial grammars. Journal of Verbal Learning and Verbal Behavior, 6 , 855–863.

    Reber, A. S. (1969). Transfer of syntactic structure in synthetic languages. Journal of Experimental Psychology, 81 , 115–119.

    Sa" ran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274 (5294), 1926–1928.

    Sa" ran, J. R., & ! iessen, E. D. (2003). Pattern induction by infant language learners. Developmental Psychology, 39 (3), 484–494.

    AQ: Author,

    please give volume # and page

    range.

    OUP UNCORRECTED PROOF – FIRSTPROOFS, Mon Sep 10 2012, NEWGEN

    31_Zelazo-V1_Ch31.indd 92431_Zelazo-V1_Ch31.indd 924 9/12/2012 8:42:57 PM9/12/2012 8:42:57 PM

    judit

  • 925Werker, Gervain

    Sa" ran, J. R., & Wilson, D. P. (2003). From syllables to syn-tax: Multilevel statistical learning by 12-month-old infants. Infancy, 4 (2), 273–284.

    Sansavini, A., Bertoncini, J., & Giovanelli, G. (1997). Newborns discriminate the rhythm of multisyllabic stressed words. Developmental Psychology, 33 (1), 3–11.

    Sebasti á n Gall é s, N., & Bosch, L. (2009). Developmental shift in the discrimination of vowel contrasts in bilingual: Is the dis-tributional account all there is to it? Developmental Science, 12 (6), 874–887.

    Segui, J., Dupoux, E., & Mehler, J. (1992). ! e role of syllable in speech segmentation, phoneme identi# cation, and lexical access. In G. Altmann (Ed.), Cognitive models of speech pro-cessing: Psycholinguistic and computational perspectives (pp. 263–280). Cambridge, MA: Bradford Books.

    Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27 , 379–423 and 623–656.

    Shi, R., Werker, J. F., & Morgan, J. L. (1999). Newborn infants’ sensitivity to perceptual cues to lexical and grammatical words. Cognition, 72 (2), B11–B21.

    Singh, L., Morgan, J. L., & White, K. S. (2004). Preference and processing: ! e role of speech a" ect in early spoken word rec-ognition. Journal of Memory and Language, 51 (2), 173–189.

    Stager, C. L., & Werker, J. F. (1997). Infants listen for more pho-netic detail in speech perception than in word-learning tasks. Nature, 388 (6640), 381–382.

    Sun, T., Collura, R. V., Ruvolo, M., & Walsh, C. A. (2006). Genomic and evolutionary analyses of asymmetrically expressed genes in human fetal left and right cerebral cortex. Cerebral Cortex, 16 Suppl 1 , i18–25.

    Sundara, M., Polka, L., & Genesee, F. (2006). Language-experience facilitates discrimination of / ð / in monolingual and bilingual acquisition of English. Cognition, 100 (2), 369–388.

    Sundara, M., Polka, L., & Moln á r, M. (2008). Development of coronal stop perception: Bilingual infants keep pace with their monolingual peers. Cognition, 108 (1), 232–242.

    ! iessen, E. D. (2007). ! e e" ect of distributional information on children’s use of phonemic contrasts. Journal of Memory & Language, 56 (1), 16–34.

    ! iessen, E. D., & Sa" ran, J. R. (2003). When cues collide: Use of stress and statistical cues to word bound