19
CHAPTER ONE COGNITION AND SECOND LANGUAGE ACQUISITION 1.1 An emergentist view of cognition Emergentism in science is a wide framework for thinking about different areas of knowledge, such as physics, geology, biology, and cognition, to name a few. According to MacWhinney, “emergentist accounts emphasize complex interactions between multiple factors across multiple time scales” (2006, 731). The phrases “multiple interactions” and “multiple time scales” incorporate the core features of this approach to cognition and language acquisition. The idea of multiple interactions involves a continuum between micro and macro structures and mechanisms, ranging from interactions at the gene level to interactions between the learner and the environment surrounding him. It also indicates a strong reliance on domain general mechanisms in human cognition that permeate the emergent patterns of learning and development across one’s lifespan (Elman et al, 1996). When we highlight the phrase multiple time scales, we want to underscore the importance of time in cognitive and linguistic development. As Elman (2003, 430) points out, “explaining why a behavior changes over time can be key to understanding the behavior itself”. Most higher–level cognitive processes evolve in time and are also constrained by time. A good example is speech production. Producing spoken language involves the reduction of dimensions in the mapping of a nonlinear channel—the phonetic– phonological, syntactic, semantic representations—to a linear channel such as the phono–articulatory tract (Bates & Goodman, 2001). Therefore, speech production does not consist of a simple sum of isolated sounds bound together in a sequence; rather, it is an orchestration of articulators—such as lips, teeth, tongue tip, blade and root—whose movements, or interaction of gestures over time, results in the production of sounds. It all means that the idea of a single, “pure” sound followed by another pure sound, and so on, does not match the dynamics of speech production, during which sounds influence and are highly influenced by the other sounds that follow and precede one another. L1 and L2 learners make sense of meaning in the ambient language by extracting patterns from the flowing dynamics of sounds in time.

Cognition and second language acquisition

  • Upload
    ufrgs

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

chapter one

cognition anD seconD language acquisition

1.1 An emergentist view of cognition

Emergentism in science is a wide framework for thinking about different areas of knowledge, such as physics, geology, biology, and cognition, to name a few. According to MacWhinney, “emergentist accounts emphasize complex interactions between multiple factors across multiple time scales” (2006, 731). The phrases “multiple interactions” and “multiple time scales” incorporate the core features of this approach to cognition and language acquisition.

The idea of multiple interactions involves a continuum between micro and macro structures and mechanisms, ranging from interactions at the gene level to interactions between the learner and the environment surrounding him. It also indicates a strong reliance on domain general mechanisms in human cognition that permeate the emergent patterns of learning and development across one’s lifespan (Elman et al, 1996).

When we highlight the phrase multiple time scales, we want to underscore the importance of time in cognitive and linguistic development. As Elman (2003, 430) points out, “explaining why a behavior changes over time can be key to understanding the behavior itself”. Most higher–level cognitive processes evolve in time and are also constrained by time. A good example is speech production. Producing spoken language involves the reduction of dimensions in the mapping of a nonlinear channel—the phonetic–phonological, syntactic, semantic representations—to a linear channel such as the phono–articulatory tract (Bates & Goodman, 2001). Therefore, speech production does not consist of a simple sum of isolated sounds bound together in a sequence; rather, it is an orchestration of articulators—such as lips, teeth, tongue tip, blade and root—whose movements, or interaction of gestures over time, results in the production of sounds. It all means that the idea of a single, “pure” sound followed by another pure sound, and so on, does not match the dynamics of speech production, during which sounds influence and are highly influenced by the other sounds that follow and precede one another. L1 and L2 learners make sense of meaning in the ambient language by extracting patterns from the flowing dynamics of sounds in time.

Chapter 12

How can the cognition of language processing and, in a narrower sense, speech production be viewed in an emergentist framework? First of all, by viewing the brain as a complex cognitive system endowed with plasticity, which is to some extent shaped by experience (Elman et al, 1996), that is, resulting from multiple interactions distributed across different cortical areas. Specialized modules in the brain can thus be viewed as an emergent result from such interactions and the human experience of learning (Hernandez, Li & MacWhinney, 2005; Ellis & Larsen–Freeman, 2007). Second, cognition can be viewed in an emergentist framework because it focuses on the process of learning, that is, on the role of cognitive change over time. One of the various approaches which emphasize the process of learning over time within the broader emergentist framework is connectionism.

The connectionist paradigm, aided by computational simulation of neural networks, has provoked a great deal of debate in psychology and linguistics by claiming that processes such as learning and cognition are constrained by the way neurons operate in the brain. Connectionist studies are carried out taking into consideration the physical environment in which these processes occur. The principles that govern cognition are those specifying the processes taking place in these units and their connections, such that complex behaviors emerge out of simple units. According to Broeder and Plunket (1994), an important feature of connectionist systems is their capacity to generate spontaneous generalizations by extension from particular experiences. Such a capacity makes cognition extremely flexible and plastic. Learning, under this perspective, implies changing existing connections among units in neural networks. Previous knowledge therefore plays a fundamental role, since every new piece of information is integrated into the neural network, changing the knowledge encoded therein. These changes are not abrupt: they unfold in time. The notion of timing is thus materialized not only through the architecture of the neural network, but also through the existing knowledge that has been gradually processed over the learner’s lifetime. Thus, learning evolves in time and is dependent on the cognitive structure, on the input structure, and on the learner’s experience.

After a brief account of the main features of emergentist principles regarding cognition, it is now time to turn to how connectionism envisages language acquisition.

Cognition and Second Language Acquisition 3

1.2 Language acquisition and emergentism

Emergentism conceives of language acquisition as being performance oriented (Seidenberg & McDonald 1999, 570). This is to say that the target in acquisition is the ambient language to which the learner is exposed. The “poverty of the stimuli”, one of the most important tenets of Generative Linguistics, is strongly rejected, as well as the notion of “competence”, and the postulation of a specific language acquisition device. As Seidenberg and McDonald (1999) point out, from the connectionist perspective, learning a language does not imply the acquisition of an abstract system of linguistic rules. The postulation of the existence of underlying deep structures derived in various strata through generative rules, which would be in turn responsible for surface structures, is no longer required. Instead, language acquisition is guided by probabilistic constraints and regularities in the ambient linguistic input.

According to Christiansen, Allen and Seidenberg (1998, 261), there are sufficient details which the learner can extract from language in use, so as to overcome the apparent “poverty of the stimuli” postulated by Chomsky (1975) in the context of arguments for a generative grammar. It is also imperative to stress that the reliance on statistical and probabilistic information present in the input entails continuity between learning and processing. Seidenberg and McDonald (1999, 576) state that both the processes of language learning and language use involve the same cognitive mechanisms. This should not be surprising, given that the ambient language is not only the source of information from which generalizations can be made but is also the actual target of language acquisition. Furthermore, the absence of certain structural patterns in the input works as implicit negative evidence, since the reinforcement of forms present in the input tends to lower the probability of the emergence of those structures which cannot be observed in the ambient language. In short, learning can be seen as the result of the capacity to observe simultaneously multiple probabilistic constraints, so that aspects which would not be relevant when considered in isolation become relevant when they are processed along with other probabilistic aspects also present in the input.

Linguistic input takes on a fresh role in emergentism: it is seen as being sufficiently rich to drive learning, based on its probabilistic information. This assumption can be summed up in four main claims: 1) the linguistic environment is rich in distributional regularities which guide language learning; 2) language acquisition requires the exploration of probabilistic constraints contained in the various types of linguistic and nonlinguistic

Chapter 14

information; 3) it is hard to make a clear distinction between linguistic and nonlinguistic knowledge, since learning depends on both the structured input and on the learner’s previous knowledge; 4) the distributional information can provide implicit negative evidence (Rhode & Plaut, 2003; Seidenberg & Zevin, 2006). Some of these principles rely on the concept of statistical learning, which was studied by Harris (1955) in the structuralist–distributionalist tradition, and was later revisited by Saffran and colleagues (Saffran, 2001; Seidenberg, MacDonald and Saffran, 2001; Saffran, Aslin and Newport, 1996).

As for the traditional notion of phonological acquisition in a symbolic paradigm, it seems that an input representation conceived as a deep structure far apart from the output form cannot be accepted without reservation. An input form cannot be distinct from the oral stimulus to which the learner is exposed. Joanisse (2000) makes it clear that the ambient language is not only the starting point from which the learner will extract the regularities of the linguistic system, but it is also the end–point, or target, of language acquisition itself.

It follows from these arguments that it is pertinent to bring into this discussion issues such as age of the language learner (the critical period), access to Universal Grammar (UG), and the concept of markedness, all of which issues are considered, from a symbolic perspective, but not in emergentist aproaches, to influence the ultimate L2 attainment . These issues will be discussed in the following section 1.3, where the notion of transfer, as emergentism assumes that linguistic knowledge is woven from the same well as the relationship between input perception and output production, will be reviewed.

1.3 SLA and the notion of transfer

In this section, the acquisition of an L2 draws on the notion of statistical learning—which is guided by statistical and probabilistic aspects in the input—and the notion of transfer from L1 knowledge into the construction of L2 knowledge.

1.3.1 L2 Learning

According to Gasser (1990), L2 knowledge is acquired in a similar way to L1 knowledge. L2 acquisition also relies heavily on L2 stimuli, that is,

Cognition and Second Language Acquisition 5

the input plays a paramount role in an emergentist approach to SLA. These assertions could lead one to erroneously infer that the production of L2 items would merely consist of copying the input, without moderation by any single cognitive barrier. However, if that were the case, students would always produce the target language exactly as native speakers do. Intra and extralinguistic aspects are heavily influential on the language production of learners, as stated by Celce–Murcia et al. (1996). Some of these aspects are regarded as crucial in the symbolic paradigm. The postulation of a critical period as theoretical support for the maturational program in language acquisition, the discussion of access to Universal Grammar, and the concept of markedness are all issues that are tackled in a distinct fashion by emergentist and connectionist views of SLA.

We agree with Munro and Bohn (2007) that a great deal of research on age and speech learning revolves around issues concerning the fact that L2 learners who learn a language after early childhood—the so–called late learners—show patterns of speech perception and production which are distinct from early learners. The matter of the problem resides in how differences in performance concerning age–related effects are interpreted. The Critical Period Hypothesis has been presented as one of the main explanations why older learners, despite being initially faster at learning the L2 system, tend not to reach degrees of proficiency as high as those attained by younger learners. This hypothesis seems to account for the existence of both a genetically determined maturational program and innate language acquisition mechanisms which would facilitate L2 learning during childhood (Jonhson and Newport, 1989; Newport, 1990). Flege and colleagues, on the other hand, in their numerous studies on the effects between AOA (age of arrival) and AOL (age of learning) accent, have provided a different explanation for the general better performance of early learners, based on the role of the experience with the mother language (Flege, 1998, 2003; Flege, Munro, MacKay, 1995; Flege, Yeni–Komshian & Liu, 1999; Flege,1988). Connectionist researchers (Rhode and Plaut, 1999), in turn, claim that production data relative to effects of late exposure to an L2 can be explained without falling back on maturational changes or innate devices. In their view, L2 adult learners may fail to reach high levels of proficiency simply because their cognitive system has been largely committed to other cognitive tasks—including L1 comprehension and production. Children, in turn, are more likely to reach a better performance level owing to the fact that their cognitive system is not entirely entrenched in L1 knowledge. This view is in accordance with the findings by many scholars, connectionist and other renowned researchers in SLA (Seidenberg and Zevin, 2006; MacWhinney,

Chapter 16

2006, 2007; Flege, 2002, 2003; McClelland, 2001; Marinova–Todd et al., 2000; Plaut and Kello, 1999) that the brain retains its plasticity throughout life. Therefore, differences between proficiency levels reached by adults and children are more likely to be related to the amount of available data in the ambient language rather than differences in terms of innate abilities or a maturational program. In short, learners’ oral production may be directly linked to their linguistic experience with the source language as well as the target language.

Concerning the stipulation of a Universal Grammar (UG), the generative approach assumes that human beings inherit the set of principles and parameters which guide language acquisition. The principles are the same for all languages; language acquisition consists of the specification of these principles by setting binary (yes/no) parameters. The debate on the role of UG in the SLA arena revolves around three distinct intra–theoretical viewpoints: a) the first hypothesis, stemming directly from the maturational hypothesis, postulates that L2 learners have no access to UG (e.g., Meisel, 1991); b) the second position, which establishes partial access to UG, insofar as only the values of those parameters instantiated through the L1 are available for later use by the L2 learners. This may account for the difficulties faced by learners in their task of resetting in the L2 those parameters which had been previously set for the L1 (Bley–Vroman, 1983; Schachter, 1989); c) the third hypothesis advocates full access to UG, which shapes L2 acquisition entirely (Epstein, Flynn and Martohardjono, 1996; Finger, 2003).

On the other hand, the emergentist framework for language acquisition dispenses with the UG construct, given its commitment to biological plausibility in the process of language learning. Ellis (1999) states that such hypotheses do not present any plausible explanation in terms of cognitive processing, given that a) cortical modularity and specialization may be the result, and not the cause, of learning and developing automaticity; b) functional imaging studies are revealing a whole range of different brain areas involved in language processing, extending beyond the classical language centres; c) considerable individual differences have been found in cerebral cortical specialization; d) none of the implicated brain areas are employed solely by language, but are also involved in other cognitive functions.

Another issue frequently discussed in the field of SLA is markedness. Within the generative tradition, Eckman (1986, 198) defines markedness as follows: “A phenomenon A in some language is more marked than B if the presence of A in a language implies the presence of B; but the presence of B does not imply the presence of A.” Silveira (2004) provides a good

Cognition and Second Language Acquisition 7

example of markedness by explaining that a language like English, which has three–element consonantal sequences in word–final position, is more marked than Brazilian Portuguese, which in turn allows only sequences of two consonants in the same position.

This classical view of markedness is also disregarded by emergentism. Joanisse (2000) claims that markedness could be defined by the statistical information contained in the structured input. By this term we mean input containing relevant information, such as its articulatory complexity, its frequency of production in some contexts, and absence in others. This kind of information can be gradually perceived and tallied by the learner (Ellis, 2005). Such ability configures the power of statistical learning, and is certainly performance–oriented. Therefore, more or less marked structures do not have to be associated with the existence of an UG; we can revisit the concept of markedness by associating it with the human capability for tallying important and salient input characteristics, guided by the frequency and consistency of certain segments, gestures or segmental sequences. Thus, consistent and frequent sequences tend to be easier to produce. Phonetic constraints active during the production and perception processes define some structures as being more complex than others. The least frequent and consistent structures are seen by connectionism as the hardest to be acquired. The structures hardest to produce with respect to acoustics and articulation, which are less frequent and articulatorily more complex, are said to be especially prone to changes.

In fact, markedness is such a broad and tautologic concept that Haspelmath (2006) describes twelve different senses for the term markedness, grouped into four larger classes: markedness as complexity, as difficulty, as abnormality, and as a multidimensional correlation. We agree with Haspelmath (2006) that

the term ‘markedness’ is superfluous, because some of the concepts that it denotes are not helpful, and others are better expressed by more straightforward, less ambiguous terms. In a great many cases, frequency asymmetries can be shown to lead to a direct explanation of observed structural asymmetries, and in other cases additional concrete, substantive factors such as phonetic difficulty and pragmatic inferences can replace reference to an abstract notion of ‘markedness’ (25).

The discussion above on the (ir)relevance of markedness leads us to the conclusion that this construct does not help to clarify the issue of interlanguage processes in second language acquisition; rather, it causes a great deal of confusion and ambiguity. In the same way, we claim that

Chapter 18

constructs such as age, markedness and access to UG do not parallel the importance of the learner’s linguistic experience, in terms of exposure to the target language. Such experience is the product of the learners’ capacity to generalize from the data available in the input.

The ability to generalize is fundamental to human cognition. Being that one of the goals of this book is to highlight the role of explicit instruction in minimizing oral productions which are deviant from the target language, it is worth pointing out that such deviant productions result from the learner’s experience with the L1 and L2 inputs, which in turn give rise to the generalization of knowledge from the L1 into the L2 (L1–L2 transfer), and also to overgeneralizations of L2 input. The first type of these generalizations—which will be dealt with in the next section—is interlanguage transfer of both phonetic–phonological and grapho–phonic–phonological knowledge, whereas the second type of generalization is intralanguage transfer.

Although intralanguage transfer is not the main focus of this book, it is important to emphasize that not all deviant productions in the L2 emanate from the experience with the L1. The oversystematization of regularities perceived in the L2 input also results in pronunciation errors. A good example is the tendency of Brazilians to stress the first syllable instead of the last one, as in the word hotel (Silveira, 2004), even though the correct stress pattern for this word in Brazilian Portuguese is exactly the same as in English. The effect of L1–L2 transfer will be described and discussed in the following section.

1.3.2 The Transfer of L1–L2 Knowledge

This section discusses the two forms of interlanguage transfer which seem to influence most strongly deviant L2 oral production: 1) phonetic–phonological transfer, which occurs during L2 oral production; 2) grapho–phonic–phonological1 transfer, which happens during oral reading in the L2, but can also occur during speech production itself. These two types of transfer should not be seen as totally distinct entities, to be considered separately, since the acquisition of a single L2 phonetic–phonological feature can involve both sorts of transfer simultaneously. However difficult the theoretical task of separating these types of transfer may be, the conceptual distinction between them remains germane because it enables researchers and teachers to understand L2 learners’ interlanguage2 systems more clearly.

Cognition and Second Language Acquisition 9

1.3.2.1 L1–L2 Phonetic–Phonological Transfer

Gasser (1990, 189), among many other researchers, has stated that the transfer of L1 patterns into the L2 system is precisely one of the aspects of SLA which connectionist simulations can best replicate. The new linguistic patterns of L2 are perceived by the learner in a way which is biased towards the L1 patterns, which are deeply entrenched in the learner’s cerebral cortex. A foreign accent can thus be described as the product of the activation of acoustic–articulatory patterns which are identical or very similar to the preexisting L1 patterns, since the learner treats the L2 lexical items as if they consisted of strings of L1 acoustic–articulatory units (Zimmer, 2004).

A good example of phonetic–phonological L1–L2 transfer occurring at the segmental level is the acquisition of the dental fricatives [] and [] by Brazilian learners of English. Zimmer (2004) found that Brazilians may replace these fricatives by a set of segments such as [], [], [] and []. Productions such as [f] instead of [], or [d] instead of [], may be strongly connected to the role of perception in categorical contrasts, in terms of segments, between the L1 and the L2. This happens because of the tendency L2 learners have to assign to L2 phonemes the segmental patterns of their L1, so that L2 phones tend to be assimilated to a similar L1 category, that is, they are perceived as if they were L1 phones. This suggests that most deviant L2 phonetic productions seem to surpass what can be explained by the mere articulatory difficulty of the learner, as shall be presented in detail in the next section. However, this latter suggestion does not rule out the possibility that some failures in producing the target forms may be the sole result of articulatory difficulties learners sometimes face. Still regarding the production of the target [], the production of [t] rather than [] seems to be a consequence of spelling interference, that is, grapho–phonic–phonological transfer, which will also be examined in the following section.

Another interesting example of L1–L2 transfer is the transfer of syllabic patterns from the L1 into the L2. Zimmer (2004, 65) lists processes of simplification of consonant clusters resulting in epenthesis (e.g., [] for the target “school”) and final epenthesis after obstruents (plosives, fricatives and affricates), (e.g. [] for the target “got”). The syllable repair strategies discussed here consist of a process of adjustment of the Brazilian Portuguese syllable pattern. Brazilian Portuguese does not allow initial /sk/ sequences, and its coda inventory is restricted to /l/, /r/, /N/ and /S/, besides the semivowels /, w/ (Collischonn 2001, 108).

It is worth mentioning that epenthesis in words such as “take”, “have” and “base” need not always be subject to the same explanation as that just

Chapter 110

given, since, unlike words such as “ask” and “risk”, the former words have a final orthographic but unsounded vowel. The presence of an epenthetic vowel might thus be conforming to the vowel existing in orthography, which presence is a result of the so–called GPC3 rules, rephrased here as grapho–phonic–phonological transfer, a topic to be further discussed below.

1.3.2.2 L1–L2 Grapho–Phonic–Phonological Transfer

Zimmer (2004) made an empirical and a computational study investigating the L1–L2 transfer of grapho–phonic–phonological knowledge (Zimmer and Plaut, 2007). The evidence garnered from those two approaches suggests that it is not only phonological knowledge that underlies transfer during L2 speech and reading aloud, but also the principles of the L1 and L2 alphabetic systems. Although Brazilian Portuguese and English make use of the same alphabetic system, the relationships between orthography and speech production are distinct: in Brazilian Portuguese, the correspondence between graphemes—that is, letters or strings of letters representing a phoneme and its phonetic production—is more transparent than in English, where the correspondence can be highly opaque, due to historical aspects of orthography. Therefore, in L2 speech and oral reading, Brazilian learners tend to assign the same phonemes they would activate when speaking or reading in their L1, as a result of their entrenched L1 alphabetic knowledge.

Going back to the example mentioned above, concerning the final epenthesis in words such as “take” [], it is now clear that two sources of transfer might be in action: 1) L1–L2 pattern transfer, a case in which an epenthetic vowel is produced to conform to a syllabic structure of the L1; this is also the case with words like ask and risk; 2) L1–L2 grapho–phonic–phonological transfer, in which case the final vowel in [] is a result of spelling. If the latter kind of transfer is the only one still active in the learner’s interlanguage systems, then words such as ask and risk would no longer present final epenthesis, since they contain no final vowel in their spelling.

Silveira (2004) collected data from Brazilian EFL learners to test the hypothesis that orthography is an important variable that contributes to the occurrence of vowel epenthesis in English words ending in a silent –e grapheme (e.g., take). Her study shows that orthography appeared to be a relevant factor in determining the rate of vowel epenthesis, since words ending with a consonantal grapheme followed by a silent “e” triggered significantly higher epenthesis rates than those ending in a consonantal

Cognition and Second Language Acquisition 11

grapheme only (e.g., risk) in the pre and post–tests of the control group, and in the pretest of the experimental group, who received instruction and exercises in this matter. Nevertheless, the results for the experimental group post–test indicated that pronunciation instruction diminished the effects of orthography on the production of word–final consonants. Moreover, spelling also caused participants to transfer L1 processes such as the deletion of nasals, with the preceding vowel assimilating the nasal feature, and the substitution of alveopalatal affricates for alveolar stops.

In another investigation, Silveira (2007) examined the extent to which the variable orthography can explain the pronunciation difficulties faced by Brazilian learners when pronouncing English words. A case study design was adopted so that a wide variety of sounds could be analyzed. The participant, a Brazilian who started learning English as an adult, was recorded on three different occasions when he was interacting with a native speaker of British English during private lessons (free–speech task). In addition, the participant was recorded reading a text aloud (reading–aloud task). The data were analyzed in order to (1) identify consonants whose pronunciation may have been affected by orthography, (2) verify whether the mispronunciation of these sounds could be attributed to transfer of the L1 sound–spelling correspondence, and (3) examine whether these sounds were pronounced differently in the free–speech and reading–aloud tasks. The results suggested a strong effect of L1/L2 grapho–phonic–phonological transfer while indicating that the pronunciation of some word–final consonants (//, //, //, // and //) was more commonly affected by orthography than others.

Apart from Silveira (2004, 2007) and Zimmer (2004, 2007), many other researchers have been looking into the role of orthographic knowledge in L2 speech production, underscoring its interacting effects with an array of variables which seem to influence on L2 learners’ performance when they speak or read and L2 (Chikamatsu,1996; Erdener & Burnham,2005; Escudero & Hayes–Harb, 2007; Detey & Nespoulous, 2008). As the purpose of this chapter is to provide an overview of the cognitive assumptions underlying the theoretical background that guides chapter 2 and 3 of this book, we are not going to provide an extensive review of all the significant papers on the effects of orthography in L2 speech production. However, the construct we call grapho–phonic–phonological transfer is a key one, and several sections of our units in chapter 3 try to deal with the orthographic influence on L2 pronunciation.

It is relevant to note that, even in L2 words for which final consonant cluster is allowed in Brazilian Portuguese, such as the [rs] sequence in the

Chapter 112

English word farce, an epenthetic vowel is likely to be produced owing to the activation of the L1 spelling system. In addition, it is likely that the learner faces an initial learning stage in which both types of interference are active in his/her interlanguage system.

In summary, this section aimed to show that deviant L2 phonetic productions can be derived from two different sources: L1–L2 phonetic–phonological transfer and L1–L2 grapho–phonic–phonological transfer. These types of transfer can take place in isolation or simultaneously, as shown in the examples presented in this section. This discussion sets the stage for considering the non–targetlike phonetic productions which seem to arise from the relationship between input perception and output production.

1.4 Input perception and output production

The link between L2 perception and production is indeed vital to understand the acquisition of L2 structures. Many researchers attribute difficulties in mastering target sounds of L2 segments to failures in speech perception. For example, Zimmer (2004) views segmental changes in the interdental fricatives mentioned above as a typical example of perceptual assimilation (Flege, 2002, 2003; Best et al., 2001; Kuhl, 2000). Flege (2003) claims that L2 speech production is strongly constrained by the learner’s perceptual accuracy. His Speech Learning Model is grounded on two basic premises: 1) L2 learners cannot separate their L1 and L2 phonetic subsystems, since the phonic elements in these subsystems interact in a common phonological space; 2) although the mechanisms responsible for speech production are intact throughout the learners’ life, “the formation of prototypical categories of L2 phones becomes less likely as age increases” (Flege, 2002, 11). According to this model, as the perception of L1 phones develops throughout childhood and adolescence, the assimilation of phonetic qualities of L2 consonants and vowels is more likely. If certain L2 phones continue to be identified as instances of L1 phonemes and allophones, the formation of new categories will be blocked. However, Flege makes it clear that such limitations in L2 categorical speech perception originate from the learners’ previous experience with their mother language, rather than as the result of a maturational program.

Best et al. (2001) , like Flege, place a great emphasis on the influence of the mother language on constraining the perceptual discrimination of nonnative phones. However, it is important to note that instead of focusing on the

Cognition and Second Language Acquisition 13

L2 learners’ perception, the Perceptual Assimilation Model (PAM) focuses on naïve monolinguals, which is an important difference–although mostly unnoticed in many papers–between the two models. Although nonnative and L2 speech perception have been generally treated as interchangeable terms, they make predictions about two radically different groups of participants: naïve listeners have had little or no contact with the L2 speech, while L2 learners tend to differ from them in terms of their experience with the L2 (Best & Tyler, 2007). PAM (Best et al., 2001) departs from the premise that the discrimination of nonnative speech sounds depends on how, or whether, these phones are perceptually “assimilated” by the L1 phonemes. The model (Best et al., 2001) predicts that accuracy in phonological discrimination can be influenced by the level of phonetic articulatory resemblance between L1 and nonnative phones. Hence, success in discriminating nonnative phones is related to the way a nonnative contrast is assimilated by L1 phonological categories. Thus, different degrees of discrimination are predicted on the basis of distinct types of perceptual assimilation of each phone in a contrasting nonnative pair. Two Category (TC) assimilation yields very good to excellent discrimination, and happens when two nonnative phones are discriminated as instances of two distinct L1 phonemes. Single Category (SC) assimilation is associated to poor discrimination, which takes place when the two nonnative phones are perceived as two equally good exemplars of the same native phoneme. Category Goodness (CG) assimilation, in turn, yields intermediate discrimination because both phones are judged to be good exemplars of one L1 phoneme, but not in the same proportion, so that one is perceived to be a better exemplar than the other. Uncategorized–Categorized assimilation takes place when a naïve listener fails to categorize one of the phones in the nonnative pair, while perceiving the other phone as an L1 phoneme. While the model predicts very good discrimination for this pattern of assimilation, it does not do so for the Uncategorized–Uncategorized pattern, in which both nonnative phones fail to be recognized as good exemplars of a phoneme (Best & Tyler, 2007). It is important to underline that although both SLM and PAM underline the role of experience with the L1 as shaping the perception of L2 and nonnative phones, their predictions are not restricted to L1 phonological contrasts; rather, both have considered at length, and in numerous publications, the importance of non–contrastive phonetic similarities and dissimilarities between L1 and nonnative/L2 phones, including notions of phonetic goodness of fit, and the relationship between phonetic details and phonological categories and contrasts (Best and Tyler, 2007, 22).

As in this book we depart from an emergentist view of second language

Chapter 114

acquisition, such models are in consonance with our view on the interaction between phonetic–phonological and grapho–phonic–phonological transfer during speech production.

Following the same line of thought as Best et al. (2001), and Flege (2002, 2003) on the role of the mother language, the Perceptual Magnet Theory (Kuhl and Iverson, 1995; Iverson et al, 2003), sets out the hypothesis that the L1 phonetic prototypes act like magnets, distorting perception of items in their vicinity within phonological space so as to make them more similar to the L1 prototype. According to this model, the perceptual mapping that babies make of ambient language creates a net or a complex filter through which language is perceived. This perceptual attunement to L1 categories may later shape the discrimination of L2 phones. Transfer may arise from the inherent difficulty of separating functionally the mappings of L1 and L2 categories, and also because a neuronally–based commitment to categorical L1 mappings influences the processing of L2 phones (Flege, 2002, 2003). It is important to note that Kuhl and Iverson (1995) and Iverson and colleagues (2003) agree with Flege’s suggestion that the constraints on L2 perception emerge from the learner’s previous linguistic experience.

To this discussion, McClelland (2001, 106) adds that adults sometimes cannot perceive L2 phones because they have “years of experience carving up phonological space according to the phonological structure” of their native language. In the course of investigating the difficulty encountered by Japanese adult learners in pronouncing the English liquids /l/ and /r/, McClelland carried out a connectionist simulation based on the concept of Hebbian4 learning in which he not only replicated the empirical findings related to perception and production of the distinction between the English liquids, but also designed a treatment to overcome these problems through massive exposure of learners to an artificially synthesized input with exaggerated acoustic cues, as we will see in the following section.

In summary, the studies and findings of these four influential scholars in the field of SLA research—Flege, Best, Kuhl and McClelland—converge on the key role played by L1 categorical perception over L2 speech production. Therefore, misperception can exert considerable effects both on L2 speech and reading aloud, paving the way for phonetic–phonological as well as grapho–phonic–phonological L1–L2 transfer. Given the discussion above, one may well enquire how L2 teachers can endeavor to enhance their learners’ perception of the target forms. This necessary boost in perception, as we have mentioned, is believed to speed up the acquisition process. The section which follows addresses this question by raising the issue of explicit instruction.

Cognition and Second Language Acquisition 15

1.5 The role of explicit instruction

To start this discussion, it is crucial to define what we mean by “explicit instruction”, since this term is to be used throughout the book. Following Alves (2004) and Zimmer and Alves (2006), “explicit instruction” can be interpreted as an umbrella term encompassing more than just the task of formally systematizing the linguistic system itself. In this sense, within a communicative L2 environment, “explicit instruction” surpasses the teacher’s task of formalizing the L2 system, since it also includes the composite of other teaching procedures aiming to highlight, review or draw the students’ attention to specific aspects of the target language otherwise tending to remain unnoticed by learners.

In view of the objective of this book to discuss the teaching stages of presentation, practice and production of phonetic/phonological aspects, it is necessary now to mention how Celce–Murcia et al. (1996) propose a framework for pronunciation instruction that encompasses focus on form plus integration with the remaining components of the L2 syllabus. This framework includes five stages: (1) description and analysis (i.e., awareness raising), (2) listening discrimination, (3) controlled practice and feedback, (4) guided practice with feedback, and (5) communicative practice and feedback. Stages (1) and (2) provide learners with explicit information about specific phonological components, when and how these can occur, as well as examples of the targeted components. This is an occasion for learners to gain knowledge about when and how certain L2 phonological features are used, with the focus placed on the perception of the sounds. The subsequent stages (3), (4) and (5) focus on production, beginning in a very controlled and elementary way, which proceeds first from minimal pair practice to the production of contextualized, meaningful sentences. In these final stages, teacher feedback is very important to maximize the probability that two other types of learning can take place: tuning and restructuring of the target L2 phonological features. As Celce–Murcia et al. (1996) observe, the selection of the pronunciation components, as well as the communicative functions and the lexical items included in the pronunciation syllabus, should be in accordance with the learners’ proficiency level and interests, so that motivation is not hindered.

Although the initial stage of L2 description and analysis is imperative, little is to be achieved without progression to the other four stages mentioned above. Therefore, all the aforementioned pedagogical steps are topics for the explicit instruction of learners. It is now appropriate to discuss the role of explicit instruction in terms of input perception and processing from

Chapter 116

a connectionist perspective. The concept of Hebbian learning, explored by connectionist simulations, strongly supports the notion of biased reinforcement. According to McClelland (2001), Hebbian learning consists of synapse–modifying mechanisms which reinforce the pattern of neuronal activity that a given input has provoked or initiated. Studies of long–term potentiation (LTP) suggest that the stronger the activation triggered by a given input, the more powerful and lasting the reinforcement effect will be. Thus, there will be a greater probability that a subsequent and similar input produces the same activation pattern. If the activation is adequate and useful, the learning and maintenance of cognitive skills will occur; however, if the activation pattern is excessively biased by prior learning, Hebbian learning will tend to reinforce these imperfect prior trends, hindering progress in acquiring the desired new patterns.

McClelland (2001) claims that the difficulty in fully acquiring L2 phonology may result from inappropriate reinforcement of prior L1 activations. The author simulated the difficulty which Japanese adults learners of English have in acquiring the distinction between [l] and [r], in an attempt to understand why the perception and production of specific L2 speech sounds is one of the greatest challenges to be faced by L2 learners. As an answer to this question, McClelland (2001) suggests that this difficulty can be attributed to the role of the mother tongue in L2 learning, since the Japanese inventory does not contain the two consonants in question. This relative deficiency in inventory causes the Japanese learner to collapse both phonemes into one. McClelland claims that the reinforcement of prior L1 activations, which in this case would play a detrimental role, can account for the failure to distinguish between the two phonemes. Unable to perceive the difference between the two sounds, the learner instead activates the sound representation closest to the one found in his/her mother tongue, and thus inappropriately reinforces the entrenched patterns. How then could this learner ever hope to acquire the target forms?

It is likely that the L1 patterns will remain inappropriately reinforced unless the difference between the L1 and the L2 sounds is noticed5. Therefore, L2 noticing must be regarded as the first step towards reducing these undesirable reinforcements. As mentioned above, McClelland (2001) successfully demonstrates this claim through modeling a treatment in which networks were exposed to exaggerated stimuli aiming to help learners notice the L2 aspects more easily. He also carried out an empirical study consisting of a control group which received genuine L2 input and an experimental group which was exposed to synthesized L2 input, with modifications tending to exaggerate distinctive acoustic cues. The artificial input was

Cognition and Second Language Acquisition 17

gradually modified in each training section so as to make progressively more similar to the original input. In the end, members of the experimental group had learned in a faster and more efficient fashion than had the control group subjects, who showed hardly any progress in terms of perception and consequent production.

The evidence discussed by Alves (2004) and Silveira (2004) corroborates McClelland´s claims. It seems that, after the explicit instruction sessions, students started to notice the target forms; this led to the correct production of the target items. With regard to this observation, it is important to stress the need to explicate and systematize those L2 items which are not acoustically salient, and may therefore cause foreign accent in the L2.

The relevance of explicit instruction can also be supported, on cognitive grounds, by another model of learning and memory, one which focuses mainly on the brain architecture of complementary cognitive systems involved in language and learning. From this perspective, McClelland, McNaughton and O´Reilly (1995) formulated the Hipcort Model, which proposes that learning and memory arise from the interaction of two component memory systems. First, memories and new learning are processed by changes in synapses in the hippocampal system, which is specialized for rapid learning. Subsequently, the hippocampal changes are broadcast to the neo–cortex where the learned item is consolidated in long–term memory by a relatively slow and gradual process. Thus, new learning occurs initially in the hippocampus, ultimately resulting in the persistent formation in the neo–cortex of a memory trace of the event in a form which is amenable for subsequent explicit retrieval. The neo–cortex also participates in learning by working slowly to identify the structure present in ensembles of experiences; that is, it provides support for items that are already associated in pre–existing knowledge.

Learning occurs within the neo–cortex, although at a very much slower rate compared to the hippocampus, through the gradual accumulation of small changes in the strengths of connections in the cortex. Such learning can be described as implicit because the traces or engrammes are initially too weak to subserve explicit retrieval, although the small changes can nonetheless have an effect on memory performance. The process by which cortical learning takes place is thought to be similar to what happens in consolidation, in that it requires learning of multiple presentations of the item until sufficient strength has been built up.

In normal humans, the hippocampus guides cortical learning through a) the explicit reactivation of episodic memories, and b) the concomitant suppression of otherwise erroneous responses from the cortex. This allows the correct response to be evoked and the correct associations to

Chapter 118

be strengthened by Hebbian learning. But when the environment is not forthcoming with such support, the cortical system is left to fill in a response to a given cue; this internally–generated response tends to be the strongest association in its pre–existing contents. In such situations, the response elicited is likely to be the wrong one, and Hebbian learning would then work against effective learning by strengthening the erroneous association.

In the same fashion, phonetic–phonological transfer could be explained, in terms of complementary systems, as associative learning in the neocortex. When pre–existing knowledge (L1) is not in line with the associations (L2) being learned in the hippocampus, neo–cortical participation could lead to interference. In such cases, the hippocampal mechanisms may be insufficiently effective to overcome competing influences from the cortex, wherein the L1 knowledge is entrenched. With explicit instruction, new L2 items are bound together associatively in the cortex.

Explicit instruction can also contribute to enhance monitoring. As has already been noted, the hippocampus subserves explicit knowledge and rapid learning of new phonological information. Not only can such knowledge be reinstated into the neocortex, but it can also be used even before these new target forms become integrated into the neocortical system. Monitoring is imperative and can only be achieved through explicit instruction, since it allows learners to convert the knowledge they have stored in their hippocampal system into language production. This conversion process plays a relevant role in the noticing of the target forms in the input structures the student is exposed to, and, consequently, in the consolidation of L2 knowledge in the neocortex.

In summary, the studies carried out by McClelland and colleagues suggest that the difficulties inherent in learning L2 phonetic–phonological features can be explained both by the architecture of the complementary cognitive systems in the brain (based on the Hipcort model) and also by the notion of Hebbian Learning. The findings concerning these two models are able to explain the difficulties in acquiring the L2 phonetic–phonological forms, and also to support explicit instruction as a means to help overcome such difficulties. This section has thus discussed the benefits of explicit instruction in L2 learning. It has shown that explicit instruction can be useful not only concerning L2 production in monitoring environments (so that the target–forms can be produced in a short–term period), but can also contribute to a better noticing of such target structures, which may lead to the gradual consolidation of this knowledge in the neocortex in the long term, characterizing what is known as implicit learning.

We assert that the explication of details of L2 items can enhance input

Cognition and Second Language Acquisition 19

processing; that is, it can enable the L2 learner to notice such details, so as to decrease the transfer of L1 patterns into the L2. This notion of explicit instruction cannot be restricted to the formal description of the target forms. Rather, it encompasses all the communicative stages set out in Celce–Murcia et al. (1996), which we have already emphasized.