The challenge of L2 sandhi – vowel hiatus in Polish English1
Geoffrey Schwartz1, Grzegorz Aperliński1, Anna Balas1, Arkadiusz Rojczyk2
1 Faculty of English – Adam Mickiewicz University in Poznań
2 Institute of English – University of Silesia, Katowice-Sosnowiec
Abstract
The production and perception of L2 vowel hiatus, like other sandhi phenomena, is a largely
unexplored area in the field of L2 phonology. This paper presents a production and perception
study of vowel hiatus in the speech of Polish learners of English. Polish and English appear to
show systematic differences in the realization of hiatus. Textbook descriptions of English
refer to ‘glide insertion’ and ‘linking/intrusive [r]’ as the preferred hiatus fillers in traditional
vernacular dialects. By contrast, in Polish hiatus is typically realized with glottal stops,
glottalization, or increases in spectral tilt. These differences suggest that vowel glottalization
may be a form of L1 interference in Polish English. The production results indicate that the
suppression of glottalization is an integral part of L2 phonological acquisition for advanced
Polish learners. The listening tests revealed that the realization of vowel hiatus (modal vs.
glottalized) has an effect on L1 English listeners’ ratings of foreign accentedness, but L2
learners’ ratings of comprehensibility are largely unaffected by the glottalization of hiatus.
Issues of phonological representation underlying the study and its hypotheses are also
discussed.
1 Introduction
Experimental research into second language (L2) phonology has focused to a large degree on
segmental features and contrasts. Two of the more popular areas of study have been vowel
contrasts (e.g. Escudero and Boersma, 2004) and the implementation of laryngeal features
(e.g. Flege and Eefting, 1987). In both areas the somewhat elusive notion of cross-language
phonetic similarity has played a significant role in the formulation of predictions with regard
to success of acquisition. This type of research has become quite copious, as can be witnessed
in literature reviews found in volumes such as Hansen Edwards and Zampini (2008),
addressing acquisition from the point of view of production and perception, and exploring
didactic implications.
By contrast, the transfer and acquisition of external sandhi, phonological processes
that span word boundaries, have been for the most part neglected in L2 speech research.
Beyond the relatively well-known areas of liaison and enchaînement in L2 French (e.g.
Howard, 2008; Shoemaker, 2010; Sturm, 2013), the acquisition of boundary effects in L2
speech remains largely unexplored. Zsiga (2011) notes this gap, and offers a review of
previous research in which only a few studies dealing with a small number of L1-L2 pairings
are discussed (Catalan-English: Cebrian, 2000; English-Russian: Zsiga, 2003; Spanish-German:
Lleo and Vogel, 2004; Korean-English: Zsiga, 2011).
One such sandhi phenomenon is the resolution of vowel hiatus at word boundaries.
This paper examines the acquisition of V#V sequences by Polish learners of English. Polish
and English appear to show systematic differences in the realization of hiatus. Textbook
descriptions of English refer to ‘glide insertion’ and ‘linking/intrusive [r]’ as the preferred
hiatus fillers in traditional vernacular dialects. By contrast, in Polish hiatus is typically
1 This research was supported by a grant from the Polish National Science Centre (Narodowe Centrum Nauki).
Project number 2012/05/B/HS2/04036
realized with glottal stops (Dukiewicz and Sawicka, 1995), glottalization, or increases in
spectral tilt (Schwartz, 2013a). Thus, acquisition of L2 English hiatus by Polish learners
entails the suppression of glottalization. This seemingly straightforward cross-linguistic
difference is complicated by the fact that glottalization of vowel hiatus also occurs in English,
where it functions primarily as a marker of higher-level prosodic constituents (e.g. Dilley et
al., 1996). An additional complication may be observed in recent studies (e.g. Britain and
Fox, 2008; Davidson and Erker, to appear), which suggest that glottalization in English is
increasing in frequency in linguistically diverse urban areas. Nevertheless, since English
pronunciation instruction in Poland and many other countries is based on traditional
vernacular varieties, we shall consider non-glottalized hiatus, produced with modal phonation,
as the target for acquisition.
Our study presents production data from 21 Polish students of English performing
both reading and imitation tasks. The reading task is intended to gauge the level of acquisition
of non-glottalized hiatus at the time of the experiment, while the imitation task seeks to elicit
modally produced hiatus by means of phonetic convergence with a model voice (Babel, 2012;
Honorof et al., 2011; Pardo et al., 2012). In addition, we present data from listening tests, in
which L1 English speakers rate modal vs. glottalized tokens on a scale of foreign
accentedness, while non-native speakers assess the comprehensibility of tokens containing
glottalized and non-glottalized hiatus. The production results suggest that suppressing
glottalization is indeed an integral part of L2 phonological acquisition for advanced Polish
learners, and that the production of modally voiced vowels is facilitated both by the imitation
task and by the lexical frequency of the vowel-initial word. The listening tests revealed that
glottal vs. modal productions in vowel hiatus have greater impact on L1 listeners’ ratings of
accentedness than on L2 learners’ ratings of comprehensibility.
In introducing our experiments and discussing the results, we focus on issues of
phonological representation. In particular, we hypothesize that initial vowels in the two
languages are characterized by a representational difference; in Polish they are more
prominent prosodic entities that show a greater tendency than the corresponding segments in
English to be realized with glottalization. The representational hypothesis is formulated in the
Onset Prominence framework (OP; Schwartz, 2013b), which incorporates the phonetic
ambiguity of the initial portion of vocalic segments in CV sequences. This portion of the
signal, represented as the Vocalic Onset (VO) node of structure, is phonetically vocalic, yet
typically bears cues to the identity of the preceding consonant. We suggest that Polish and
English differ with respect to whether this node of structure is incorporated into the
representation of initial vowels. VO specification is posited for Polish vowels, and is reflected
in the tendency for glottalization and the formation of prosodic boundaries before word-initial
vowels. In English this node is absent, and initial vowels typically acquire an onset via
resyllabification or the ‘insertion’ of linking consonants.
The rest of this paper will proceed as follows. We start by providing background on
the phonetics and phonology of vowel hiatus in English and Polish. This is followed by a
description of the production experiment. Then we present a description of the listening tests
that were carried out. This is followed by general discussion on the wider implications of our
study for the field of L2 phonology. Finally, we conclude with a brief clarification of the
phonological approach underlying our hypotheses and the interpretation of the results.
2 Vowel hiatus in English and Polish
This section will present background on vowel hiatus in English and Polish. We start by
considering previous descriptions. This is followed by a presentation of conceptual issues
underlying the phonology of hiatus. Finally, we introduce the Onset Prominence framework,
and derive the representational hypothesis underlying the experimental studies.
2.1 Previous descriptions of hiatus in the two languages
Phonological descriptions of English hiatus resolution (e.g. McCarthy, 1993) typically refer to
a categorical process of glide insertion (see Ed [sijed], new image [nuwɪmədʒ]; McCarthy,
1993), as well as linking or intrusive /r/ in non-rhotic dialects (Uffmann, 2007). By contrast,
recent phonetic studies cast doubt on the assumed categorical nature of the process. Davidson
and Erker (to appear) present evidence against ‘glide insertion’ by English speakers in New
York City. Glottalization of hiatus is typically observed when the word-initial vowel is
stressed. In other cases, they found that VV and V#V sequences differ systematically from
VjV and V#jV sequences. We are left to conclude that traditional descriptions of ‘glide
insertion’ reflect the segmental transcriptions afforded to earlier authors’ percepts of non-
glottalized vowel hiatus. In other words, non-glottalized hiatus may have been misheard as
containing glides.
The descriptions of hiatus found in works with more of an applied or pedagogical
orientation resist classifying ‘glide insertion’ as a categorical phenomenon. Cruttenden (2001)
notes that in vocalic sequences at word boundaries, slight linking glides might be audible, but
they are not as distinct as phonemic glides. He also mentions that linking glides in English
might be substituted with a glottal stop, especially before a vowel in an accented syllable.
Cruttenden also states that the use of glottal stops in English is not as common as in some
other languages, e.g. German, and is limited to emphatic utterances. Lecumberri and
Maidment (2000: 64) see utterance-initial glottal stop insertion in emphatic speech as a
universal phenomenon. Wells (2008: 345) claims that glottalization is optionally used to add
emphasis to syllable-initial vowels or to avoid hiatus in neighboring syllables (V#V).
According to Wells, British English speakers might use a glottal stop to avoid r-liaison.
Roach (2009: 117), in weighing the relative importance of various elements of
English pronunciation, suggests that sandhi processes should be a priority: ‘It would not be
practical or useful to teach all learners of English to produce assimilations; practice in making
elisions is more useful, and it is clearly valuable to do exercises related to rhythm and linking’
(emphasis ours). However, in the chapter on linking he does not address V#V sequences.
Instead, he focuses on linking and intrusive /r/ and the effect of a word boundary position on
the realization of consonantal allophones in pairs such as keep sticking and keeps ticking.
Cook (1991: 66) claims that in English two vowels across a word boundary are connected
with a slight glide, but warns against exaggerating it. Hewings and Goldstein (1999: 91) also
report that speakers join vowels across word boundaries with a ‘very short’ glide to ‘link the
vowels together to make the flow of speech smoother and to avoid a gap between the words’.
On the other hand, a fairly widely used pronunciation course by Lujan (2006) devotes little
attention to C#V linking, without even mentioning hiatus.
As might be expected, there has been significantly less written about vowel hiatus in
Polish. In a phonological study, Rubach (2000) suggests that glide insertion is the norm for
word-internal hiatus, though he does not consider word boundaries, and no phonetic data is
presented beyond impressionistic transcriptions. Other descriptions (Dukiewicz and Sawicka,
1995; Gussmann, 2007) suggest that glottal stop insertion, though not obligatory, is the
preferred way of resolving hiatus. This may be observed both at word boundaries and at
word-internal morpheme boundaries. This suggestion is supported by data from Schwartz
(2013a).2 Data from 17 speakers showed a range of realizations from full glottal stops, to
2 Malisz et al. (2013) looked at initial vowel glottalization in Polish and German, and found a lower rate of
glottalization in Polish than in German. They did not draw a comparison with English, though comparing their
glottalization realized as non-modal phonation, to drops in pitch and amplitude (cf.
Hillenbrand and Houde 1996). The data showed no signs of anything that might be interpreted
as a ‘glide’. In the most rapidly produced tokens without visible glottalization, the first vowel
in V#V sequences was often elided, while the second vowel was produced with increased
spectral tilt (cf. Sluijter and van Heuven 1996; Crosswhite, 2003).
In sum, the literature on vowel hiatus in English and Polish suggests a slightly more
nuanced reality than the categorical claims of glide and glottal stop insertion found in some
phonological studies. Nevertheless, it appears that one systematic generalization may be
made. Modal hiatus appears to be the norm in English, which in earlier descriptions led to the
perception of an ‘inserted glide’. However, glottalization may also be found, typically before
stressed initial vowels. In Polish, glottalization dominates, and modal hiatus still entails the
marking of the initial vowel by means of raised spectral tilt. Thus it appears as if initial
vowels in Polish are more prominent prosodic entities than initial vowels in English. In what
follows, we offer a phonological interpretation of the hiatus phenomena.
2.2 Hiatus resolution – phonological interpretations
Phonological descriptions typically refer to the ‘resolution’ of vowel hiatus. This term
suggests a marked or non-optimal status for V#V sequences, in which the second vowel
violates ONSET, a prosodic constraint against vowel-initial syllables. The most frequent repair
strategies across languages include the coalescence of the vowels, or the appearance of a
consonant between them. Common hiatus fillers include glottal stops and [h]. The glides [j]
and [w] are also common when the first vowel is front or round. In essence, then, the
appearance of consonants in hiatus positions is generally taken to mean the ‘insertion’ of a
consonantal onset, be it a glottal stop, a glottal fricative, a glide, or a linking/intrusive rhotic.
In the parlance of Optimality Theory (Prince & Smolensky, 1993), insertions constitute a
violation of the faithfulness constraint DEP, which punishes output segments that are absent
from input representations.
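The constraint interaction described above can be illustrated with a toy tableau. The following is only a minimal sketch under stated assumptions: the candidate set for see Ed, the segment inventory in VOWELS, and the helper names (Candidate, evaluate) are hypothetical and chosen for illustration; MAX is included as the standard constraint against deletion. Strict domination is modelled by comparing violation profiles lexicographically in ranking order.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical vowel inventory, sufficient for the toy candidates below.
VOWELS = set("aeiouɛɪə")


@dataclass(frozen=True)
class Candidate:
    syllables: Tuple[str, ...]  # output syllables, e.g. ("si", "ʔɛd")
    inserted: int               # output segments absent from the input
    deleted: int                # input segments absent from the output


def onset(c: Candidate) -> int:
    """ONSET: one violation per vowel-initial syllable."""
    return sum(1 for syl in c.syllables if syl[0] in VOWELS)


def dep(c: Candidate) -> int:
    """DEP: one violation per inserted segment."""
    return c.inserted


def max_(c: Candidate) -> int:
    """MAX: one violation per deleted segment."""
    return c.deleted


CONSTRAINTS = {"ONSET": onset, "DEP": dep, "MAX": max_}


def evaluate(candidates: Sequence[Candidate], ranking: List[str]) -> Candidate:
    """Return the candidate whose violation profile is best under the ranking."""
    def profile(c: Candidate) -> Tuple[int, ...]:
        return tuple(CONSTRAINTS[name](c) for name in ranking)
    return min(candidates, key=profile)


# Hypothetical candidates for the input /si # ɛd/ (see Ed).
CANDIDATES = [
    Candidate(("si", "ɛd"), inserted=0, deleted=0),   # faithful hiatus
    Candidate(("si", "ʔɛd"), inserted=1, deleted=0),  # glottal stop filler
    Candidate(("sɛd",), inserted=0, deleted=1),       # vowel deletion
]

filler_wins = evaluate(CANDIDATES, ["ONSET", "MAX", "DEP"])
hiatus_wins = evaluate(CANDIDATES, ["DEP", "MAX", "ONSET"])
```

With ONSET ranked above DEP, the candidate with a hiatus filler is selected; with DEP ranked highest, the faithful hiatus candidate wins. The sketch says nothing about which ranking holds for English or Polish.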
From a logical standpoint, there is an alternative view on the origins of hiatus filling
consonants. Instead of the insertion of an entity that is absent from the underlying
representation, hiatus consonants may be interpreted as the realization of an entity that is
present in the underlying representation. That is, one may envision hiatus resolution as the
satisfaction of MAX, a faithfulness constraint that punishes the deletion of input features. In
the case of linking [r] in non-rhotic dialects of English (far – far away), this interpretation is
suggested by the presence of the letter r in the orthography. By contrast, other hiatus fillers in
English are generally accepted to be absent from the underlying input representations; their
appearance is assumed to be the result of insertion processes.
The same interpretation is commonly applied to the appearance of glottal stops or
glottal marking in hiatus sequences. For example, the well-known process of harter Einsatz in
German, by which morpheme-initial vowels are realized with a glottal attack, is often referred
to as ‘glottal stop insertion’ (e.g. Wiese, 2000). We suggest that in languages in which glottal
marking is relatively common, including German and Polish, its appearance might better be
characterized as the preservation of an underlying element of the phonological representation.
In what follows, we will briefly introduce how this hypothesis is derived within the Onset
Prominence representational environment.
2.3 The OP environment and the representation of initial vowels
results with studies of English (e.g. Dilley et al., 1996) suggests that glottalization is more common in Polish.
Also, they did not specifically address the hiatus context, and they focused on glottalization as a marker of
higher-level prosodic structure in spontaneous speech and prepared (read) speeches.
In the OP environment, both segmental representations and prosodic constituents such as
syllables are constructed from the hierarchy in (1), which is derived from the phonetic events
observed in a stop-vowel sequence in initial position.
(1) The CV building block of the Onset Prominence representational hierarchy
Manner of articulation is encoded on the basis of the layers of structure contained by a given
segmental tree. This is shown in (2), which provides structures for a labial stop, nasal,
fricative, approximant, and vowel. The binary nodes are active elements in the individual
representations, while the unary nodes serve as place holders to indicate the relative
hierarchical position occupied by a given segmental structure. The segmental symbols may be
interpreted as shorthand for place and laryngeal specifications.
(2) Manner distinctions in the OP environment
In the hierarchy in (1), the segmental affiliation of the VO node is ambiguous. This single
layer of structure may be claimed by multiple segment types. On its own, VO represents the
class of approximants and glides as in (2). In (2), we also see that VO may be active in the
representation of obstruents and nasals, representing formant transitions that cue consonant
place of articulation (e.g. Wright, 2004). At the same time, the VO node is derived from a
portion of the signal that is phonetically a vowel. As a result, we might expect to find it built
into the representation of vowels as well as consonants. In (3) we see two types of vowel
representation, with or without an active VO node.
(3) Vowels with or without VO specification
The presence or absence of VO in vowel representations offers a useful perspective on the
prosodically ambiguous behavior of onsetless syllables across languages (see Schwartz
2013a). Briefly stated, VO-specification allows initial vowels to satisfy prosodic constraints
requiring onset consonants. This may be manifested as an apparent ‘empty consonant’ (e.g.
Marlett & Stemberger, 1983), or simply as prosodic well-formedness for processes such as
stress assignment or reduplication (cf. Downing, 1998 for discussion of prosodically ill-formed
onsetless syllables).
With respect to the representation of initial vowels in Polish and English, we propose
VO specification for the former, but not the latter. That is, Polish initial vowels may be
thought of as containing a built-in consonant. One area in which this claim is manifest is in
the greater tendency for glottalization of initial vowels in Polish (Schwartz, 2013a). In
English, glottalization is associated primarily with higher-level prosodic constituents (e.g.
Dilley et al., 1996). Moreover, Polish word-initial syllables have been claimed to bear
secondary stress (Dogil, 1999),3 which in the case of initial vowels suggests a prominent
prosodic status as represented by the VO node in OP structures.
The OP perspective on vowel hiatus is revealed when we look at the mechanisms by
which segmental structures combine into prosodic constituents. Consider (4). On the left we
see individual segmental structures for English cry, while on the right we see the word as a
single prosodic constituent. The fundamental process at work here is called absorption, by
which lower-level vowel and sonorant structures are fused with the higher level obstruents
that precede them. In cry, the vowel is absorbed into the rhotic, which is in turn merged with
the structure of the initial stop.
The absorption process shown in (4) is prosodically motivated. Without it, individual
segments such as /r/ or /k/ in cry would be forced to stand alone as prosodic constituents.
Absorption thus implies the presence of a minimality constraint for well-formed syllables.
Such a constraint combines two commonly invoked restrictions on syllable structure, both of
which are satisfied by means of absorption. The requirement for onset consonants is satisfied
by absorbing /aɪ/ into the structure of the rhotic, while a constraint against syllabic obstruents
is alleviated when the rhotic-vowel sequence merges with the /k/.
(4) Segmental and constituent (right) structures for English cry
When two OP vowel structures are adjacent, absorption may not take place since the
two trees are at the same level of the hierarchy. In such cases, a different mechanism,
submersion, may be motivated to ensure prosodic well-formedness. This is shown in (5). The
two trees on the left show a /ua/ sequence. The second vowel may not be absorbed, since the
structure to its left is not higher in the hierarchy. Instead, it is submerged under the first tree,
3 Newlin-Łukowicz (2012) found minimal acoustic support for Dogil’s claim. However, she did not measure
spectral balance, an established cue to prominence in many languages (Sluijter & van Heuven 1996, Plag et al.
2011) including Polish (Crosswhite, 2003).
producing the single structure /ua/ that we see in the third tree from the left. Such a sequence
may be perceived as [wa], yet its labial element is distinct from the lexical glide /w/ (the
rightmost tree), in which it is housed at the VO level. This representational distinction
captures the phonetic differences between underlying and hiatus glides observed by
Cruttenden (2001), and Davidson and Erker (to appear), as well as phonological differences in
other languages (cf. Levi, 2008).
(5) Submersion of the second vowel in a VV sequence
In (5), we see that submersion prevents the formation of a boundary at vowel hiatus.
Submersion is also the process that we claim as the origin of vowel quantity and weight
distinctions, which are a generally accepted feature of English, but are unattested in Polish.
When the second vowel in V#V sequences is specified with the VO, as we claim for Polish,
submersion is not motivated since the VO-specified vowel is already a well-formed prosodic
constituent. As such, it may be strengthened by means of promotion (Schwartz, 2013b), a
fortition process by which VO-specified vowels and approximants are raised to the highest
level of the OP hierarchy normally occupied by stop closure. Promoted vowels show a greater
tendency to be realized with glottalization. When glottalization is not produced, their
prominence may be preserved by increases in spectral tilt (Schwartz, 2013a).
Beyond the mechanism shown in (5), submersion is a process with far-reaching
prosodic implications, offering additional insight into the behavior of consonants in both VC
and VCV contexts, with deeper predictions for the form and behavior of larger prosodic
constituents. We will return to submersion in the General Discussion section, considering its
implications for the study of L2 speech.
3 Experiment 1 – hiatus production in reading and shadowing tasks
The phonological considerations discussed above lead to a hypothesis that L1 Polish
interference may result in a tendency for glottalization of vowel hiatus in the speech of Polish
learners of English. To test this hypothesis we conducted an experiment in which we elicited
data from learners producing English V#V sequences. We compared learners from two groups
and in two tasks, reading and imitation. We hypothesize that glottalization will be more
prevalent in the speech of less advanced learners, and in the reading task. If these hypotheses
are supported, it may be taken as evidence in favor of representations for English that are
conducive to linking of V#V sequences, and Polish representations that favor glottalization as
a hiatus filler.
The reading task may be assumed to provide information about the state of acquisition
at the time of recording, while the imitation task offers some perspective into learners’
potential for acquisition. In recent years, phonetics research has investigated the performance
of L1 speakers in shadowing tasks. The features that have been found to undergo phonetic
convergence include fundamental frequency (Goldinger, 1997), VOT (Shockley et al., 2004),
and vowel quality (Pardo et al., 2012). This research has investigated the role of episodic
memory (Goldinger, 1998) with the goal of testing exemplar-based models of speech
perception (e.g. Johnson, 1997). Findings of phonetic convergence with model productions in
shadowing tasks may be said to support the hypotheses of exemplar-based perception, insofar
as such productions reflect how they were perceived.
As yet, experiments employing the imitation paradigm have been largely limited to L1
speech. However, it is not difficult to imagine the benefits of using imitation studies in L2
phonological research. Repetition, of course, has always had a prominent position in foreign
language pronunciation instruction. This fact suggests a parallel with L1 phonetic
accommodation. At the same time, one might expect L1-based perceptual constraints (Best,
1995; Best and Tyler, 2007) to hinder L2 speakers’ convergence with a native stimulus. The
question that remains is to what extent these constraints may be overridden in shadowing
tasks. In a series of experiments investigating problematic areas for Polish learners of English,
Rojczyk (2012, 2013) and Rojczyk et al. (2013) found evidence of phonetic convergence in
VOT, formant frequencies of the vowel /æ/, and the suppression of coda stop release. In the
current study, we seek to build on this research and investigate the extent to which shadowing
facilitates Polish learners’ suppression of vowel glottalization in hiatus sequences.
Participants
Twenty-one Polish learners of English participated in the experiment. All were majors in
English studies at either the University of Silesia (UŚ) in Sosnowiec or Adam Mickiewicz
University in Poznań (UAM). The 21 students comprised two groups. The first group (N=12)
included first year students, while the second group (N=9) was made up of students in higher
years. None had spent more than three months in an English-speaking country.
The division into the two groups is motivated as follows. In the first year of English
studies in Polish universities, students receive intensive pronunciation instruction (2*90
minutes/week). In the second year, this instruction is reduced to one meeting a week, while
students in the third year and above no longer receive explicit instruction in pronunciation.
The first year group had completed one semester of instruction at the time of the recordings.
The higher years’ group had completed a minimum of three semesters. The first two
semesters of instruction concentrate on segmental features. Word boundary effects receive a
small amount of attention, but only in the second year. Thus, it may be assumed that the first
year group was unfamiliar with sandhi processes associated with English V#V sequences,
while the higher years’ group had some awareness of these phenomena.
Materials
Data were elicited from student productions of an English sentence list containing 26 V(r)#V
sequences.4 The list of target sequences may be found in Appendix 1. The stimuli also
contained fillers, as well as tokens used in a separate experiment. The recordings were made
in soundproof recording studios in the English departments at each of the two universities.
The studios were equipped with a computer monitor upon which the stimulus data were
presented. In the first task, subjects were instructed to read the sentence presented on a slide
in a PowerPoint presentation. In the second task, the presentation of the slide was
accompanied by an audio recording of one of five native speakers of English (2 US English
speakers, 3 British English speakers) producing the stimulus. Students were instructed to
repeat after the model voice, which in each case produced the V#V sequence without visible
glottalization. The imitation task was performed immediately after the reading task. The
4 Three of the tokens contained Vr#V sequences in which ‘linking’ /r/ may be expected in non-rhotic dialects.
One target token contained the context for ‘intrusive’ /r/.
students received course credit for their participation. A total of 1092 tokens were obtained
(26 items * 2 tasks * 21 speakers).
Analysis
The V#V tokens were inspected with the help of the Praat program (Boersma & Weenink,
2011) with the aim of establishing the likelihood of modal and glottalized realizations of the
target sequences. Despite variation in the realization of glottal events (cf. Redi & Shattuck-
Hufnagel, 2001), preliminary analysis allowed us to establish three categories for vowel
realization. These categories are illustrated in Figures 1-3. Modally voiced hiatus is
exemplified in Figure 1, showing the V#V sequence in saw everything.
Figure 2 provides an example of a glottalized token in the sequence stay out. The selected
portion of the display reveals a creaky voice quality that is visible as a period of irregularity in
the periodicity of the vowel wave.
In Figure 3, we see a full glottal stop in the production of English see all. The selected portion
of the display is characterized by silence associated with full glottal closure.
Hillenbrand & Houde (1996) revealed that local drops in pitch and amplitude of modally
produced VV tokens may be perceived as glottal stops. Our tokens included a small
percentage of such cases. However, because of difficulties involved in the quantification of
such drops in amplitude and pitch, such tokens were coded simply as modal. Thus, a small
number of the tokens categorized as modal may have been realized with a feature associated
with the perception of glottalization.
The annotation was performed by three out of the four authors of this paper. Of the 21
speakers, 19 were analyzed by a single annotator. The other two speakers were analyzed by
all three annotators to calibrate the analysis and check for consistency. With regard to the
coding of the initial vowels, the calibration analyses revealed a high level of agreement in the
case of tokens coded as modal (unanimous agreement in 90.5% of cases; Fleiss’ Kappa = 0.78
for two categories, modal vs. non-modal). When all three categories were included, some
discrepancies arose in the choice between tokens coded as glottalized and those containing a
full glottal stop (complete agreement in 61% of cases; Fleiss’ Kappa = 0.54 for the three
categories). Tokens for which two out of three annotators agreed were assigned to the
category selected by the majority. In the rare cases in which the three annotators assigned
tokens to three different categories, the first author made the final decision.
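For readers unfamiliar with the agreement statistic, the calibration procedure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the annotation codes in TOY_CODES are invented, the function names are hypothetical, and the tie_breaker argument merely stands in for the final decision taken by a single author when all three annotators disagree.

```python
from collections import Counter
from typing import Callable, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[str]]) -> float:
    """Fleiss' kappa for items each coded by the same number of raters."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})
    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # Chance agreement, from the overall proportion of each category.
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_items * n_raters)) ** 2
        for cat in categories
    )
    if p_e == 1.0:  # only one category was ever used
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


def collapse(label: str) -> str:
    """Reduce the three codes to the binary modal vs. non-modal contrast."""
    return "modal" if label == "modal" else "non-modal"


def majority_label(labels: Sequence[str],
                   tie_breaker: Callable[[Sequence[str]], str]) -> str:
    """Majority category, or the tie-breaker's choice if no majority exists."""
    label, count = Counter(labels).most_common(1)[0]
    if count > len(labels) / 2:
        return label
    return tie_breaker(labels)


# Invented codes for four tokens, three annotators each.
TOY_CODES = [
    ("modal", "modal", "modal"),
    ("stop", "stop", "stop"),
    ("modal", "modal", "glottalized"),
    ("stop", "stop", "glottalized"),
]

kappa_three = fleiss_kappa(TOY_CODES)
kappa_two = fleiss_kappa([tuple(collapse(l) for l in item) for item in TOY_CODES])
```

On the invented data, agreement is higher for the binary coding than for the three-way coding, which mirrors the direction of the difference reported above but not its values.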
Results
The first results we will present are focused on the reading task, with the aim of characterizing
the participants’ level of acquisition of modal-voiced hiatus at the time of the experiment. The
vowel realizations for the reading task for both learner groups are summarized in Figure 4.
The 1st year group produced modal hiatus in 35.6% of the tokens, compared to 47% for the
higher years’ group. The 1st year group also produced a noticeably higher percentage of full
glottal stops (34.6% vs 22.2%). Two binary logistic regression analyses were carried out on
the reading data (for the full tables of all the regression analyses, see Appendix 2). The first
analysis examined modal tokens (yes or no) as the dependent variable while the second
analysis looked at full glottal stops (yes or no). In both cases, learner group, lexical frequency,5
and sex were posited as covariates. The results revealed that learner group (higher years; B=.440,
p=.016) and lexical frequency (B=.675; p<.001) were significant predictors of modal
voice in hiatus tokens. The second analysis, with glottal stops as the dependent variable,
revealed that the higher years’ group was less likely to produce full glottal stops (B= -1.128,
p=.001). Significant effects were also found for female speakers (B=1.059, p=.03) and higher
frequency words (B= -.653, p=.001).
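As an aid to interpreting the coefficients reported for the reading task, the sketch below converts them to odds ratios. It assumes that B denotes the unstandardized log-odds coefficient of a binary logistic regression, which is the usual convention but is not stated explicitly in the text; the dictionary and function names are hypothetical.

```python
import math
from typing import Dict

# Coefficients as reported for the reading task.
READING_MODAL: Dict[str, float] = {
    "learner group (higher years)": 0.440,
    "lexical frequency": 0.675,
}
READING_GLOTTAL_STOP: Dict[str, float] = {
    "learner group (higher years)": -1.128,
    "sex (female)": 1.059,
    "lexical frequency": -0.653,
}


def odds_ratio(b: float) -> float:
    """Odds ratio implied by a log-odds coefficient (assumption: B is log-odds)."""
    return math.exp(b)


def as_odds_ratios(coefficients: Dict[str, float]) -> Dict[str, float]:
    """Map each predictor to the multiplicative change in odds it implies."""
    return {name: odds_ratio(b) for name, b in coefficients.items()}


modal_odds = as_odds_ratios(READING_MODAL)
stop_odds = as_odds_ratios(READING_GLOTTAL_STOP)
```

An odds ratio above 1 indicates that the predictor raises the odds of the outcome and a value below 1 that it lowers them, so on this reading membership in the higher years' group raises the odds of a modal token and lowers the odds of a full glottal stop.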
5 Token frequency was coded on a three-point scale according to the Francis-Kucera corpus. Values below 120
were categorized as ‘least frequent’, values from 121 to 240 as ‘more frequent’, and values above 240 as ‘most frequent’.
Figure 4 – Reading task results for the two learner groups
A graphic summary of the results for the imitation task across learner groups is given in
Figure 5. Again, two binary logistic regression analyses were carried out with modal voice
and glottal stops as dependent variables, and the same covariates as in the previous analyses.
The advanced group was again more likely to produce modal tokens (76.9% vs. 68.9%; B=
.482, p= .02). However, the constant was not significant (B= .369, p=.149), suggesting that the
imitation task attenuated the differences between learner groups with respect to the likelihood
of modal realizations. With regard to glottal stops, all of the covariate effects (frequency,
learner group, sex) were significant. The higher years’ group was less likely to produce glottal
stops in the imitations (B= -1.128, p=.001). Glottal stops were also less likely in higher
frequency words (B= -.653, p=.001), and more likely for female speakers (B= 1.059, p=.03).
Figure 5 – Imitation task results for the two learner groups
We now turn to a presentation of the results in which we compare the effects of the
independent variables only with regard to the percentage of modal voiced tokens, which are
considered as the target realizations of the hiatus sequences. In other words, the following
tables may be said to assess the likelihood of native-like realization of the word-initial vowel.
Overall 56.4% of the tokens across learner groups and tasks were realized with modal voice.
[Figures 4 and 5: bar charts showing the percentage of tokens realized as G-Stop, Glottalized, and Modal for the 1st year and higher years groups.]
Figure 6 presents the percentage of modal tokens for the two tasks across both learner groups.
In the reading task 41% of the hiatus tokens were produced with modal voice, while this rate
increased to 73% in the imitation task.
Figure 6 – Percentage of modal voiced tokens across both learner groups
Figure 7 separates this result in terms of learner group. We can see a slight tendency for
higher rates of modal realization for the advanced group across tasks. The difference between
the two groups was greater in the reading task (47% vs. 35.6%) than in the imitations (77%
vs. 68.9%).
Figure 7 – Percentage of modal tokens by task and learner group
Figure 8 shows the percentage of modal tokens as a function of token frequency across learner
groups and task. We can observe a pattern by which more frequent vowel-initial words are
more likely to be realized with modal voice.
[Figure 6: bar chart of %Modal Voice by task (Reading 41%, Repetition 73%). Figure 7: bar chart of the percentage of modal tokens by task (read, imitate) for 1st years and higher years.]
Figure 8 – Percentage of modal tokens as a function of lexical frequency of vowel-initial word
With task included as an additional covariate, a binary logistic regression analysis with %Modal as
dependent variable revealed that task (imitation; B = 1.462, p < .001), lexical frequency (B =
.539, p < .001), and learner group (Advanced; B = .462, p = .001) all were significant
predictors of modal productions, while sex was not (Female; B= -.231, p=.151).
Finally, Figure 9 presents the number of modal tokens (maximum 26) for each
individual speaker in the reading and imitation tasks. As might be expected, in each case the
imitation task elicited more modal productions.
Figure 9 – Individual results for modally produced hiatus tokens
Discussion
The results of the production study may be summarized as follows. The likelihood of target-
like modal realization of V#V sequences was greater for the higher years’ group, which
suggests that the acquisition of modal hiatus is facilitated to some extent by the pronunciation
instruction these learners had received at their universities. We attribute vowel glottalization
to L1 phonological interference. Modal hiatus was also facilitated by the lexical frequency of
the vowel-initial word, as well as the imitation task, and to a small degree the sex of the
speaker. These additional results, which may be claimed to be attributable to non-
phonological factors, present a fundamental theoretical challenge: to tease apart the
phonological and non-phonological aspects of hiatus glottalization. We shall discuss each of
these factors in turn.
The lexical frequency effect may be conceptualized in terms of Lindblom’s (1990)
H&H theory, according to which speaker behavior is governed by the need to minimize
articulatory effort, while at the same time producing sufficient discriminability. Glottalization
is perceptually salient and increases discriminability, yet entails increased effort of additional
laryngeal gestures. According to Lindblom, higher frequency words will be recognized even
with reduced discriminability, so these words are still recoverable with modal hiatus that is
produced with lesser articulatory effort.
The next result we take up is the effect of the imitation task. While shadowing elicited
a much higher rate of target-like modally voiced hiatus realizations, the task attenuated to
some extent the difference in performance between the two learner groups. This result could
perhaps be interpreted as an indication of greater effects of repetition in early stages of
acquisition. That is, the 1st year group showed greater improvement in their imitations. In
accordance with previous studies involving shadowing, it is possible that phonologically-induced
perceptual constraints are overridden by the task. The greater impact of imitation
for 1st year students may also find a parallel in novelty effects found for early learners (e.g.
Chang, 2012). A related study on C#V sequences in Polish English (Schwartz et al., 2013)
found that imitation had a larger impact on first year learners’ productions. The implications
of these findings, however, are not entirely clear. It could be that the first year group showed
greater improvement simply because they had more ‘room’ to improve.
The final variable that we examined was the sex of the speaker. In American English,
studies have shown that females have a greater tendency to use creaky phonation for non-
linguistic purposes (Yuasa 2010). In our study, females were somewhat less likely to produce
modal-voiced hiatus, and were more likely to produce full glottal stops. However, on the basis
of our results, we cannot draw any firm conclusions with regard to the use of non-modal
phonation as a function of sex.
The factors affecting the likelihood of glottalization in our study, in particular lexical
frequency and the shadowing task, are compatible with usage-based or exemplar-based
models of phonology and speech perception (Johnson, 1997; Bybee, 2001; Pierrehumbert,
2001). In such models, episodic memory traces play an instrumental role in the formation of
phonological categories such as segments and syllables. These categories are not primitives,
but rather emergent properties that form on the basis of auditory input. An important
assumption of exemplar theory is that gradient phonetic detail is recorded in each exemplar of
a category, and is therefore an integral part of category representation. This is to be expected
given the role of episodic memory in the hypothesized mechanism of category development.
Assuming the validity of the exemplar approach, one may therefore suggest that the putative
extra-grammatical effects observed in the present study are indeed phonological. The question
that remains concerns the compatibility of exemplar models with the representations
upon which our research hypothesis is based.
One issue that exemplar theory has not, to our knowledge, explicitly addressed is the
representational nature of emergent phonological categories. In other words, it is not stated
what the exemplars are exemplars of. The Onset Prominence model seeks to fill this gap. The
OP environment is derived from a hierarchy of phonetic events that is an integral part of the
auditory input for every user of spoken language. In this input, the phonetic properties
associated with manner of articulation provide auditory structure, providing a locus for
gradient phonetic details associated with place and laryngeal features. We suggest that the OP
representational hierarchy is indeed compatible with exemplar-based models. The benefit of
these representations for exemplar theory lies in the fact that they provide specific predictions
about the relative perceptual role of different types of gradient phonetic detail. In the case of
place and laryngeal cues, phonetic detail defines boundaries between categories. By contrast,
the perception of manner appears to be privative in nature. For example, a stop may be
produced with incomplete closure (Crystal and House, 1988), but this does not necessarily
imply the presence of sufficient frication noise to induce the perception of a separate manner
category (fricative).
While questions about the phonological status of hiatus realization constitute one part
of our study, there is an additional important question that must be considered. In particular,
the glottalization of vowels has not been the focus of a large number of L2 studies,
presumably since its phonological status is not universally acknowledged. In a related study
of C#V sequences in the speech of Polish learners of English, Schwartz et al. (2013) looked at
the interaction between glottalization and final devoicing. Final devoicing, of course, is an
‘established’ feature of Polish-accented English and a major focus of pronunciation
instruction. Glottalization of hiatus, on the other hand, is not so intimately connected with
familiar segmental errors. Thus, the consequences of hiatus glottalization for the perception of
L2 English are not obvious. We seek to address this question by means of listening tests
described in the following section.
4 Experiment 2 - Accentedness and comprehensibility of glottalized and modal hiatus
The aim of this experiment was to assess the effect of vowel hiatus realization on the
perceived comprehensibility and accentedness of non-native speech (cf. Derwing & Munro,
1997; Munro, 2008). To explore the relationship between these variables, two separate online
listening tests were carried out. In the first, directed at L1 speakers of English, listeners
rated the foreign accentedness of glottalized and modal tokens. In the second, distributed to
Polish learners of English, participants rated the comprehensibility of glottalized and modal
tokens.
4.1 Participants and stimuli
Two groups of participants took part in the experiment:
1. 31 native speakers of Polish (for the comprehensibility test)
2. 42 native speakers of English (for the accentedness test)
Data on the participants’ knowledge of English, exposure to non-native accents of English as
well as other known languages were collected, but proved not to have any significant effects
in relation to the aims of the study and therefore will not be reported on.
The experimental stimuli were taken from recordings obtained in the production study.
Each stimulus consisted of an utterance of between three and seven words. The total number of
tokens used was 36. Out of these, 12 pairs (i.e. 24 tokens) contained non-native utterances
with vowel hiatuses. That is, each member of the pair was the same utterance, but one was
produced with a modal realization of the hiatus, while the other had a glottalized realization.
The tokens were checked by two professional pronunciation teachers for additional
segmental errors. Four glottalized tokens were judged to contain additional pronunciation
errors, while three tokens with modal realization were found to exhibit additional errors.
Appendix 3 provides a summary of the utterance pairs and errors identified. Aside from the
12 pairs, the stimuli contained six additional tokens of native English utterances taken from
target recordings used for the repetition task in the production study. Finally, six other non-native
utterances without vowel hiatuses were used to act as distractors.
4.2 Procedure
Two perception surveys were created for the purposes of the study. One survey was designed
to study how easily native speakers of Polish would comprehend the stimuli, while the other
examined the degree of accentedness perceived by native speakers of English. The surveys
were built using the Google Forms software and were additionally customized by hand to
include audio files and additional data validation. The surveys were made available online and
distributed using their URL addresses. Cooke et al. (2013) discuss the merits and drawbacks
associated with web-based speech perception studies. We feel that the merits outweigh the
drawbacks. One advantage is the highly voluntary nature of the participants. Another
advantage is the feasibility of recruiting a large number of L1 English listeners in a non-
English speaking country, particularly those who do not have everyday experience with
Polish-accented English. Nevertheless, we are in the process of attempting to replicate this
study in an on-site experiment.
Both surveys consisted of two parts. The first part comprised a short linguistic
background questionnaire, which was followed by the second part, which included the perception
tasks. Each perception task contained a recording and a question to answer, i.e. ‘Which of the
descriptions below best describes the speaker's pronunciation?’. The task was to answer the
question by choosing one of the five available options. In the comprehensibility survey, the
listeners had to rate how well they understood the utterances (5 – very easy to understand; 1 –
very difficult to understand). In the accentedness survey, the listeners rated the degree of non-
native accent in the utterances (5 – clearly native; 1 – clearly foreign). All descriptions and
instructions in both surveys were delivered in English.
While completing the surveys listeners could not compare two recordings to each
other. Only one question could be viewed at a time and the participants could not continue to
the next question without first answering the current one. It was also impossible to return to a
previously answered question. The participants could only rely on their own impressions
when making perception judgments. In addition, each survey attempt was unique in the sense
that the order of all perception tasks was randomized when the survey was accessed. This
helped limit the effect of question order on the responses.
As mentioned above, the experiment was conducted via the internet. Consequently, the
participants completed the tasks remotely on their own computers. The surveys were
distributed among the participants using the following URL addresses:
http://goo.gl/uWNDGZ (comprehensibility study)
http://goo.gl/Hsypq7 (accentedness study)
Each participant was instructed to complete the survey only once and to use headphones to
listen to the stimuli in order to ensure sound fidelity. All participants were unaware of the
purpose of the experiment.
When participants accessed the surveys they were first asked to provide information
about their linguistic background and then they moved on to the experiment proper that
consisted of 36 perception tasks. No indication of this division was given to the participants.
When completing the surveys, the participants were presented with only one question at a
time. In each question, the participants were instructed to do the following:
1. Click on a play button and listen to a short recording of an utterance. The recordings
could be replayed if needed.
2. Answer the question by choosing one of the five available options numbered 1–5. The
participants could not proceed without first answering the current question. No
feedback was presented for answering a question.
3. Click on ‘next’ to continue to the next question.
The participants continued until they answered all the questions. At this point, the ‘submit’
button became active and the listeners could click on it to send their responses. A
confirmation screen concluded the experiment.
4.3 Data Analysis
A total of 2628 responses were obtained in the experiment (comprehensibility survey: 36
tokens * 31 listeners = 1116; accentedness survey: 36 tokens * 42 listeners = 1512). The 438
responses to filler tokens without vowel hiatuses were excluded from the analysis. The
remaining 2190 responses (comprehensibility = 930; accentedness = 1260) were used to
calculate the mean values of comprehensibility and accentedness ratings for each participant.
The means were then analyzed by taking into account the following factors:
• Realization (whether the hiatuses in the utterance tokens were produced with a non-native
glottal or modal realization or were produced by a native speaker, i.e. had a
native modal realization)
• Other errors (whether or not the stimuli contained other errors that could influence the rating)
The mean comprehensibility and accentedness rating values for all stimuli types (glottal vs.
modal vs. native) were calculated and compared using a repeated measures ANOVA to
determine the effect of vowel hiatus realization (glottalized vs. modal vs. native) on
comprehensibility and perceived accentedness of speech. In addition, the same means and
statistical analyses were obtained for tokens with and without other pronunciation errors in
order to evaluate the effects of these errors on the listeners’ judgments.
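The response counts given in this section can be checked with a few lines of Python, and the same sketch shows one way per-participant means might be computed. The constant and function names are illustrative, the token and listener counts are those stated in the text, and the `(participant_id, realization, rating)` tuple layout is an assumption about how the survey responses could be stored.

```python
TOKENS_PER_SURVEY = 36   # 24 hiatus tokens + 6 native tokens + 6 fillers
FILLER_TOKENS = 6        # non-native distractors without vowel hiatus
LISTENERS = {"comprehensibility": 31, "accentedness": 42}

def response_counts(listeners, tokens=TOKENS_PER_SURVEY, fillers=FILLER_TOKENS):
    """Return total, excluded (filler), and analyzed response counts per survey."""
    out = {}
    for survey, n in listeners.items():
        total = tokens * n
        excluded = fillers * n
        out[survey] = {"total": total, "excluded": excluded,
                       "analyzed": total - excluded}
    return out

def participant_means(responses):
    """Mean rating per (participant, realization) from an iterable of
    (participant_id, realization, rating) tuples."""
    sums = {}
    for pid, realization, rating in responses:
        s = sums.setdefault((pid, realization), [0, 0])
        s[0] += rating
        s[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

counts = response_counts(LISTENERS)
print(counts)
print(sum(c["total"] for c in counts.values()))     # all responses
print(sum(c["excluded"] for c in counts.values()))  # filler responses
print(sum(c["analyzed"] for c in counts.values()))  # responses analyzed
```

The three sums reproduce the totals stated above: 2628 responses overall, 438 excluded filler responses, and 2190 responses analyzed.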
4.4 Results
The mean values of the perceived comprehensibility and accentedness ratings for all three
stimuli types are presented in Figure 10. The mean value of the comprehensibility rating was
almost exactly the same for glottalized tokens (M = 4.21, SD = .47) and modal tokens (M =
4.22, SD = .41). Slightly higher comprehensibility ratings were assigned to tokens produced
by native speakers of English (M = 4.51, SD = .48). As for the mean values of the
accentedness rating, the lowest score was found for glottalized tokens (M = 2.23, SD = .47), followed
by modal tokens (M = 2.96, SD = .52). Tokens produced by native speakers obtained the
highest scores for accentedness (M = 4.51, SD = .56).
Figure 10. Mean ratings of perceived Comprehensibility (left) and Accentedness (right) of
glottalized, modal and native tokens.
A repeated measures ANOVA was conducted with the mean Comprehensibility and
Accentedness as the dependent variables and the hiatus realization (glottalized vs. modal vs.
native) as the independent variable. The results of the analysis are discussed below.
For the perceived comprehensibility rating means (N = 31), Mauchly’s test revealed
that the assumption of sphericity had been violated, χ²(2) = 31.12, p < .001. Consequently,
degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε =
.60). The results demonstrate a significant effect of hiatus realization on perceived
comprehensibility of an utterance, F(1.20, 36.18) = 9.78, p = .002. A post-hoc Bonferroni test
indicated that the perceived Comprehensibility was significantly different between glottalized
and native tokens, p = .02, as well as between modal and native tokens, p = .001. No
statistically significant difference was observed between glottalized and modal tokens, p > .05. These results
suggest that Polish listeners found all non-native utterances equally comprehensible,
regardless of whether they were produced with glottalized or modal vowel
hiatuses. However, both types of non-native utterances were less comprehensible than
utterances produced by native speakers of English.
The analysis of the means of accentedness ratings (N = 42) also showed a violation of
sphericity in Mauchly’s test, χ²(2) = 31.36, p < .001. In this case, degrees of freedom were
also corrected using Greenhouse-Geisser estimates of sphericity (ε = .65). A significant effect
of hiatus realization on perceived accentedness of an utterance was found, F(1.30, 53.13) =
257.25, p < .001. Significant differences between all pairs of token types were found in post-
hoc Bonferroni tests, p < .001. The results indicate that when L1 English listeners were asked
to assess the degree of non-native accentedness of utterances, judgments were affected by the
way vowel hiatus was realized. Specifically, the results suggest that non-native utterances
with glottalized vowel hiatuses are perceived as the most foreign by native speakers of
English, followed by non-native utterances with modal realizations. Utterances produced by
native speakers were perceived as the least foreign.
Since some of the non-native tokens were judged to contain additional pronunciation
errors, we carried out an additional analysis to investigate whether the results reported above
were indeed due to vowel glottalization rather than other factors. Specifically, mean values
were obtained for non-native glottalized and modal tokens with and without other errors to
determine whether the same trends would be identified for utterances without errors and to
explore the effect of other errors on listeners’ judgments.
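The fractional degrees of freedom in the two repeated measures ANOVAs reported above follow from the Greenhouse-Geisser correction, which multiplies both the effect and the error degrees of freedom by ε. The sketch below is a minimal illustration, assuming three realization levels and the listener counts given in the text. The function name `gg_corrected_df` is illustrative, and because ε is reported to only two decimals the error term will differ slightly from the published values.

```python
def gg_corrected_df(epsilon, n_levels, n_subjects):
    """Greenhouse-Geisser corrected degrees of freedom for a one-way
    repeated measures ANOVA: both df terms are scaled by epsilon."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - 1)
    return epsilon * df_effect, epsilon * df_error

# Epsilon values and listener counts as reported in the text.
for label, eps, n in [("comprehensibility", 0.60, 31),
                      ("accentedness", 0.65, 42)]:
    df1, df2 = gg_corrected_df(eps, n_levels=3, n_subjects=n)
    print(f"{label}: F({df1:.2f}, {df2:.2f})")
```

With the rounded ε values this gives approximately (1.20, 36.00) and (1.30, 53.30), close to the reported F(1.20, 36.18) and F(1.30, 53.13).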
The mean values of the perceived comprehensibility and accentedness ratings for
glottalized and modal tokens with and without other errors are shown in Figure 11. As far as
the comprehensibility rating is concerned, in the case of tokens without other errors similar
means were observed for both glottalized tokens (M = 4.44, SD = .45) as well as modal tokens
(M = 4.30, SD = .43). The same can be said about tokens with other pronunciation errors i.e.
the differences between the means for glottalized (M = 3.62, SD = .68) and modal tokens (M =
3.81, SD = .65) were small. However, both were lower than the means of tokens without
errors. With regard to accentedness rating, glottalized tokens without other errors were given
much lower scores (M = 2.29, SD = .53) than modal tokens without pronunciation errors (M =
3.13, SD = .51). When both token types contained other pronunciation errors, the difference
was smaller but still noticeable (glottalized: M = 1.92, SD = .57; modal: M = 2.35, SD = .66).
Figure 11. Mean ratings of perceived comprehensibility (left) and accentedness (right) of
glottalized (dark grey) and modal (light grey) tokens.
For the statistical analysis, paired-samples t-tests were carried out on the
measures of comprehensibility and accentedness across a number of within-subject conditions.
The values for tokens without other pronunciation errors were compared in order to confirm
whether the same tendencies could be attested as for all analyzed tokens. The test indicated
that the difference between glottalized (M = 4.43, SD = .49), and modal tokens (M = 4.30, SD
= .43) was not significant for the measure of comprehensibility, t(30) = -2.03, p > .05.
However, in the case of accentedness rating, significant differences between glottalized (M =
2.29, SD = .53) and modal tokens (M = 3.13, SD = .50) were found, t(41) = 13.07, p < .001.
These results match the ones reported above, confirming that glottalized utterances were
perceived as more foreign accented by native speakers even when there were no other
pronunciation errors involved.
The same analyses were conducted for tokens with other pronunciation errors. This
time, significant differences were found between glottalized (M = 3.62, SD = .68) and modal
tokens (M = 3.81, SD = .65) for comprehensibility ratings, t(30) = 2.29, p = .029. These
results suggest that pronunciation errors were more prominent in glottalized tokens, leading to
difficulties in comprehensibility. That is, there may be a correlation between the production of
glottalized hiatus and the production of other errors that influence comprehensibility. In the
case of accentedness, significant differences were found between glottalized (M = 1.92, SD =
.57) and modal tokens (M =2.35, SD = .66), t(41) = 4.00, p < .001. This implies that vowel
glottalization continued to play a crucial part in increasing the foreign accentedness of utterances, even
in comparison to modal tokens containing other foreign accent features.
The effect of the glottalization of vowel hiatus on foreign accent perception is further
evidenced when the scores for tokens without other errors and for tokens with other
pronunciation errors are compared. A paired samples T-Test demonstrated a significant
difference in comprehensibility between glottalized tokens without other errors (M = 4.44, SD
= .45) and modal tokens with pronunciation errors (M = 3.81, SD = .65), t(30) = 8.08, p <
.001. Errors in modal tokens seemed to significantly impact the ease of understanding the
utterances and could be therefore interpreted as being more prominent. However, the same
test revealed that the difference between mean accentedness scores for glottalized tokens
without other errors (M = 2.29, SD = .53) and modal tokens with errors (M = 2.35, SD = .66)
was non-significant, t(41) = -.62, p > .05. These results indicate that while the errors in
modal tokens could have impacted comprehensibility, the glottalization of vowel hiatus, even
alone, was just as indicative of foreign accents as the pronunciation errors in modal tokens.
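The paired-samples comparisons reported in this section rest on per-listener differences between two conditions, with degrees of freedom equal to the number of listeners minus one (hence t(30) for 31 listeners and t(41) for 42). A minimal Python sketch of that computation follows. The function name `paired_t` and the ratings are hypothetical and are not the study's data; no p-value is computed.

```python
import math
import statistics

def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for two
    equal-length lists of per-listener mean ratings."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical per-listener mean accentedness ratings (not the study's data).
modal = [3.2, 2.9, 3.5, 3.0, 3.1]
glottalized = [2.4, 2.2, 2.6, 2.5, 2.0]
t, df = paired_t(modal, glottalized)
print(f"t({df}) = {t:.2f}")
```

The sign of t depends only on the order in which the two conditions are passed to the function.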
4.5 Discussion
On the whole, the results of the listening tests revealed greater consequences of glottal vs.
modal hiatus for accentedness ratings than for comprehensibility ratings. This is seen in
the significant difference in listener ratings between modal and glottal tokens for accentedness,
but not for comprehensibility. Indeed, the high scores for comprehensibility suggest a ceiling
effect by which all the tokens were relatively easy to comprehend. The relatively high
proficiency in English of the Polish speaking respondents probably contributed to this result.
Only three of the Polish participants rated their level of English as intermediate or below; the
remainder were advanced or proficient. In the future we are planning identification and
discrimination tasks to test intelligibility, which we feel will better assess the perceptual
consequences of vowel hiatus realization (see Munro, 2008).
Despite the apparent ceiling effect, there are a few interesting patterns to be observed
in the comprehensibility ratings. The first is that the native tokens were rated as the easiest to
understand. One might expect modally produced hiatus to have detrimental effects on the
comprehensibility of native speech for non-native listeners. English sandhi processes are
associated with a less salient word boundary. According to Cebrian (2000), L2 learners are
subject to a Word Integrity constraint, by which they are predicted to be relatively insensitive
to L2 sandhi processes. Assuming the validity of the Word Integrity hypothesis, the
comprehensibility ratings for the native tokens may come as a surprise. It is possible that the
high proficiency level of the respondents is an indication that they have overcome the Word
Integrity constraint, and the high ratings reflect the absence of other segmental errors.
Alternatively, the Word Integrity hypothesis needs to be reassessed (see Zsiga, 2011 for
evidence and arguments against Word Integrity).
Although the differences in comprehensibility ratings between modal and glottalized
tokens were not significant overall, in the case of tokens containing additional segmental
errors, those with glottalized hiatus were rated as more difficult to understand than those with
modal hiatus. While the presence of other errors makes it impossible to claim that
glottalization was the primary factor influencing the ratings, this result could suggest an
overall correlation between glottalization and the presence of other segmental errors in the
speech of L2 learners. Further work is underway to investigate this possibility.
With regard to accentedness ratings, our results point to a more direct effect of hiatus
glottalization. Overall, glottalized tokens were rated lower than modal tokens (2.23 vs 2.96).
These differences held in tokens both with and without other segmental errors. Interestingly,
the magnitude of this difference was greater in the case of the tokens without additional
errors. Such tokens may be claimed to have isolated the effects of the glottalized hiatus
realization. Finally, in the accentedness ratings, modal tokens with other errors were judged
the same or even slightly more native-like (2.35) than glottalized tokens without other errors
(2.29). This finding suggests that L1 English listeners are particularly sensitive to target-like
boundary realization in making accentedness judgments.
5 General Discussion
We turn now to the wider implications of our findings for the study of L2 phonology. In particular,
we consider issues of phonological representation, investigating the question of ‘what’ is
acquired in L2 speech acquisition. Our experiments have explored a hypothesis that cross-
language differences in the representation of initial vowels govern the placement of a prosodic
boundary in V#V sequences in Polish and English, and that these differences may surface in
the speech of Polish learners of English. In the Onset Prominence framework, Polish initial
vowels are posited as inherently more prominent prosodic entities than initial vowels in
English. As a consequence, they show a greater tendency to be realized with glottal marking
that preserves the salience of the second vowel in V#V sequences. By contrast, in traditional
vernacular dialects of English, prosodically weaker vowels may be joined with the preceding
constituent and are realized without any changes in phonation type.
At this point we consider further implications of the representational differences we
claim underlie the realization of vowel hiatus in the two languages. The key parameter of
submersion was introduced in Section 2 in the context of vowel hiatus. In 5.1, we shall see
that this parameter may extend further into cross-language phonological comparison, unifying
descriptions of vowel hiatus, vowel lengthening, and the behavior of consonants in VC and
VCV contexts. In applying the representational devices of the OP environment to the study of
L1-L2 phonological interaction, we also note the implications of a framework in which
segments and prosodic constituents are constructed from the same representational materials.
Many riddles of interlanguage phonology appear to reside in the gaps between segmental and
prosodic interference. In 5.2 we apply the OP perspective to one such problem: perceptual
epenthesis after coda consonants by Korean learners of English (de Jong & Park, 2012).
5.1 Implications of submersion.
The process of submersion was introduced in (5), which illustrated how adjacent vowel
sequences may merge to form long vowels within words, as well as modal hiatus at word
boundaries. Submersion is a process with far-reaching prosodic implications. It is a form of
phonological recursion of the type proposed by van der Hulst (2010). In his view, coda
consonants may be seen as ‘syllables inside syllables’. In the OP environment, submersion
unifies this view of codas with the representation of long vowels. Additionally, submersion
offers insight into the behavior of consonants in both VC and VCV contexts, with deeper
predictions for the form and behavior of larger prosodic constituents.
Let us consider first the behavior of coda consonants. In (6), we see a string of
segmental structures for the English word click. Due to its high position in the
representational hierarchy, the final /k/ may not undergo absorption into the preceding
constituent. To satisfy prosodic prohibitions on syllabic stops, the /k/ is submerged under the
preceding vowel. On the left we see the string of segmental structures, on the right we see the
syllabified form.
(6) Submersion of final /k/ in English click
Polish has borrowed the word click, reproducing it as klik, shown in (7).
(7) Polish klik (left) and English click (right)
On the right, the final stop in English click is submerged under the preceding constituent. On
the left the final /k/ in Polish klik is not submerged. In Polish, restrictions against syllabic
stops are satisfied by adjoining the coda to the preceding constituent at a higher level of
structure. From the representations in (7), it is predicted that coda stops in Polish and English
should behave differently. Indeed, in the two languages there is a systematic difference in the
behavior of the final /k/ with regard to the release of the stop. In Polish the release is
obligatory (Dukiewicz & Sawicka 1995), while in English it is frequently suppressed. The
English coda is in a lower prosodic position in the OP hierarchy. Associating lower
hierarchical levels with prosodic weakness, we should expect the English coda to be subject to
lenition processes such as the suppression of release. Submersion appears to be obligatory in
English, as evidenced by the restriction that stressed syllables contain either a long vowel or
a coda consonant.
Submersion in English is not limited to individual segments – entire syllables may also
be lowered under the preceding constituent. In (8) we offer a representation of the English
word pity. Submersion here results in a larger ‘foot’ structure in which the onset to the second syllable is
underneath the first syllable. Such a configuration encodes a generalization sometimes
referred to as ‘ambisyllabicity’ (Kahn 1976), by which the second syllable ‘captures’ the coda
of the first, which is in a prosodically weak position. As a result, the second consonant in a
trochaic CVCV foot is subject to lenition (see Jensen, 2000; Harris, 2004). The /t/ in this word
is realized as a tap or a glottal stop in various vernacular dialects of English. A segmentally
‘similar’ word in Polish, PIT-y ‘tax returns’, would not contain submerged structure. The two
constituents would be adjoined at a higher level of structure, and the /t/ is not subject to
lenition processes. The larger foot structure in (8) may also find parallels in the relative
prominence of stressed and unstressed vowels in the two languages. English, of course, is
characterized by significant vowel reduction, while in Polish vowel reduction is less dramatic.
(8) English pity
In sum, our discussion points to an opposition between English and Polish with regard to
relations between individual segmental structures and the primitive CV tree in (1). English
offers the possibility of submersion, associated with long vowels, weak coda consonants, and
weak consonants in VCV contexts. Submersion is absent in Polish. Codas and intervocalic
consonants are generally not subject to weakening, and vowel quantity distinctions are absent.
The submersion parameter also has implications for boundary formation. Submersion of
initial vowels in English has prosodic motivation originating in the lack of VO specification.
The less prominent status of English initial vowels leads to weakening processes by which
prosodic constituents do not always respect the boundaries imposed by individual lexical
items. By contrast, in Polish, VO specification and promotion serve to ensure that for the most
part, the terms ‘initial’ and ‘final’ as defined by lexical entries receive a more rigid
interpretation in relation to the phonological shape of words.
A crucial feature of the Onset Prominence model is that segmental representations and
prosodic constituents are constructed of the same structural materials, rather than being joined
by association lines. This perspective provides insight on the relationship between segments
and syllables, which has long been a source of debate in phonological theory (cf. cue vs.
prosodic licensing; e.g. Steriade, 1997), and persistent riddles in the study of L2 phonological
acquisition. In what follows, we apply the OP perspective to one such riddle: perceptual
epenthesis by Korean learners of English (de Jong & Park, 2012).
5.2 Reconciling models of L2 vowel epenthesis
The phenomenon of vowel epenthesis is extremely common among L2 learners, both in
production and perception. The apparent motivation for this process is to repair structures that
are phonotactically illicit in the L1. These may include clusters of consonants, as well as
consonants in coda position. Korean learners of English have received much attention in this
regard. Particularly puzzling is the fact that Korean speakers often epenthesize, both in
production and perception, after L2 coda stops, even though their L1 allows stops in coda
position. De Jong & Park (2012) studied perceptual epenthesis by Korean learners of English.
Their stated goal was to test the predictions of two explanations for epenthesis. According to
the first, which attributes epenthesis to ‘functional reparsing’, the process is motivated by the
need to license consonant manner and laryngeal contrasts in a new prosodic position.6 That is,
it serves to ease the task of identifying the consonant in question. In the other model, under
the heading of ‘perceptual misanalysis’, epenthesis is the reinterpretation of consonant release
as the ‘onset’ to an additional syllable, whose ‘nucleus’ is filled by an epenthetic vowel. De
Jong and Park note that authors who have argued for these models often claim that
segment-based and syllable-based explanations for epenthesis are incompatible.
To test the two models, De Jong & Park (2012) carried out a perceptual experiment in
which L1 Korean listeners engaged in two tasks with stimuli from American English: a
syllable counting task and a segment identification task. Their predictions with regard to the
aforementioned models were as follows. A negative relationship between segment
identification and syllable counting would provide support for the contrast-based model. That
is, more accurate segment identification comes at the cost of less accurate prosodic parsing,
since the additional syllable is claimed to aid in identification of the newly reparsed
consonant. No correlation between the two tasks would support the syllable-based model.
Finally, they posit that a positive relationship between the two tasks would support an
alternative model in which segmental identification and prosodic parsing are two aspects of
the same perceptual process. Their findings showed a positive correlation between syllable
counting and segmental features across individual subjects. Learners who were more accurate
in the syllable counting task (i.e. less likely to hear epenthetic vowels) tended to be more
accurate in identifying segmental features. Thus, it appears that neither of the two tested
models was supported. Rather, as they point out, ‘the present results suggest an overall model
. . . wherein listeners jointly interpret the details of segments and the syllabic position from an
integrated percept’ (De Jong & Park, 2012: 150).
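The logic of these three predictions can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the function names (pearson_r, supported_model), the cutoff of 0.3 for treating a correlation as non-negligible, and the per-listener accuracy scores are all hypothetical and are not taken from De Jong & Park (2012), whose conclusions rest on their own statistical analysis rather than on a fixed cutoff.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def supported_model(r, threshold=0.3):
    """Map the sign of a correlation onto the model it would favour.

    The threshold is an arbitrary illustrative cutoff, not a value from the study.
    """
    if r <= -threshold:
        return "contrast-based (functional reparsing)"
    if r >= threshold:
        return "integrated percept"
    return "syllable-based (perceptual misanalysis)"


# Hypothetical per-listener accuracy (proportion correct), invented for illustration.
syllable_counting = [0.55, 0.62, 0.70, 0.81, 0.90]
segment_identification = [0.50, 0.58, 0.66, 0.79, 0.88]

r = pearson_r(syllable_counting, segment_identification)
print(supported_model(r))
```

With scores that rise together across listeners, as in the invented lists here, the sketch selects the integrated-percept outcome, which is the pattern De Jong & Park report.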
The integrated percept proposed by De Jong & Park may be captured in the Onset
Prominence representational environment, in which segmental and syllabic structures are
constructed from the same hierarchy. Crucially, manner of articulation is a prosodic
specification rather than a segmental feature. Instead of a segmental specification attaching to
prosodic structure by means of association lines, manner is prosodic structure. Thus, any
percept of manner is inevitably integrated with prosody, as De Jong & Park propose. It is also
worth noting that manner accuracy showed the most robust correlation with syllable counting
accuracy in De Jong & Park’s experiment (2012: 145, Figure 4).
5.2.1 Coda adaptation and Korean phonotactic constraints
Epenthesis in the adaptation of English codas into Korean has been classified as a case of
‘unnecessary repair’ (Kang 2011), since L1 Korean does allow coda stops. However, Korean
codas are restricted in two important ways. First, the suppression of stop release is obligatory.
In addition, manner and laryngeal contrasts are neutralized.
To provide some perspective on Korean phonotactic constraints, OP representations
for the three Korean labial stops are proposed in (9). Crucially, the framework allows for the
possibility that different melodic specifications may be housed at different levels of the OP
hierarchy in accordance with their phonetic realization. Place is specified as a [labial]
annotation on the Closure node. This is to be expected since it is the location of the closure
that defines stop place of articulation. By contrast, laryngeal features, whose phonetic
realization may be impeded by stop closure, may be assigned at lower levels. This is shown in
(9). Aspiration, which is of course associated with aperiodic noise, is shown as a [spread
glottis] specification on the Noise node. Tenseness is represented as a [constricted glottis]
([cg]) annotation on the VO node. This feature is associated with a stiffer voice quality on the
onset of the following vowel (e.g. Ladefoged & Maddieson 1996), so the VO node is a natural
structural position for [cg] specifications. With these representations in mind, we may turn to
the representation of codas in Korean, which are characterized by the presence of place
contrasts, but neutralized manner and laryngeal contrasts.
6 Since Korean features unreleased stops in which place is encoded in coda positions, it might be expected that
Korean listeners should be quite accurate at identifying the place of articulation of coda consonants.
(9) Proposed representations of Korean /p/, /pʰ/, and /p*/
In the OP environment, Korean codas may be attributed to submersion, producing the
structure in (10).
(10) Korean CVC sequence
To explain the fact that Korean maintains place contrasts in coda stops, but neutralizes
laryngeal contrasts, we need only propose a constraint against multiple layers of submerged
structure. That is, Korean, like many other languages, appears to restrict the size of syllable
rimes. The claim would be that Korean only allows a single node under the VT level. The
laryngeal specifications, which as we saw in (9) are housed on lower-level nodes, lose their
structural housing and are not realized in this position. This is shown in (10) as the crossed-out
labels of the lower Noise and VO nodes. Thus, the apparent mismatch in Korean between a
licensed place contrast and neutralized laryngeal contrast may be explained as a single
constraint on the size of prosodic constituents.7 With the submerged Noise and VO nodes
eliminated in (10), it also falls out naturally that coda stops in Korean are always produced
without an audible release. The only remaining part of the stop is Closure; the lower nodes
associated with stop release have been eliminated.
Kang (2003) provides evidence that the presence of epenthesis in English loanwords in
Korean is closely related to the probability that the English coda stops are produced with an
audible release. Epenthesis is expected when the target language coda is more likely to
contain a release burst, which in Korean only occurs in syllable onsets. Thus, released stops
are adapted with an additional prosodic constituent whose prominence is enhanced by the
epenthetic vowel. Polish requires stop release in codas.8 In these cases we would claim that
these stops are not submerged under the preceding rhyme. Rather, they remain at their
underlying level of the OP hierarchy. Moreover, epenthesis after codas is standard in the
speech of Korean learners of Polish (Dziubalska-Kołaczyk, p.c.). It is not subject to the same
variability found in Korean English.
7 This restriction may capture the fact that coda fricatives are also realized as unreleased stops. Space restrictions
prevent us from explaining this in detail.
Kang attributes the link between epenthesis in Korean English and target language
stop release to ‘perceptual similarity’, arguing for the importance of a non-contrastive
phonetic detail, stop release, in the mechanism of loanword adaptation. From the OP
perspective, stop release, though non-contrastive, is not merely a phonetic detail. It is
predictable on the basis of phonological parameters and constraints. The lack of release is
surely a systemic element of Korean phonology; it should be representable in phonological
terms. The OP environment offers phonological tools for the representation of such non-
contrastive phenomena. At the same time, OP representations allow for a unified description
of a range of diverse phonological patterns. For example, the claim that submersion is absent
from Polish accounts for the following aspects of Polish phonology: the obligatory release of
coda stops, the lack of vowel length distinctions, the lack of lenition in VCV contexts, and the
tendency for glottalized realization of initial vowels.
6 Final remarks
This paper has presented acoustic and perceptual data on the realization of word-boundary
vowel hiatus in the speech of Polish learners of English. Since the theoretical perspective
from which our research hypotheses were formed lies outside the mainstream viewpoint of
most researchers in generative and applied linguistics, we shall conclude by offering some
clarification of the assumptions underlying our study.
The integration of segmental and prosodic structures in the OP environment is inspired
by the view of phonetic implementation espoused by Natural Phonology (e.g. Donegan and
Stampe 1979). In NP, phonological representations interact directly with speech, without an
intervening level of categorical phonetics. That is, ‘the feature and prosodic specifications of
the phonological representation are interpreted through the action of speaking, in an automatic
but highly contextually-determined way’ (Donegan, 2002: 58). This interpretation is
envisioned in terms of the application and suppression of phonetically-motivated
phonological processes. OP representations facilitate the explicit implementation of this idea,
allowing for more nuanced descriptions of phonetically-motivated phonological behavior than
those afforded by the standard principles of articulatory ease and perceptual clarity. Crucially,
instead of being thought of in linear segmental terms (Ø → ʔ, Ø → j, ʔ → Ø, etc.), natural
processes are envisioned as adjustments to a hierarchical representation.
This is evident in the present study of vowel hiatus and hiatus resolution, which
suggests that learner success may be facilitated not only through the acquisition of L2
processes, but also through the suppression of L1 processes. Process application and
suppression are closely linked in the NP approach. At the same time, OP structures offer a
new and insightful view on the representational motivation and consequences of NP
processes. For example, vowel glottalization has clear motivation as a listener-friendly
fortition, yet beyond this characterization it has two different interpretations. In languages like
Polish, it preserves or strengthens a segmental structure that defines a prosodic boundary. By
contrast, in languages like English glottalization, when it occurs, adds to the structure of an
initial vowel to introduce a boundary. What is referred to as ‘glide insertion’ is subject to a
different interpretation. It appears to be a perceptual process that is not reflected consistently
in production. In this sense it is similar to other perceptual processes in NP (Donegan and
Stampe, 1979), such as the perceptual denasalization process that occurs when English-speaking
listeners hear the nasalized vowel in went [wɛ̃nt] as an oral vowel. In production,
the modal hiatus that induces the glide percept is a speaker-friendly process insofar as it does
not require additional glottal constriction on the V#V sequence. At the same time, however,
the modal realization reflects submersion, a prosodically motivated process with deeper
implications for English phonology.
8 In current work, we are examining the acquisition of English unreleased stops by Polish learners of English.
Our study applies a new perspective on NP process application to the field of L2 speech
acquisition. We are not aware of other SLA research linking the glottalization of vowels or its
suppression to other aspects of segmental phonology. The main innovation in our approach
lies in abandoning the assumption that glottalization and modal hiatus/glide insertion are
separate processes. Rather we suggest that the various realizations of vowel hiatus reflect
different aspects of process application within a single representational hierarchy.
References
Babel, M. (2012). Evidence for phonetic and social selectivity in spontaneous phonetic
imitation. Journal of Phonetics 40: 177-189.
Best, C. T. (1995). A direct realist perspective on cross-language speech perception. In W.
Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 171-204). Timonium, MD: York Press.
Best, C. T., & Tyler, M. D. (2007). Nonnative and second-language speech perception:
Commonalities and complementarities. In M. J. Munro & O.-S. Bohn (Eds), Second
language speech learning – the role of language experience in speech perception and
production (pp. 13-34). Amsterdam: John Benjamins.
Boersma, P. & Weenink, D. (2011). Praat: doing phonetics by computer. [Computer
program]. Version 5.2.18.
Britain, D. & S. Fox. (2008). Vernacular universals and the regularisation of hiatus resolution.
Essex Research Reports in Linguistics, 57, 1-42.
Bybee, J. (2001). Phonology and Language Use. Cambridge: Cambridge University Press.
Cebrian, J. (2000). Transferability and productivity of L1 rules in Catalan-English
interlanguage. Studies in Second Language Acquisition, 22, 1-26.
Chang, C. 2012. Rapid and multifaceted effects of second language learning on first language
speech production. Journal of Phonetics 40, 249-268.
Cook, A. 1991. American Accent Training. Hauppauge, NY: Barron’s Educational Series, Inc.
Cooke, M., J. Barker, M. Lecumberri (2013). Crowdsourcing in Speech Perception. In M.
Eskenazi, G. Levow, H. Meng, G. Parent and D. Sundermann (Eds.), Crowdsourcing
for Speech Processing. 137-172. Hoboken, NJ: John Wiley and Sons.
Crosswhite, K. (2003). Spectral tilt as a cue to word stress in Macedonian, Polish, and
Bulgarian. In Solé, M.J., D. Recasens & J. Romero (eds.). Proceedings of XVth
International Congress of Phonetic Sciences, Barcelona, 767-770.
Cruttenden, A. 2001. Gimson’s Pronunciation of English (6th ed.). London: Arnold.
Crystal, T. & A.S. House. (1988). The duration of American English stop consonants: an
overview. Journal of Phonetics, 16, 285-294.
Davidson, L. and D. Erker. (To appear). Hiatus resolution in American English: the case
against glide insertion. Language.
Derwing, T. M., & Munro, M. J. (1997). Accent, intelligibility, and comprehensibility:
Evidence from four L1s. Studies in Second Language Acquisition, 19, 1-16.
Dogil, G. (1999). The phonetic manifestation of word stress in Polish, Lithuanian, Spanish,
and German. In H. van der Hulst (ed.), Word Prosodic Systems in the Languages of
Europe (pp. 273-311). New York: Mouton de Gruyter.
Donegan, P. (2002). Phonological processes and phonetic rules. In: Future Challenges for
Natural Linguistics, ed. Katarzyna Dziubalska-Kołaczyk and Jarosław Weckwerth, pp.
57-81. (LINCOM Studies in Theoretical Linguistics 30.) Muenchen: LINCOM
EUROPA.
Donegan, P. and D. Stampe. (1979). The study of natural phonology. In: Current Approaches
to Phonological Theory (Conference on the Differentiation of Current Phonological
Theories, Bloomington, Sept. 30-Oct. 1, 1977), ed. Daniel A. Dinnsen, pp. 126-173.
Bloomington: Indiana University Press.
Downing, Laura. 1998. On the prosodic misalignment of onsetless syllables. Natural
Language & Linguistic Theory, 16, 1-52.
De Jong, K. & H. Park. (2012). Vowel epenthesis and segmental identity in Korean learners
of English. Studies in Second Language Acquisition, 34, 127-155.
Dilley, L., S. Shattuck-Hufnagel & M. Ostendorf. (1996). Glottalization of word-initial
vowels as a function of prosodic structure. Journal of Phonetics, 24, 423-444.
Dukiewicz, L. & I. Sawicka. (1995). Gramatyka współczesnego języka polskiego – fonetyka i
fonologia [Grammar of modern Polish – phonetics and phonology]. Krakow:
Wydawnictwo Instytutu Języka Polskiego PAN.
Escudero, P. & Boersma, P. (2004). Bridging the gap between L2 speech perception research
and phonological theory. Studies in Second Language Acquisition, 26, 551-585.
Flege, J. and W. Eefting (1987). Cross language switching in stop consonant perception and
production by Dutch learners of English. Speech Communication 6, 185-202.
Goldinger, S. (1997). Perception and production in an episodic lexicon. In Johnson, K. &
Mullennix, J. W. (eds.). Talker Variability in Speech Processing (pp. 33-66). San
Diego: Academic Press.
Goldinger, S. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological
Review, 105 (2), 251-279.
Gussmann, E. (2007). The phonology of Polish. Oxford: Oxford University Press.
Hansen Edwards, J. & M. Zampini. Phonology and Second Language Acquisition.
Amsterdam: John Benjamins.
Harris, J. (2004). Release the captive coda: the foot as a domain of phonetic interpretation. In
J. Local, R. Ogden & R. Temple (eds.), Phonetic interpretation: Papers in Laboratory
Phonology 6, 103-129. Cambridge: Cambridge University Press.
Hewings, M. and S. Goldstein. 1999. Pronunciation plus: Practice through interaction.
Cambridge: Cambridge University Press.
Hillenbrand, J. and R. Houde. 1996. The role of F0 and amplitude in the perception of
intervocalic glottal stops. Journal of Speech and Hearing Research 39. 1182–1190.
Honorof, D. N., Weihing, J., Fowler, C. A. (2011). Articulatory events are imitated under
rapid shadowing. Journal of Phonetics 39: 18-38.
Howard, M. (2008). 'On the role of naturalistic and classroom exposure in the acquisition of
socio-phonological variation: A longitudinal study of French liaison'. Journal of
Applied Linguistics, 5 (2), 159-179.
van der Hulst, H. 2010. A note on recursion in phonology. In H. van der Hulst, (ed.).
Recursion and Human Language, 301-342. Berlin: Mouton de Gruyter.
Jensen, J. (2000). Against ambisyllabicity. Phonology 17: 187-235.
Johnson, K. (1997). Speech perception without speaker normalization: an exemplar model. In
K. Johnson & J. Mullennix, (Eds.). Talker variability in speech processing (pp. 145-166).
San Diego: Academic Press.
Kahn, D. (1976). Syllable-based generalizations in English phonology. PhD dissertation,
Massachusetts Institute of Technology.
Kang, Y. (2003) Perceptual similarity in loanword adaptation: English post-vocalic word-final
stops to Korean. Phonology 20 (2): 219-273.
Kang, Y. (2011) Loanword phonology. In van Oostendorp, Marc, Colin Ewen, Elizabeth
Hume, and Keren Rice, eds., Companion to Phonology. Wiley-Blackwell.
Ladefoged, P. and I. Maddieson. (1996). The Sounds of the World’s Languages. Oxford:
Blackwell.
Lecumberri, M. and J. Maidment (2000). English transcription course. Oxford: Routledge.
Levi, S. (2008). Phonemic vs. derived glides. Lingua, 118, 1956-1978.
Lindblom, B. (1990). Explaining phonetic variation: a sketch of the H&H theory. In Speech
Production and Speech Modelling, edited by W. J. Hardcastle and A. Marchal (The
Netherlands: Kluwer Academic), pp. 403-439.
Lleó, C. & I. Vogel. (2004). Learning new segments and reducing domains in German L2
phonology: The role of the Prosodic Hierarchy. International Journal of
Bilingualism, 8, 79-104.
Lujan, A.B. 2006. The American Accent Guide (2nd ed.). Salt Lake City: Lingual Arts.
McCarthy, J. 1993. A case of surface constraint violation. Canadian Journal of Linguistics,
38.169-95.
Malisz, Z, M. Żygis & B. Pompino-Marschall. (2013). Rhythmic structure effects on
glottalisation: a study of different speech styles in Polish and German. Laboratory
Phonology, 4, 119-158.
Marlett, S. & J. Stemberger. (1983). Empty consonants in Seri. Linguistic Inquiry, 14, 617-639.
Munro, M. J. (2008). Foreign accent and speech intelligibility. In Hansen Edwards, J. G. &
Zampini, M. L. (Eds.). Phonology and Second Language Acquisition (pp. 193-218).
Amsterdam: John Benjamins.
Newlin-Łukowicz, L. (2012). Polish Stress: looking for phonetic evidence of a bidirectional
system. Phonology, 29(2), 271-329.
Pardo, J. S., Gibbons, R., Suppes, A. & Krauss, R. M. (2012). Phonetic convergence in
college roommates. Journal of Phonetics, 40, 190-197.
Pierrehumbert, J. (2001) Exemplar dynamics: Word frequency, lenition, and contrast. In J.
Bybee and P. Hopper (eds.) Frequency effects and the emergence of lexical structure.
John Benjamins, Amsterdam. 137-157.
Plag, I., G. Kunter, & M. Schramm. (2011). Acoustic correlates of primary and secondary
stress in North American English. Journal of Phonetics, 39, 362-374.
Prince, A. & P. Smolensky. 1993. Optimality Theory: Constraint interaction in a generative grammar. Rutgers University Optimality Archive.
Redi, L. & S. Shattuck-Hufnagel. (2001). Variation in the realization of glottalization in
normal speakers. Journal of Phonetics, 29, 407-429.
Roach, P. 2009. English Phonetics and Phonology. Cambridge: Cambridge University Press.
Rojczyk, A. (2012). Phonetic and phonological mode in second-language speech. VOT
imitation. Paper presented at EUROSLA 2012, Poznań, Poland, September 5-8 2012.
Rojczyk, A. (2013). Phonetic imitation of L2 vowels in a rapid shadowing task. In Levis, J. &
LeVelle, K. (eds). Proceedings of the 4th Pronunciation in Second Language Learning
and Teaching Conference (pp. 66-76). Ames, IA: Iowa State University.
Rojczyk, A., Porzuczek, A. & Bergier, M. (2013). Immediate and distracted imitation in
second-language speech: Unreleased plosives in English. Research in Language, 11, 3-18.
Rubach, J. (2000). Glides and glottal stop insertion in Slavic languages – a DOT account.
Linguistic Inquiry, 31 (2), 271-317.
Schwartz, G. (2013a). Vowel hiatus at Polish word boundaries – phonetic realization and
phonological implications. Poznań Studies in Contemporary Linguistics, 49 (4), 557-585.
Schwartz, G. (2013b). A representational parameter for onsetless syllables. Journal of
Linguistics, 49 (3), 613-646. DOI: http://dx.doi.org/10.1017/S0022226712000436.
Schwartz, G., A. Balas & A. Rojczyk. (to appear). External sandhi in L2 segmental phonetics
– final (de)voicing in Polish English. In Proceedings of New Sounds 2013.
Shockley, K., Sabadini, L. & Fowler, C. A. (2004). Imitation in shadowing words. Perception
and Psychophysics, 66, 422-429.
Shoemaker, E. (2010). Nativelike Attainment in L2 Listening: The segmentation of spoken
French. In Dziubalska-Kołaczyk, K., M. Wrembel & M. Kul (eds.) Proceedings of the
6th international symposium on second language speech, New Sounds 2010, Poznań,
Sluijter, Agaath M.C. & Vincent J. van Heuven. 1996. Spectral balance as an acoustic
correlate of linguistic stress. Journal of the Acoustical Society of America, 100, 2471-
2485.
Steriade, D. 1997. Phonetics in phonology – the case of laryngeal neutralization. Ms. UCLA.
Sturm, J. (2013). Liaison in L2 French: The effects of instruction. In J. Levis & K. LeVelle
(Eds.). Proceedings of the 4th Pronunciation in Second Language Learning and
Teaching Conference, Aug. 2013 (pp. 157-166). Ames, IA: Iowa State University.
Uffmann, C. 2007. Intrusive [r] and optimal epenthetic consonants. Language Sciences,
29.451-76.
Wells, J.C. Longman Pronunciation Dictionary (3rd ed.). Harlow: Pearson Education Limited.
Wiese, R. (2000). The phonology of German. Oxford: Clarendon Press.
Wright, R. (2004). Perceptual cue robustness and phonotactic constraints. In Hayes, B., R.
Kirchner and D. Steriade (eds). Phonetically Based Phonology. Cambridge: Cambridge University Press, 34-57.
Yuasa, I. (2010). Creaky voice: a new feminine voice quality for young urban-oriented
upwardly mobile American women? American Speech, 85, 315-37.
Zsiga, E. (2003). Relearning consonant timing. Studies in Second Language Acquisition, 25,
399-432.
Zsiga, E. (2011). External sandhi in a second language: The phonetics and phonology of
obstruent nasalization in Korean and Korean-accented English. Language 87. 289-345.
Appendix 1 – stimuli for production experiment
1-stay out
2-try out
3-see all
4-way out
5-go out
6-know everything
7-saw everything
8-play it
9-know Alex
10-how interesting
11-know after
12-grow excellent
13-try each
14-they always
15-go every day
16-they actually
17-car always
18-your idea
19-the other
20-the end
21-they asked
22-be afraid
23-tell me all
24-I often
25-sure I
26-holiday activities
Appendix 2 – Tables for binary regression analyses in production experiment
Reading task - %Modal
                B      SE      Wald   df    Sig.   Exp(B)
sex        -0.305   0.211     2.102    1   0.147    0.737
advanced    0.440   0.183     5.791    1   0.016    1.553
TFreq       0.441   0.106    17.375    1   0.000    1.554
Constant   -0.750   0.233    10.367    1   0.001    0.472
Reading task - %Glottal Stop
                B      SE      Wald   df    Sig.   Exp(B)
sex         1.059   0.489     4.691    1   0.030    2.884
advanced   -1.128   0.355    10.099    1   0.001    0.324
TFreq      -0.653   0.194    11.388    1   0.001    0.520
Constant   -2.238   0.489    20.976    1   0.000    0.107
Imitation task - %Modal
                B      SE      Wald   df    Sig.   Exp(B)
sex        -0.128   0.245     0.273    1   0.601    0.880
advanced    0.489   0.209     5.445    1   0.020    1.630
TFreq       0.675   0.126    28.530    1   0.000    1.964
Constant    0.369   0.256     2.082    1   0.149    1.447
Imitation task - %Glottal Stop
                B      SE      Wald   df    Sig.   Exp(B)
TFreq      -0.653   0.194    11.388    1   0.001    0.520
sex         1.059   0.489     4.691    1   0.030    2.884
advanced   -1.128   0.355    10.099    1   0.001    0.324
Constant   -2.238   0.489    20.976    1   0.000    0.107
Across tasks - %Modal
                B      SE      Wald   df    Sig.   Exp(B)
sex        -0.231   0.161     2.059    1   0.151    0.794
advanced    0.462   0.138    11.219    1   0.001    1.587
Task        1.462   0.135   116.482    1   0.000    4.316
TFreq       0.539   0.081    44.873    1   0.000    1.715
Constant   -0.909   0.188    23.452    1   0.000    0.403
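As a reading aid for these tables, the Exp(B) column is the odds ratio obtained by exponentiating the coefficient B. The short Python check below recomputes Exp(B) for the across-tasks %Modal model from the B values reported in the table; the variable names are hypothetical, and discrepancies in the third decimal place are to be expected because the reported B values are themselves rounded.

```python
from math import exp

# B coefficients copied from the "Across tasks - %Modal" table.
coefficients = {
    "sex": -0.231,
    "advanced": 0.462,
    "Task": 1.462,
    "TFreq": 0.539,
    "Constant": -0.909,
}

# Exp(B) is the odds ratio associated with a one-unit change in the predictor.
odds_ratios = {name: exp(b) for name, b in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:<10}{ratio:.3f}")
```

Values above 1 indicate increased odds of the outcome for a one-unit increase in the predictor, and values below 1 indicate decreased odds.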
Appendix 3 – Stimuli for listening tests, with additional segmental errors
Utterance                               Errors in glottal token       Errors in modal token
and they asked me                       –                             –
I go every day                          –                             –
his car always breaks down              –                             short VOT in /k/, devoiced /z/ in his and always
I’ll know after the exam                vowel quality in know         –
I’m sure I heard a telephone ringing    –                             ng in ringing
I often imagine things                  vowel quality, ng             –
I saw everything                        –                             –
see all the pictures                    unaspirated /p/               vowel quality in all
why did you stay out all night          –                             –
they actually like                      quality of /æ/ in actually    –
they always have something prepared     –                             –
we’ll be unable to send                 –                             –