Predicting phonotactic difficulty in second language acquisition

Katarzyna Dziubalska-Kołaczyk, Grzegorz Krynicki

Adam Mickiewicz University, Poznań

dkasia@ifa.amu.edu.pl, krynicki@ifa.amu.edu.pl

Aim of the paper

to demonstrate that universal phonotactic preferences guide the acquisition of consonant clusters in a second language

Empirical evidence

young learners of English (L2 English) with the following L1’s:
– independent: Japanese, Korean, Vietnamese
– Sino-Tibetan: Chinese
– Austronesian: Kosraean, Marshallese, Palauan, Ponapean, Samoan, Tagalog, Trukese, Visayan
– Dravidian: Tamil
– Polish

Outline of the talk

1. Hypothesis
2. Description of the experiment
3. Introduction to B&B phonotactics
4. Phonotactic calculator
5. Analysis of the selected data
6. Preliminary conclusions

1. Hypothesis

1. a degree of difficulty in pronouncing L2 clusters would correlate with the universal characteristics of a given consonantal cluster
2. the more preferred a cluster, the easier and less susceptible to modifications it is expected to be
3. NAD is expected to be a universal criterion, underlying the performance of all subjects, and surpassing other relevant factors, such as the structure of the subjects’ mother tongue, their experience with English or their other capacities and motivations

degree of preference is measured by the NAD Principle

2. Description of the experiment

– 53 subjects; 15 subjects analysed here
– aged 11-13
– native speakers of 15 various languages; 10 here
– recorded reading 83 times an English carrier sentence, I haven’t seen a xxx before!, each time containing a different bi-syllabic nonce word
– each word contained just one double or triple consonant cluster
– all positions (initial, medial and final) and representative combinations were covered

Text for subjects

Read the following sentences aloud:

– I haven’t seen a kyati before!
– I haven’t seen a shwepy before!
– I haven’t seen a chluppy before!
– I haven’t seen a katewt before!
– I haven’t seen a petewm before!
– …

a sound file demo

a Ponapean speaker (Micronesia)

3. B&B phonotactics

– a universal model of phonotactics within Beats & Binding Phonology (Dziubalska-Kołaczyk 2002), a syllable-less theory of phonology embedded in Natural Phonology
– intersegmental cohesion determines syllable structure, rather than being determined by it (if one insists on the notion of the "syllable")

B&B phonotactics

– the phonotactic preferences specify the universally required distances between segments within clusters which guarantee, if respected, preservation of clusters (cf. intersegmental cohesion)
– clusters, in order to survive, must be sustained by some force counteracting the overwhelming tendency to reduce towards CV's (CV preference)
– this force is a perceptual contrast defined by the NAD Principle (cf. Dziubalska-Kołaczyk 2002, 2003, Dressler & Dziubalska-Kołaczyk 2007, in press, Dziubalska-Kołaczyk & Krynicki 2007, Bertinetto et al. 2007)

B&B phonotactics

– the universal preferences specify the optimal shape of a particular cluster in a given position by referring to the Net Auditory Distance Principle (NAD Principle)

NAD = |MOA| + |POA| + |Lx|

where |MOA|, |POA| and |Lx| are the absolute values of the differences in the Manner of Articulation, Place of Articulation and Voicing of the neighbouring sounds respectively
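As a minimal sketch, the NAD formula above can be written as a sum of absolute feature differences. The function name `nad` and the tuple encoding (MOA, POA, Lx) are assumptions made for illustration, not part of the phonotactic calculator; the sample values for t and r are the ones used in the worked example of this talk.

```python
def nad(a, b):
    """Net Auditory Distance between two neighbouring segments.

    Each segment is a tuple (MOA, POA, Lx); this encoding is an
    assumption of the sketch.
    """
    return sum(abs(x - y) for x, y in zip(a, b))


t = (4, 2, 0)  # stop, coronal, voiceless
r = (1, 2, 1)  # approximant, coronal, voiced
print(nad(t, r))  # |4-1| + |2-2| + |0-1| = 4
```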

B&B phonotactics

Example: NAD (C1,C2) ≥ NAD (C2,V)

In word-initial double clusters, the net auditory distance (NAD) between the two consonants should be greater than or equal to the net auditory distance between the second consonant and the vowel that follows it.

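Table of consonants

Manner of articulation (MOA): V 0, approximant 1, sonorant stop 2, fricative 3, stop 4 (0-2 sonorant, 3-4 obstruent); semiV and affricate are listed as separate manner categories

Place of articulation (POA): labial 1, coronal 2, dorsal 3, radical 4, laryngeal (glottal) 5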

B&B phonotactics

– consider the preference for initial double clusters: NAD (C1,C2) ≥ NAD (C2,V)
– let us now define two Net Auditory Distances between the sounds (C1, C2) and (C2, V), where C1 = (MOA1, POA1, Lx1), C2 = (MOA2, POA2, Lx2), V = (MOA3, Lx3):

|MOA1 – MOA2| + |POA1 – POA2| + |Lx1 – Lx2| for (C1, C2) cluster

& |MOA2 – MOA3| + |Lx2 – Lx3| for (C2, V) cluster

B&B phonotactics

Example: in CCV in E. try

t = (4, 2, 0), r = (1, 2, 1), V = (0, 0, 1)

NAD (C1, C2) = |4-1| + |2-2| + |0-1| = 3 + 0 + 1 = 4
NAD (C2, V) = |1-0| + |1-1| = 1 + 0 = 1

thus, the preference NAD (C1,C2) ≥ NAD (C2,V) is observed because 4 > 1

– the NAD Principle makes finer predictions than the ones based exclusively on sonority: prV > trV, krV > trV, trV > drV, etc.
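The worked example and the finer predictions can be checked with a short sketch. The helper names are hypothetical; the values for t, r and V come from the example, while those for p, k and d are assumptions read off the table of consonants (stop = 4; labial = 1, coronal = 2, dorsal = 3; voiceless = 0, voiced = 1).

```python
def nad_cc(c1, c2):
    """NAD between two consonants: MOA, POA and Lx differences."""
    return sum(abs(x - y) for x, y in zip(c1, c2))


def nad_cv(c, v):
    """NAD between a consonant and a vowel: MOA and Lx differences only."""
    return abs(c[0] - v[0]) + abs(c[2] - v[2])


def initial_margin(c1, c2, v):
    """NAD(C1,C2) - NAD(C2,V); the preference holds when this is >= 0."""
    return nad_cc(c1, c2) - nad_cv(c2, v)


# (MOA, POA, Lx); p, k and d are assumed values, see the lead-in.
SEG = {
    "p": (4, 1, 0), "t": (4, 2, 0), "k": (4, 3, 0),
    "d": (4, 2, 1), "r": (1, 2, 1),
}
V = (0, 0, 1)

for c1 in "pktd":
    print(c1 + "rV", initial_margin(SEG[c1], SEG["r"], V))
```

Under these assumed values the margins are 4 for prV and krV, 3 for trV and 2 for drV, which reproduces the ordering prV > trV, krV > trV, trV > drV.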

B&B phonotactics

– the universal NAD Principle leads to predictions about language-specific phonotactics, its acquisition and change
– specifically, it also allows one to predict and explain the order of difficulty in the acquisition of second language phonotactics, which appears to be universally valid and as such calls for similar remedies across languages

[Figure: English frequent initial doubles according to NAD Principle]

[Figure: Selected Polish clusters according to NAD Principle]

Cluster types in Polish acc. to NAD

cluster      pr  fr  lv  mʂ  rd  fk  mb  ʂk  sk
MOA+POA+Lx    5   4   3   3   3   3   2   2   2
C2V           1   1   3   3   4   4   …   …   …
NAD           4   3   0   0  -1  -1  -2  -2  -2

4. Phonotactic calculator

– for the purposes of B&B phonotactics, Krynicki developed the phonotactic calculator
– its purpose is to enable fine-tuning and developing the theory by statistical analysis of phonetic dictionaries and phonetically annotated corpora from various languages

Phonotactic Calculator - requirements

– various cluster lengths at all word positions
– formulating phonotactic hypotheses
– feedback on predictability of a phonotactic hypothesis
– choice or customization of
  – available phone sets, features of each phone and scores for each feature
  – available phonetic dictionaries and languages (PolSynt, Festvox, Festival)
  – metrics used for calculating distances between phones (taxicab, euclidean)
  – accepted phonetic alphabets (IPA, SAMPA)
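To illustrate the difference between the two metrics named in the requirements, here is a hedged sketch. It is not the calculator's code; the function names and the `METRICS` lookup are assumptions. The NAD formula as defined in this talk, a sum of absolute differences, corresponds to the taxicab metric.

```python
import math


def taxicab(a, b):
    """Sum of absolute feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the summed squared feature differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical lookup so that a metric can be chosen by name.
METRICS = {"taxicab": taxicab, "euclidean": euclidean}

t = (4, 2, 0)  # (MOA, POA, Lx) for t, from the worked example
r = (1, 2, 1)  # (MOA, POA, Lx) for r, from the worked example
for name, metric in METRICS.items():
    print(name, round(metric(t, r), 3))
```

For t and r the taxicab distance is 4, the value used in the worked example, while the euclidean distance is the square root of 10.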

5. Analysis of the selected data

– a total of 1245 utterances produced by 15 children, each reading 83 sentences containing a nonsense word with a 2- or 3-consonant cluster
– in 767 of these utterances (61.6%) the speakers modified or avoided the cluster that was assumed to be the correct pronunciation of the nonsense word
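The figures on this slide can be re-derived with simple arithmetic; the variable names below are illustrative only.

```python
speakers = 15
sentences_per_speaker = 83
modified = 767  # utterances in which the cluster was modified or avoided

total = speakers * sentences_per_speaker
share = 100 * modified / total
print(total, f"{share:.1f}%")
```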

Error types

symbol   errors in corpus  error description
@        234               vowel insertion between the elements of a consonant cluster or at the end of a cluster that was expected to be pronounced word-finally
$        218               reducing the number of consonants in the cluster (from 3 to 2 or 1 and from 2 to 1)
?        154               unintelligible pronunciation
#        152               substitution of a consonant in a cluster by consonants not present in the expected cluster
%        119               substantial mispronunciations
.         24               pause insertion between the elements of a consonant cluster or at the end of a cluster that was expected to be pronounced word-finally
∅          4               deletion of the cluster
omitted    2               omission of the word
total    907
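A small tally, written as a sketch with an illustrative dictionary, confirms the total and shows each error type's share of all annotated errors. The shares are derived here and are not reported on the slide.

```python
# Error counts keyed by the annotation symbol used in the corpus.
errors = {
    "@": 234, "$": 218, "?": 154, "#": 152,
    "%": 119, ".": 24, "∅": 4, "omitted": 2,
}

total = sum(errors.values())
for symbol, n in errors.items():
    print(f"{symbol:8} {n:4d} {100 * n / total:5.1f}%")
print("total", total)
```

The total of 907 is larger than the 767 modified utterances, which would be consistent with some utterances carrying more than one error annotation; the slides do not state this explicitly.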

Summary statistics for six preferences

preference number   clusters it applies to   clusters that follow it   percentage
1                   17                       17                        100%
2                   13                       13                        100%
3                   38                       27                        71%
4                   5                        3                         60%
5                   5                        3                         60%
6                   5                        2                         40%
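The percentages in this table follow directly from the two count columns, as the sketch below shows; `ROWS` and `percentage` are illustrative names. Summing the columns gives 83 clusters of which 65 follow their preference, matching the overall counts reported later in the talk.

```python
# (preference number, clusters it applies to, clusters that follow it)
ROWS = [(1, 17, 17), (2, 13, 13), (3, 38, 27), (4, 5, 3), (5, 5, 3), (6, 5, 2)]


def percentage(follows, applies):
    """Share of clusters that follow a preference, in percent."""
    return 100 * follows / applies


for pref, applies, follows in ROWS:
    print(pref, applies, follows, f"{percentage(follows, applies):.0f}%")

total_applies = sum(applies for _, applies, _ in ROWS)
total_follows = sum(follows for _, _, follows in ROWS)
print(total_applies, total_follows,
      f"{percentage(total_follows, total_applies):.1f}%")
```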

Part 1 of the hypothesis

A degree of difficulty in pronouncing L2 clusters correlates with the universal characteristics of a given consonantal cluster.

The correlation between the number of errors students make when producing a cluster and the NAD parameters of its components can, to a certain degree, be illustrated by ranking the clusters by their NAD differences and by their difficulty.

Ranking of clusters can be performed first with respect to the NAD criterion and then with respect to linearly scaled percentage of clusters in which speakers made errors.

Although statistically not significant, the trend line indicates the expected direction of change and degree of slope between difficulty and NAD measure for finals.

[Figure: correlation for final double clusters. Series: NAD(VC) - NAD(CC), Incorrect, Linear (NAD(VC) - NAD(CC)), Linear (Incorrect). Trend lines: y = -0.0302x - 2.5577, R² = 0.007; y = -0.978x + 17.385, R² = 0.92]

Part 2 of the hypothesis: Linear Regression

The more preferred a cluster, the easier and less susceptible to modifications it is.

The complex mispronunciation error annotated in the corpus involved a combination of various other errors: epenthesis, substitution, metathesis and others.

There is a significant correlation between the NAD differences in a word-medial cluster and the frequency of the complex mispronunciation errors made in it by the speakers (P-value in the ANOVA = 0.0282; R-squared = 11.2148%).
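For readers who want to see how a slope and an R-squared value of this kind are obtained, here is a sketch of an ordinary least-squares fit with one predictor. It assumes that the reported model is a simple linear regression of error frequency on the NAD difference. The data points are made up for illustration and are NOT the study's measurements.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical values only: NAD(VC) - NAD(CC) per cluster and an
# invented number of complex mispronunciations for each cluster.
nad_diff = [-9, -6, -3, 0, 3, 6]
errors = [5, 2, 4, 3, 3, 1]

slope, intercept, r_squared = linear_fit(nad_diff, errors)
print(f"slope={slope:.3f} intercept={intercept:.3f} R-squared={100 * r_squared:.1f}%")
```

A negative slope in such a fit would mean fewer errors as the NAD difference grows. Whether the fit is significant is a separate question that the ANOVA p-value addresses.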

[Figure: Plot of Fitted Model, mispronunciation & medial clusters; x-axis: NAD_VC minus NAD_CC]

Part 2 of the hypothesis: Analysis of variance and median

• NAD(VC) - NAD(CC) turns out to have a statistically significant influence on the number of reduction errors students made in word-final clusters

[Figure: Means and 95.0 Percent LSD Intervals, reduction & word-final clusters; x-axis: VC minus CC le mdn[VC minus CC]; annotations: Difference↗, Preference↘]

ANOVA F = 11.86, p = 0.006; Kruskal-Wallis T = 7.46, p = 0.006

Part 3 of the hypothesis

NAD is expected to be a universal criterion, underlying the performance of all subjects.

If a child produces a consonant cluster different from the expected one, this new cluster will usually follow phonotactic preferences (grand mean of 79.7% compared to 78.3% for expected clusters).

Counts by word position:

– The number of all expected consonant clusters: all 83, initials 22, medials 43, finals 18
– %: initials 90.9, medials 67.4, finals 88.9
– The number of expected consonant clusters that followed phonotactic preferences: all 65, initials 20, medials 29, finals 16
– Total number of elicited consonant clusters: all 158, initials 22, medials 43, finals 18
– %: initials 53.6, medials 60.1, finals 58.3
– Total number of elicited consonant clusters that followed phonotactic preferences: all 126, initials 12, medials 26, finals 11
– The number of cases when there was a 0 in the expected cluster but more than 0 in the elicited cluster: all 10, initials 0, medials 0, finals 0
– The number of cases when both the expected and the elicited cluster were a 1: all 37, initials 9, medials 20, finals 8
– The number of cases when both the expected and the elicited cluster were a 0: all 7, initials 0, medials 7, finals 0
– The number of clusters for which no speaker produced a qualifiable utterance: all 22, initials 10, medials 14, finals 8

This suggests that phonotactic preferences underlie the performance of the subjects of various linguistic backgrounds and may be universal.

More research is necessary to show whether the speakers of different languages displayed significant differences in their following of the preferences.
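Some of the percentages for Part 3 can be re-derived from the counts given above, as in the following sketch (the dictionary names are illustrative). It covers only the rows whose percentages follow unambiguously from the counts shown. The per-position percentages for elicited clusters (53.6, 60.1, 58.3) are not recomputed, because the slide does not show which counts they are based on.

```python
positions = ["initials", "medials", "finals"]
expected_total = {"initials": 22, "medials": 43, "finals": 18}
expected_follow = {"initials": 20, "medials": 29, "finals": 16}

# Share of expected clusters that follow the preferences, per position.
for pos in positions:
    pct = 100 * expected_follow[pos] / expected_total[pos]
    print(pos, f"{pct:.1f}%")

# Grand means quoted in the text: expected vs. elicited clusters.
expected_all = 100 * 65 / 83
elicited_all = 100 * 126 / 158
print("expected, all:", f"{expected_all:.1f}%")
print("elicited, all:", f"{elicited_all:.1f}%")
```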

6. Preliminary conclusions

– universal phonotactic preferences guide speakers in producing SL clusters
– the scale of preference in the acquisition of a given type of cluster allows for fine-tuning of SL learning/teaching materials
– many aspects of the analysis remain to be explored
  – comparison with the L1’s of the subjects
  – data from further subjects
  – detailed analysis of the errors: which types of improvements are preferred

• fja hla t͡ʃja t͡ʃwa kja kra pla pwa ʃwa kma mja tla t͡ʃla sra lja mwa tna

• fja hla t͡ʃwa pwa ʃwa tna kja kma t͡ʃja t͡ʃla mwa tla sra lja mja pla kra

[Figure: correlation for initial double and triple clusters]

[Figure: correlation for initial double clusters. Series: NAD(CC) - NAD(CV), Incorrect, Linear (NAD(CC) - NAD(CV)), Linear (Incorrect). Trend lines: y = -0.2586x + 5.5919, R² = 0.9229; y = -0.1152x + 10.743, R² = 0.0399. Labelled items include t͡ʃja, t͡ʃwa, t͡ʃla]

[Figure: correlation for medial double clusters. Series: NAD(VC) - NAD(CC), Incorrect, Linear (NAD(VC) - NAD(CC)), Linear (Incorrect). Trend lines: y = -0.1754x + 4.2098, R² = 0.917; y = -0.0928x + 9.2304, R² = 0.0864]
