Measures of Variation Among English and American Dialects Robert Shackleton U.S. Congressional Budget Office


Page 1: Measures of Variation Among English and American Dialects

Measures of Variation Among English and American Dialects

Robert Shackleton
U.S. Congressional Budget Office

Page 2: Measures of Variation Among English and American Dialects

Goals

• Compare speech variants used by English and American speakers, using easily accessible data

• Use several different quantitative methods to assess variation among speakers

• Compare different quantitative methods

• Use results to gain some insight into English origins of American speech variants

Page 3: Measures of Variation Among English and American Dialects

Data

• Nearly all data from Kurath & McDavid’s Pronunciation of English in the Atlantic States; some from Kurath’s Dialect Structure of Southern England

• All or nearly all data collected by Guy Lowman

• 82 phonemes classified into 285 variants by Kurath and McDavid

Page 4: Measures of Variation Among English and American Dialects

Data

• Four regions
  – Southern England (59 informants); settled <700
  – Southeastern Massachusetts (22 informants); settled <1650
  – S.E. Virginia / N.E. North Carolina (EVNC; 31 informants); settled ~<1690
  – S.W. Virginia / S. West Virginia (SWVA; 19 informants); settled ~1750-1800

• Informants largely older, rural, long-settled families

• In some cases, more than one variant per informant

• Some missing data

• Some data arbitrarily attributed to one of two or three possible informants in a given locality

Page 5: Measures of Variation Among English and American Dialects

Methods

• Shared variants: based on proportion of variants shared between two speakers

• Genetic distance: based on relative frequencies of variants, treating variants of a given phoneme as analogous with alleles of a given gene

• Linguistic distance: measured as a Euclidean distance between variants in an idealized geometric grid (e.g. ² and e are closer to each other than i and Þ)

• Each measure involves arbitrary assumptions
  – Choice of phonemes to include
  – Classification of responses into variants
  – Quantification of distances among variants

• Important difference: first two approaches assume that variants are discrete; linguistic approach does not
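The shared-variants measure can be sketched in a few lines. The slides do not say what the proportion is taken over, so this sketch assumes the shared count is divided by the average number of variants the two speakers use (a Dice-style ratio). The variant labels are hypothetical.

```python
def shared_variants(a: set[str], b: set[str]) -> float:
    """Proportion of variants shared by two speakers.

    Assumption (not stated in the slides): the denominator is the
    average number of variants recorded for the two speakers.
    """
    if not a and not b:
        raise ValueError("both speakers have no recorded variants")
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical speakers, one variant for each of four words.
speaker_1 = {"boiled:ai", "five:ai", "out:au", "car:r"}
speaker_2 = {"boiled:oi", "five:ai", "out:eu", "car:r"}
print(shared_variants(speaker_1, speaker_2))
```

When both speakers use the same number of variants, this reduces to the plain share of one speaker's variants that the other also uses.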

Page 6: Measures of Variation Among English and American Dialects

Genetic Approach

• Nei's genetic distance D measures how closely related populations of pronunciation patterns are if:
  – Change is always to a completely new variant
  – All phonemes have the same rate of change
  – Population sizes remain constant over time

• Occurrence of variant = 1; absence = 0

• Occasionally, frequency of variant in a set of similar words (0 < x < 1)

• In some cases, more than one variant per speaker

• Each informant represented by a vector of 285 numbers, each between 0 and 1

• In this sample:
  – D ranges from 0.00 to 1.70
  – 50% shared pronunciations => D = 0.7
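A minimal sketch of the genetic distance, assuming the standard form of Nei's distance, D = -ln(I), where I is the normalized identity between two frequency vectors. Whether the presentation used exactly this form, or normalized frequencies within each phoneme first, is not stated; the 8-element vectors are toy stand-ins for the 285-element vectors described above.

```python
import math


def nei_distance(x: list[float], y: list[float]) -> float:
    """Nei's standard genetic distance between two variant-frequency vectors.

    I = sum(x*y) / sqrt(sum(x*x) * sum(y*y)),  D = -ln(I).
    Averaging over phonemes (loci) cancels out of the ratio when both
    vectors cover the same phonemes, so flat vectors are used here.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    jxy = sum(a * b for a, b in zip(x, y))
    if jxy == 0:
        return math.inf  # nothing in common: distance is unbounded
    jx = sum(a * a for a in x)
    jy = sum(b * b for b in y)
    # Clamp at zero to absorb floating-point rounding for identical vectors.
    return max(0.0, -math.log(jxy / math.sqrt(jx * jy)))


# Toy data: four phonemes, two variants each, one variant used per phoneme.
speaker_a = [1, 0, 1, 0, 1, 0, 1, 0]
speaker_b = [1, 0, 1, 0, 0, 1, 0, 1]  # agrees with speaker_a on two phonemes
print(round(nei_distance(speaker_a, speaker_b), 3))
```

With half of the variants shared, I = 0.5 and D = ln 2, which rounds to the 0.7 quoted for this sample.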

Page 7: Measures of Variation Among English and American Dialects

Linguistic Approach

• Variants are characterized by a set of numbers representing degrees of height, backing, rounding, rhoticity, length

Vowel grid (rows: height; columns: backing 1 to 3)

Height   Unrounded            Rounded
         1      2      3      1      2      3
7        j
6        i      i¿     ©      y      u¿     u
5        ±      ±¿     ¦      ¡      ¹¿     ¹
4        e                                  o
3.5             °                           o¿
3        ²      ³      ¸             À
2        á      ¥                           ∙
1        a      µ      Þ                    ¶

Rhoticity:  1 = °   2 = ´   3 = r

Variant  Height  Backing  Rounding  Rhoticity
j        7       1        1         1
u        6       3        2         1
©        6       3        1         1
u¿       6       2        2         1
i¿       6       2        1         1
i        6       1        1         1
y        6       1        2         1
¹        5       3        2         1
¦        5       3        1         1
¹¿       5       2        2         1
±¿       5       2        1         1
¡        5       1        2         1
±        5       1        1         1
o        4       3        2         1
e        4       1        1         1
o¿       3.5     3        2         1
r        3.5     2        1         3
´        3.5     2        1         2
°        3.5     2        1         1
¸        3       3        1         1
À        3       2        2         1
³        3       2        1         1
²        3       1        1         1
∙        2       3        2         1
¥        2       2        1         1
á        2       1        1         1
¶        1       3        2         1
Þ        1       3        1         1
µ        1       2        1         1
a        1       1        1         1

Page 8: Measures of Variation Among English and American Dialects

Linguistic Approach

• Difference between variants measured as Euclidean distance

• Distance between two speakers LD measured as the average Euclidean distance between their variants

• Could also measure the dispersion of distances, etc.

• In this sample:
  – LD ranges from 0.00 to 1.68
  – 50% shared variants => LD = 0.70 to 1.16

Variant                                      Height1  Back1  Round1  Rhotic1  Height2  Back2  Round2  Rhotic2
Map 144: ai in boiled                        1.00     1.00   1.00    1.00     6.00     1.00   1.00    1.00
Map 144: µ±¿ ~ Þ±¿ ~ ¥±¿ in boiled           2.33     1.33   1.00    1.00     5.00     2.00   1.00    1.00
Map 144: ¸± in boiled                        3.00     3.00   1.00    1.00     5.00     1.00   1.00    1.00
Map 144: variant of o± etc. in boiled        4.00     3.00   2.00    1.00     5.00     1.00   1.00    1.00
Map 144: ∙i in boiled                        2.00     3.00   2.00    1.00     6.00     1.00   1.00    1.00
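The linguistic distance can be sketched as follows. The feature tuples for a, e, i, o and u are copied from the feature table in these slides; the speakers, the phoneme keys, and the treatment of a diphthong as two concatenated feature tuples (as in the Height1 ... Rhotic2 layout) are illustrative assumptions. Any scaling or weighting the presentation applied is unknown and is not reproduced here.

```python
import math

# (height, backing, rounding, rhoticity), from the slides' feature table.
FEATURES = {
    "a": (1, 1, 1, 1),
    "e": (4, 1, 1, 1),
    "i": (6, 1, 1, 1),
    "o": (4, 3, 2, 1),
    "u": (6, 3, 2, 1),
}


def variant_distance(v1: tuple, v2: tuple) -> float:
    """Euclidean distance between two feature tuples of equal length."""
    if len(v1) != len(v2):
        raise ValueError("feature tuples must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(v1, v2)))


def linguistic_distance(speaker_a: dict, speaker_b: dict) -> float:
    """Average variant distance over the phonemes recorded for both speakers."""
    common = speaker_a.keys() & speaker_b.keys()
    if not common:
        raise ValueError("speakers share no phonemes")
    total = sum(variant_distance(speaker_a[p], speaker_b[p]) for p in common)
    return total / len(common)


# Hypothetical speakers: a diphthong is two vowel tuples joined end to end.
ai = FEATURES["a"] + FEATURES["i"]
oi = FEATURES["o"] + FEATURES["i"]
speaker_a = {"boiled": ai, "bed": FEATURES["e"]}
speaker_b = {"boiled": oi, "bed": FEATURES["e"]}
print(linguistic_distance(speaker_a, speaker_b))
```

Because the distance is graded rather than all-or-nothing, two speakers who share no variant for a word can still come out close if their variants sit near each other on the grid.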

Page 9: Measures of Variation Among English and American Dialects

Cluster Analysis

• Methods of grouping informants on the basis of similarity of their speech patterns

• Many different approaches
  – Different measures of similarity: Pearson correlations, Euclidean distances, cosines, genetic or linguistic distances
  – Different methods of grouping similar observations into clusters: single, average, and complete linkages, various algorithms for estimating phylogenetic relationships

• Results highly dependent on approach
  – English speakers tend to group into five regions (East Midlands, East Anglia, Southeast, Southwest, Devonshire)
  – North American regions tend to be distinct, and to cluster most closely with Southeast England
  – EVNC and SWVA consistently cluster together
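One of the linkage methods named above, average linkage, can be sketched in plain Python. This is an illustration only: the presentation clustered individual informants, whereas the toy matrix here uses the between-region average Nei distances reported in these slides for four regions, and the function name and interface are made up for the example.

```python
def average_linkage(dist: list[list[float]], n_clusters: int) -> list[list[int]]:
    """Agglomerative clustering with average linkage.

    dist is a symmetric distance matrix; the result is a sorted list of
    clusters, each a sorted list of row indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def between(c1: list[int], c2: list[int]) -> float:
        # Mean of all pairwise distances between members of the two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = between(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Rows and columns: 0 = SE, 1 = MA, 2 = EVNC, 3 = SWVA.
DIST = [
    [0.000, 0.853, 0.986, 0.943],
    [0.853, 0.000, 0.803, 0.853],
    [0.986, 0.803, 0.000, 0.424],
    [0.943, 0.853, 0.424, 0.000],
]
print(average_linkage(DIST, 3))
print(average_linkage(DIST, 2))
```

Swapping the `between` helper for a minimum or maximum over pairs would give single or complete linkage, which is one way the grouping can change with the approach.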

Page 10: Measures of Variation Among English and American Dialects

Results

• Distance measures are generally correlated

• Nei’s distance and shared variants are very similar, despite nonlinearity

• Linguistic distance is least similar; it contains different information about similarity of speech forms

[Figure: three plots]
• Nei's Distance vs. Linguistic Distance (x-axis: Linguistic Distance, 0.0 to 1.8; y-axis: Nei's Distance, 0.0 to 1.8)
• Shared Variants vs. Linguistic Distance (x-axis: Linguistic Distance, 0.0 to 1.8; y-axis: Shared Variants, 0% to 100%)
• Nei's Distance vs. Shared Variants (x-axis: Shared Variants, 0% to 100%; y-axis: Nei's Distance, 0.0 to 1.8)
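How closely two distance measures track each other can be checked with a Pearson correlation over the same set of pairs. The sketch below is an illustration, not a reproduction of the presentation's result: it uses only five between-region averages taken from the regional comparison tables in these slides, whereas the plots compare individual informants.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Region pairs: EM-EA, EM-SE, EM-MA, EVNC-SWVA, MA-EVNC.
nei = [0.724, 0.723, 0.824, 0.424, 0.803]
linguistic = [0.981, 0.904, 1.001, 0.634, 0.942]
print(pearson(nei, linguistic))
```

Pearson correlation only captures linear association, so a monotonic but curved relationship (such as the one between Nei's distance and shared variants) would be understated by it.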

Page 11: Measures of Variation Among English and American Dialects

Shared Variants

[Figure; legend: East Midlands, East Anglia, Southeast, Southwest, Devonshire, Massachusetts, EVNC, SWVA]

Page 12: Measures of Variation Among English and American Dialects

Nei’s Genetic Distance

[Figure; legend: East Midlands, East Anglia, Southeast, Southwest, Devonshire, Massachusetts, EVNC, SWVA]

Page 13: Measures of Variation Among English and American Dialects

Linguistic Distance

[Figure; legend: East Midlands, East Anglia, Southeast, Southwest, Devonshire, Massachusetts, EVNC, SWVA]

Page 14: Measures of Variation Among English and American Dialects

Distribution of Variants

• Some variants are widespread; others not
  – 12% appear in all 8 regions
  – 29% appear in 7 regions
  – 42% appear in 6 regions
  – 59% appear in 5 regions

• Even within regions, lots of variation
  – Informants in a given region typically share 60% to 75% of variants, but range is 33% to 90%
  – Degree of variation reflected in genetic and linguistic distance measures

• Southern England
  – More diversity than in North America
  – 91% of variants found somewhere
  – 23% found in every region
  – 20% found only in southern England
  – Shared variants between English informants: 22% to 83%
  – Shared variants between English and American informants: 18% to 63%
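Counts of this kind can be derived from a table recording which regions each variant was found in. The sketch below assumes the percentages for 7, 6 and 5 regions are cumulative ("in at least k regions"), which the rising figures suggest but the slide does not state; the variants and their regions are made up.

```python
def share_in_at_least(presence: dict[str, set[str]], k: int) -> float:
    """Fraction of variants attested in at least k regions."""
    if not presence:
        raise ValueError("no variants supplied")
    hits = sum(1 for regions in presence.values() if len(regions) >= k)
    return hits / len(presence)


# Hypothetical variants mapped to the regions where they were recorded.
presence = {
    "v1": {"EM", "EA", "MA"},
    "v2": {"EM"},
    "v3": {"EM", "EA"},
    "v4": {"MA", "EVNC", "SWVA", "DV"},
}
for k in (1, 2, 3, 4):
    print(k, share_in_at_least(presence, k))
```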

Page 15: Measures of Variation Among English and American Dialects

Distribution of Variants

• North American regions
  – Less diversity than in England: 22% of southern English variants absent
  – 80% of variants found somewhere
  – 37% found in every region
  – 9% found only in North America (12% of North American variants)
  – Nearly half of American “innovations” shared across all N. American regions
  – Many “innovations” are known to have existed in southern England, but were not recorded

• North American distribution of southern English variants
  – Slightly greater frequency of eastern (esp. southeastern) English variants in American regions
  – Of 41 variants found in eastern but not in western England, 14 (34%) appear in Massachusetts and in the South, 7 (17%) in Massachusetts but not in the South, 13 (32%) in the South but not in Massachusetts
  – Of 33 variants found in western but not in eastern England, 5 (15%) appear in Massachusetts and in the South, 2 (6%) in Massachusetts but not in the South, 11 (33%) in the South but not in Massachusetts

Page 16: Measures of Variation Among English and American Dialects

Distribution of Variants

• Massachusetts
  – Both more and fewer shared variants with English informants than the South
  – On average, more shared variants with the South than with English informants
  – By all measures, MA informants show somewhat greater affinity with eastern English

• The South
  – EVNC and SWVA comparatively homogeneous and similar
  – Similar intra- and interregional variation
  – Similar variation with MA and England
  – Slightly greater affinity with western English than MA
  – Southern American informants have greatest number of shared variants with Devonshire informants, but lowest linguistic distance with southeastern English informants

• Can illustrate differences using average values, or values for “typical” informants who have the greatest average number of shared variants or lowest average distance with all other speakers in region
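Picking a "typical" informant by lowest average distance amounts to finding the medoid of a region. A minimal sketch, with hypothetical informant codes and distances:

```python
def typical_informant(names: list[str], dist: list[list[float]]) -> tuple[str, float]:
    """Return the informant with the lowest average distance to the others.

    dist is a symmetric matrix of pairwise distances within one region,
    in the same order as names.
    """
    if len(names) < 2:
        raise ValueError("need at least two informants")
    best_name, best_avg = None, None
    for i, name in enumerate(names):
        others = [dist[i][j] for j in range(len(names)) if j != i]
        avg = sum(others) / len(others)
        if best_avg is None or avg < best_avg:
            best_name, best_avg = name, avg
    return best_name, best_avg


# Hypothetical region with three informants.
names = ["X1", "X2", "X3"]
dist = [
    [0.0, 0.4, 0.6],
    [0.4, 0.0, 0.5],
    [0.6, 0.5, 0.0],
]
print(typical_informant(names, dist))
```

For a similarity measure such as shared variants, the same loop applies with the comparison reversed, keeping the informant with the greatest average.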

Page 17: Measures of Variation Among English and American Dialects

Regional Comparison: Averages

EM = East Midlands, EA = East Anglia, SE = Southeast, SW = Southwest, DV = Devonshire, MA = Massachusetts. Each column gives the within-region average for the region in its heading, then the other regions ranked from most to least similar.

Shared Variants

EM          EA          SE          SW          DV          MA          EVNC        SWVA
EM 62%      EA 65%      SE 64%      SW 62%      DV 77%      MA 71%      EVNC 71%    SWVA 74%
EA 49%      EM 49%      EM 50%      DV 48%      SW 49%      EVNC 45%    SWVA 66%    EVNC 66%
SE 48%      MA 40%      SW 47%      SE 45%      SE 43%      EM 44%      MA 45%      MA 43%
MA 44%      SE 40%      MA 44%      SWVA 37%    SWVA 41%    SWVA 43%    EA 39%      DV 40%
DV 39%      EVNC 38%    DV 43%      EM 36%      EM 40%      SE 42%      EM 38%      SE 39%
EVNC 37%    SWVA 37%    EA 42%      EA 36%      MA 37%      EA 41%      SE 37%      EA 38%
SW 36%      SW 36%      SWVA 40%    EVNC 34%    EA 36%      DV 36%      SW 34%      SW 37%
SWVA 35%    DV 34%      EVNC 38%    MA 31%      EVNC 34%    SW 31%      DV 33%      EM 36%

Nei's Distance

EM          EA          SE          SW          DV          MA          EVNC        SWVA
EM 0.511    EA 0.454    SE 0.471    SW 0.493    DV 0.262    MA 0.347    EVNC 0.348  SWVA 0.302
SE 0.723    EM 0.724    EM 0.723    DV 0.734    SW 0.734    EVNC 0.803  SWVA 0.424  EVNC 0.424
EA 0.724    SE 0.907    SW 0.791    SE 0.791    SE 0.851    EM 0.824    MA 0.803    MA 0.853
MA 0.824    MA 0.917    DV 0.851    SWVA 0.993  SWVA 0.903  SWVA 0.853  EA 0.961    DV 0.903
DV 0.932    EVNC 0.961  MA 0.853    EA 1.033    EM 0.932    SE 0.853    SE 0.986    SE 0.943
EVNC 0.989  SWVA 0.989  EA 0.907    EM 1.035    MA 1.009    EA 0.917    EM 0.989    EA 0.989
SW 1.035    SW 1.033    SWVA 0.943  EVNC 1.097  EA 1.070    DV 1.009    DV 1.095    SW 0.993
SWVA 1.047  DV 1.070    EVNC 0.986  MA 1.201    EVNC 1.095  SW 1.201    SW 1.097    EM 1.047

Linguistic Distance

EM          EA          SE          SW          DV          MA          EVNC        SWVA
EM 0.726    EA 0.674    SE 0.632    SW 0.768    DV 0.385    MA 0.476    EVNC 0.515  SWVA 0.504
SE 0.904    EM 0.981    EM 0.904    SE 1.013    SW 1.034    EVNC 0.942  SWVA 0.634  EVNC 0.634
EA 0.981    SE 1.074    MA 0.963    DV 1.034    SE 1.070    SE 0.963    MA 0.942    MA 0.987
MA 1.001    MA 1.116    SW 1.013    EM 1.217    SWVA 1.135  SWVA 0.987  SE 1.125    SE 1.123
EVNC 1.133  DV 1.181    DV 1.070    EA 1.221    EM 1.142    EM 1.001    EM 1.133    DV 1.135
DV 1.142    EVNC 1.211  EA 1.074    SWVA 1.224  EA 1.181    EA 1.116    EA 1.211    EM 1.184
SWVA 1.184  SWVA 1.215  SWVA 1.123  EVNC 1.271  MA 1.253    DV 1.253    DV 1.257    EA 1.215
SW 1.217    SW 1.221    EVNC 1.125  MA 1.331    EVNC 1.257  SW 1.331    SW 1.271    SW 1.224

Page 18: Measures of Variation Among English and American Dialects

Regional Comparison: “Typical” Informants

Shared Variants

Rt5         Sf25        Sr44        Ox84        Dv68        MA11        NC2B        V75A
Rt5 100%    Sf25 100%   Sr44 100%   Ox84 100%   Dv68 100%   MA11 100%   NC2B 100%   V75A 100%
MA11 57%    Rt5 53%     Rt5 60%     Dv68 47%    Ox84 50%    Rt5 57%     V75A 65%    NC2B 59%
Sr44 54%    MA11 50%    MA11 51%    Sr44 45%    V75A 42%    Sf25 49%    MA11 51%    MA11 44%
Sf25 52%    V75A 39%    Ox84 51%    Rt5 39%     Rt5 37%     Sr44 46%    Sf25 40%    Dv68 41%
Ox84 40%    NC2B 36%    Sf25 39%    V75A 37%    MA11 36%    NC2B 46%    Ox84 38%    Sf25 39%
Dv68 36%    Sr44 35%    NC2B 36%    MA11 35%    Sf25 35%    V75A 43%    Dv68 37%    Ox84 38%
NC2B 31%    Ox84 34%    Dv68 36%    Sf25 33%    NC2B 34%    Ox84 36%    Sr44 37%    Sr44 32%
V75A 30%    Dv68 34%    V75A 35%    NC2B 33%    Sr44 33%    Dv68 35%    Rt5 35%     Rt5 30%

Nei's Distance

Rt5         Sf26        Kt46        Ox84        Dv68        MA13        NC5A        V72B
Rt5 0.00    Sf26 0.00   Kt46 0.00   Ox84 0.00   Dv68 0.00   MA13 0.00   NC5A 0.00   V72B 0.00
MA13 0.62   Rt5 0.67    MA13 0.64   Dv68 0.72   Ox84 0.72   Rt5 0.62    V72B 0.46   NC5A 0.46
Kt46 0.64   NC5A 0.89   Rt5 0.64    Kt46 0.75   V72B 0.87   Kt46 0.64   MA13 0.74   Dv68 0.87
Sf26 0.67   MA13 0.98   Ox84 0.75   Rt5 0.94    Kt46 1.00   NC5A 0.74   Sf26 0.89   Kt46 0.93
Ox84 0.94   V72B 0.99   V72B 0.93   V72B 0.98   Rt5 1.01    Sf26 0.98   Rt5 0.97    Ox84 0.98
NC5A 0.97   Kt46 0.99   NC5A 0.99   MA13 1.08   MA13 1.04   V72B 1.01   Kt46 0.99   Sf26 0.99
Dv68 1.01   Ox84 1.15   Sf26 0.99   NC5A 1.15   NC5A 1.06   Dv68 1.04   Dv68 1.06   MA13 1.01
V72B 1.15   Dv68 1.42   Dv68 1.00   Sf26 1.15   Sf26 1.42   Ox84 1.08   Ox84 1.15   Rt5 1.15

Linguistic Distance

Li2         Sf25        Sr44        Ox84        Dv68        MA11        NC4B        V67A
Li2 0.01    Sf25 0.00   Sr44 0.03   Ox84 0.07   Dv68 0.05   MA11 0.00   NC4B 0.03   V67A 0.02
MA11 0.62   Li2 0.83    Li2 0.66    Sr44 0.92   Ox84 1.02   Li2 0.62    V67A 0.57   NC4B 0.57
Sr44 0.66   MA11 0.95   MA11 0.76   Dv68 1.02   Sr44 1.08   Sr44 0.76   MA11 0.78   MA11 0.94
Sf25 0.83   Sr44 1.01   Ox84 0.92   Sf25 1.25   Sf25 1.08   NC4B 0.78   Li2 0.96    Sr44 0.94
NC4B 0.96   Dv68 1.08   V67A 0.94   MA11 1.28   Li2 1.11    V67A 0.94   Sr44 1.05   Li2 1.08
V67A 1.08   V67A 1.18   Sf25 1.01   Li2 1.28    V67A 1.15   Sf25 0.95   Sf25 1.21   Dv68 1.15
Dv68 1.11   NC4B 1.21   NC4B 1.05   V67A 1.28   MA11 1.24   Dv68 1.24   Dv68 1.30   Sf25 1.18
Ox84 1.28   Ox84 1.25   Dv68 1.08   NC4B 1.43   NC4B 1.30   Ox84 1.28   Ox84 1.43   Ox84 1.28

Page 19: Measures of Variation Among English and American Dialects

Conclusions

• Different measures yield somewhat different, complementary insights into linguistic variation

• By all measures, extensive variation in and among regions

• Patterns of variation (increasing in population and age of settlement) are reminiscent of species-area relationship

• American settlement resulted in lower variation in American regions, leveling, and somewhat different populations of variants in different regions
  – Slightly dominant influence from the metropolitan area
  – Greater eastern influence in the north, western influence in the south
  – Relatively little innovation
  – Leveling process analogous to loss of species during reduction in habitat

• Results are largely consistent with the historical record of early English immigration to North America (except for absence of East Anglian influence in Massachusetts)