
A STATISTICAL COMPARISON OF VERBS AND NOUNS IN RƠGLAI¹

MAXWELL COBBEY

Word counts were made of four Rơglai texts, distinguishing between (1) noun-words, (2) verb-words, and (3) all other words. All words were classified contextually as to grammatical class, so that, for instance, noun-words included adjectives, classifiers, quantifiers, etc., in addition to nouns and pronouns, whenever they functioned endocentrically with nouns, or as noun substitutes. However, in another place an adjective might be counted as a verb-word because of its verbal function in that context.

1. WORD AND CLUSTER COUNTS

Besides this word count of grammatical types, a count was also made of the number of clusters of each of the three types, a cluster being defined as a consecutive string of one or more words of the same grammatical type. These counts of word types and cluster types are given in Table I.

TABLE I
Word and Cluster Counts

                        Word Count                    Cluster Count
Name of Text    Nouns  Verbs  Other  Total    Nouns  Verbs  Other  Total
History          1512    753    394   2659      603    601    305   1509
Monkey            156    137     57    350       92     91     42    225
Eagle             124     78     45    247       68     42     42    152
Feast             325    209     66    600      173    153     63    389
Totals           2117   1177    562   3856      936    887    452   2275
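The cluster definition used for these counts (a consecutive run of one or more words of the same grammatical type) can be sketched in a few lines. This is only an illustration: the labels "N", "V" and "O", the function name, and the sample tag sequence are assumptions made here, not part of the original study, which classified words by hand.

```python
from collections import Counter
from itertools import groupby


def count_words_and_clusters(tags):
    """Count words and clusters per grammatical type.

    `tags` holds one label per word, in text order. The labels are
    hypothetical ("N" noun-word, "V" verb-word, "O" other).
    """
    word_counts = Counter(tags)
    # groupby merges consecutive equal labels, so each group is one cluster.
    cluster_counts = Counter(label for label, _run in groupby(tags))
    return word_counts, cluster_counts


# Invented tag sequence for illustration only (not taken from a Rơglai text).
tags = ["N", "N", "V", "O", "N", "N", "N", "V", "V"]
words, clusters = count_words_and_clusters(tags)
print(dict(words))     # {'N': 5, 'V': 3, 'O': 1}
print(dict(clusters))  # {'N': 2, 'V': 2, 'O': 1}
```

Applied to a fully tagged text, the two counters would correspond to one row of the word-count and cluster-count halves of Table I.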

Cobbey, M. "A Statistical Comparison of Verbs and Nouns in Rơglai". In Nguyễn Đ.L. editor, Southeast Asian linguistic studies, Vol. 4. C-49:207-212. Pacific Linguistics, The Australian National University, 1979. DOI:10.15144/PL-C49.207 ©1979 Pacific Linguistics and/or the author(s). Online edition licensed 2015 CC BY-SA 4.0, with permission of PL. A sealang.net/CRCL initiative.


By making calculations based on the counts listed in Table I, various comparisons were made (a) between the four Rơglai texts, and (b) between the grammatical types.

2. RELATIVE NUMBER OF WORDS WITHIN TEXTS²

Noun-words are more numerous than any other type in all the texts; and two of the texts each consist of more than half noun-words, as seen in Table II. I feel that these two texts, History and Feast, are by far the two best texts from a literary standpoint.

TABLE II
Relative Number of Words within Texts

Text        Noun   Verb   Other   Total
History     0.57   0.28   0.15    1.00
Monkey      0.45   0.39   0.16    1.00
Eagle       0.50   0.32   0.18    1.00
Feast       0.54   0.35   0.11    1.00
Average³    0.55   0.31   0.15    1.00
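The proportions in Table II, and the way its average row is pooled from the raw counts rather than averaged from the per-text ratios, can be checked with a short sketch. The variable and function names are illustrative assumptions; the counts are those of Table I.

```python
# Word counts from Table I as (noun, verb, other) per text.
WORDS = {
    "History": (1512, 753, 394),
    "Monkey": (156, 137, 57),
    "Eagle": (124, 78, 45),
    "Feast": (325, 209, 66),
}


def proportions(counts):
    """Divide each count by the row total."""
    total = sum(counts)
    return tuple(c / total for c in counts)


relative = {text: proportions(counts) for text, counts in WORDS.items()}

# Pooled average: sum the counts over all texts first, then divide.
pooled_counts = tuple(sum(column) for column in zip(*WORDS.values()))
pooled = proportions(pooled_counts)

# Mean of the four per-text ratios, shown only for contrast.
naive = tuple(sum(column) / len(relative) for column in zip(*relative.values()))

for text, row in relative.items():
    print(text, [f"{x:.2f}" for x in row])
print("pooled noun share:", f"{pooled[0]:.2f}")
print("mean of per-text noun shares:", f"{naive[0]:.2f}")
```

Because the History text supplies well over half of all words, the pooled figure sits closer to the History ratio than a simple mean of the four ratios would.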

3. RELATIVE NUMBER OF CLUSTERS WITHIN TEXTS

A comparison of the number of clusters of each grammatical type within the texts (Table III) shows that for two of the texts, the History and the Monkey, there are about the same number of noun and verb clusters. These two texts contain more action, whereas the Eagle and the Feast are more descriptive, and consequently show more noun clusters than verb clusters.

TABLE III
Relative Number of Clusters within Texts

Text        Noun   Verb   Other   Total
History     0.40   0.40   0.20    1.00
Monkey      0.41   0.40   0.19    1.00
Eagle       0.45   0.28   0.28    1.00
Feast       0.45   0.39   0.16    1.00
Average³    0.41   0.39   0.20    1.00


4. CLUSTER LENGTHS

From Table IV below it is seen that noun clusters are longer on the average than verb clusters. The two best literary texts (History and Feast) are seen to have the greatest difference between noun cluster length and verb cluster length.

TABLE IV
Words per Cluster

            Noun      Verb      Other
Text        Cluster   Cluster   Cluster   Average
History     2.51      1.25      1.29      1.76
Monkey      1.70      1.51      1.36      1.56
Eagle       1.82      1.86      1.07      1.62
Feast       1.88      1.37      1.04      1.54
Average³    2.26      1.33      1.24      1.69

5. RELATIVE CLUSTER LENGTHS WITHIN TEXTS

The relative cluster length of each grammatical type within a text can be seen in Table V, which is obtained by dividing the cluster lengths (of Table IV) by the average cluster length of a text (the last column of Table IV). Here it is seen that the two best literary texts have the widest range of relative cluster length between grammatical types.

TABLE V
Relative Cluster Lengths within Texts

Text        Noun   Verb   Other   Range⁴
History     1.42   0.71   0.73    0.71
Monkey      1.09   0.97   0.87    0.22
Eagle       1.12   1.14   0.66    0.48
Feast       1.22   0.89   0.68    0.54
Average³    1.33   0.78   0.73    0.60
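The two steps behind Tables IV and V (words per cluster, then division by the text's average cluster length) can be sketched as follows. The function names are illustrative assumptions. The sketch works from the raw counts of Table I, so its results may differ from the printed tables in the last decimal place, since the tables appear to have been derived from already rounded figures.

```python
# Counts from Table I as (noun, verb, other) per text.
WORDS = {
    "History": (1512, 753, 394),
    "Monkey": (156, 137, 57),
    "Eagle": (124, 78, 45),
    "Feast": (325, 209, 66),
}
CLUSTERS = {
    "History": (603, 601, 305),
    "Monkey": (92, 91, 42),
    "Eagle": (68, 42, 42),
    "Feast": (173, 153, 63),
}


def words_per_cluster(words, clusters):
    """Per-type cluster lengths and the text-wide average (as in Table IV)."""
    per_type = tuple(w / c for w, c in zip(words, clusters))
    return per_type, sum(words) / sum(clusters)


def relative_cluster_lengths(words, clusters):
    """Per-type lengths divided by the text average, plus their range (as in Table V)."""
    per_type, average = words_per_cluster(words, clusters)
    relative = tuple(length / average for length in per_type)
    return relative, max(relative) - min(relative)


for text in WORDS:
    relative, spread = relative_cluster_lengths(WORDS[text], CLUSTERS[text])
    print(text, [f"{r:.2f}" for r in relative], f"range={spread:.2f}")
```

The range is simply the largest relative length minus the smallest within one text, which is why a text whose three types have similar cluster lengths, such as the Monkey text, scores low.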

6. RELATIVE CLUSTER LENGTHS BETWEEN TEXTS

A comparison of cluster lengths between texts for a given grammatical type was made by dividing the cluster length of each text by the average cluster length for the type, as in Table VI. Verbs are seen to have the widest range of relative cluster lengths between the texts.
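The between-text comparison just described can be sketched in the same way. Here the average cluster length for a type is assumed to be the pooled figure (total words of that type over total clusters of that type, across all four texts), and the names are illustrative rather than taken from the paper.

```python
# Counts from Table I as (noun, verb, other) per text.
WORDS = {
    "History": (1512, 753, 394),
    "Monkey": (156, 137, 57),
    "Eagle": (124, 78, 45),
    "Feast": (325, 209, 66),
}
CLUSTERS = {
    "History": (603, 601, 305),
    "Monkey": (92, 91, 42),
    "Eagle": (68, 42, 42),
    "Feast": (173, 153, 63),
}
TYPES = ("noun", "verb", "other")


def between_text_lengths(words, clusters):
    """Cluster length of each text relative to the pooled length for the type."""
    indices = range(len(TYPES))
    # Pooled words per cluster for each type across all texts.
    pooled = [
        sum(w[i] for w in words.values()) / sum(c[i] for c in clusters.values())
        for i in indices
    ]
    table = {
        text: tuple(words[text][i] / clusters[text][i] / pooled[i] for i in indices)
        for text in words
    }
    # Range per type: largest minus smallest relative length over the texts.
    ranges = tuple(
        max(row[i] for row in table.values()) - min(row[i] for row in table.values())
        for i in indices
    )
    return table, ranges


table, ranges = between_text_lengths(WORDS, CLUSTERS)
widest = TYPES[ranges.index(max(ranges))]
print({name: f"{value:.2f}" for name, value in zip(TYPES, ranges)})
print("widest range:", widest)
```

Taking the range per type rather than per text is what distinguishes this comparison from the within-text one.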


TABLE VI
Relative Cluster Length between Texts

Text        Noun   Verb   Other   Average³
History     1.11   0.94   1.04    1.04
Monkey      0.75   1.13   1.09    0.92
Eagle       0.81   1.40   0.86    0.96
Feast       0.83   1.03   0.84    0.91
Range⁴      0.36   0.46   0.25    0.13

7. DENSITY OF GRAMMATICAL TYPES

An empirical density coefficient was calculated for each cluster type within each text. This was obtained by multiplying the entries of Table V by the entries of Table II. The greater the relative cluster length of a grammatical type, the greater its coefficient of density. And the greater the relative number of words of that type in a text, the greater the coefficient of density.⁵
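As a hedged sketch of this calculation, the product of relative cluster length and relative word number can be computed from the raw counts and compared with an algebraically equivalent form that uses the counts alone. The function names are assumptions, and the Monkey counts from Table I serve as the example.

```python
import math


def density(words, clusters):
    """Density per type: relative cluster length times relative word number."""
    total_words, total_clusters = sum(words), sum(clusters)
    average_length = total_words / total_clusters
    return tuple(
        ((w / c) / average_length) * (w / total_words)
        for w, c in zip(words, clusters)
    )


def density_direct(words, clusters):
    """The same quantity expressed with raw counts only."""
    total_words, total_clusters = sum(words), sum(clusters)
    return tuple(
        (w ** 2 * total_clusters) / (total_words ** 2 * c)
        for w, c in zip(words, clusters)
    )


# Monkey text, Table I: (noun, verb, other) word and cluster counts.
monkey_words, monkey_clusters = (156, 137, 57), (92, 91, 42)
product = density(monkey_words, monkey_clusters)
direct = density_direct(monkey_words, monkey_clusters)
assert all(math.isclose(p, d) for p, d in zip(product, direct))
print([f"{x:.2f}" for x in product])
```

The squared word count in the direct form shows why a type that is both frequent and clustered into long runs receives a disproportionately high coefficient.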

TABLE VII
Density of Grammatical Types within Texts

Text        Noun   Verb   Other
History     0.81   0.20   0.09
Monkey      0.49   0.38   0.14
Eagle       0.56   0.36   0.12
Feast       0.66   0.31   0.07
Average³    0.73   0.24   0.11

8. LITERARINESS

As mentioned previously, the two best literary texts have the greatest differences between average noun cluster length and verb cluster length. Another characteristic of some good literary texts is relative shortness of clusters, due to the grammatical types being intermixed in semi-poetical types of sentences. These two somewhat opposing characteristics of cluster length are combined in an empirical formula for literariness, Table VIII, which gives the absolute difference between noun and verb cluster length divided by the average length of clusters in that text. The History and Feast texts come out with the highest coefficients in literariness.
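The empirical formula described here is simple enough to sketch directly. The function name is an assumption, and the counts are taken from Table I, so results may differ from the printed coefficients in the last decimal place.

```python
# Counts from Table I as (noun, verb, other) per text.
WORDS = {
    "History": (1512, 753, 394),
    "Monkey": (156, 137, 57),
    "Eagle": (124, 78, 45),
    "Feast": (325, 209, 66),
}
CLUSTERS = {
    "History": (603, 601, 305),
    "Monkey": (92, 91, 42),
    "Eagle": (68, 42, 42),
    "Feast": (173, 153, 63),
}


def literariness(words, clusters):
    """|noun length - verb length| divided by the text's average cluster length."""
    noun_length = words[0] / clusters[0]
    verb_length = words[1] / clusters[1]
    average_length = sum(words) / sum(clusters)
    return abs(noun_length - verb_length) / average_length


scores = {text: literariness(WORDS[text], CLUSTERS[text]) for text in WORDS}
ranked = sorted(scores, key=scores.get, reverse=True)
non_weighted_average = sum(scores.values()) / len(scores)

for text in ranked:
    print(text, f"{scores[text]:.2f}")
print("non-weighted average:", f"{non_weighted_average:.2f}")
```

Dividing by the average cluster length is what rewards short clusters: for a fixed noun-verb difference, a text with shorter clusters overall scores higher.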


TABLE VIII
Coefficient of Literariness

History                  0.71
Monkey                   0.12
Eagle                    0.02
Feast                    0.33
Word-weighted average    0.53
Non-weighted average     0.30

9. A SAMPLE ENGLISH COMPARISON

Similar counts and calculations were made of an article in English for comparison, using an article from the Reader's Digest, October 1975:73-7.

                                 Noun   Verb   Other   Total
I.     Word counts               1033    346     344    1723
       Cluster counts             537    198     285    1020
II.    Rel. no. of words         0.60   0.20    0.20
III.   Rel. no. of clusters      0.53   0.19    0.28
IV.    Cluster length            1.92   1.75    1.21
V.     Rel. cluster length       1.14   1.03    0.71
VI.    (only one text)
VII.   Density of gram. types    0.68   0.21    0.14
VIII.  Literariness coeff.       0.10
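Rows II to VIII of this summary all follow from the two count rows in row I. The sketch below recomputes them in one pass; the function name and dictionary keys are assumptions chosen for illustration.

```python
def profile(words, clusters):
    """Derive rows II-VIII from (noun, verb, other) word and cluster counts."""
    total_words, total_clusters = sum(words), sum(clusters)
    average_length = total_words / total_clusters
    rel_words = [w / total_words for w in words]
    rel_clusters = [c / total_clusters for c in clusters]
    lengths = [w / c for w, c in zip(words, clusters)]
    rel_lengths = [length / average_length for length in lengths]
    return {
        "rel_words": rel_words,                       # row II
        "rel_clusters": rel_clusters,                 # row III
        "cluster_length": lengths,                    # row IV
        "rel_cluster_length": rel_lengths,            # row V
        "density": [rl * rw for rl, rw in zip(rel_lengths, rel_words)],  # row VII
        "literariness": abs(lengths[0] - lengths[1]) / average_length,   # row VIII
    }


# English sample, row I: word counts and cluster counts.
english = profile((1033, 346, 344), (537, 198, 285))
for name, value in english.items():
    if isinstance(value, list):
        print(name, [f"{x:.2f}" for x in value])
    else:
        print(name, f"{value:.2f}")
```

The same function could be applied to any of the four Rơglai texts by passing its row of Table I, which makes the comparisons in the following paragraphs easy to repeat.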

From II and III above, the English text is seen to have a greater relative number of both noun words and of noun clusters than the Rơglai texts. And the English text has a smaller relative number of verb words and verb clusters than any of the Rơglai texts.

Tables IV and V reveal that both the cluster length and the relative cluster length of the English text fall within the range of the Rơglai texts. The noun and verb density coefficients (VII) of the English text also fall within the range of the Rơglai texts.

The coefficient of 'literariness' of the English text is lower than three of the Rơglai texts. We defined 'literariness' solely for Rơglai as having short clusters and a great difference between noun and verb cluster length. Good literary style in English clearly has different characteristics from Rơglai.


NOTES

1. Rơglai is a language belonging to the Coastal Chamic branch of Austronesian, found in south Vietnam inland from Nhatrang to Phanthiet.

2. The relative number of nouns in the History text, for instance, is found by dividing the number of nouns in the History text, 1512 (Table I), by the total number of words in the History text, 2659 (Table I), which gives 0.57.

3. These averages are not obtained by averaging the figures given in this chart, but must be obtained from the original data of Table I; for instance, for the nouns of Table II, 2117 ÷ 3856 = 0.55.

4. The range, for instance, of relative cluster length in the History text is 1.42 - 0.71 = 0.71; for the Monkey text 1.09 - 0.87 = 0.22.

5. This may also be stated: within each text the density coefficients of the respective grammatical types vary in direct proportion to both the relative lengths of clusters of each type and the relative numbers of words of each type. It may be calculated directly for each grammatical type by the formula:

$$\text{Density coefficient}_A = \frac{\text{Words}_A^{2} \times \text{Clusters}_T}{\text{Words}_T^{2} \times \text{Clusters}_A}$$

where A = the given grammatical type
      T = the total words (or clusters) of all types.
