Growth in Grammar Project Annotation Manual Mark Brenchley Phil Durrant


Growth in Grammar Project

Annotation Manual

Mark Brenchley Phil Durrant


Table of Contents

General Overview

Noun Phrase Overview

Subordinate Clause Overview

General Coding Guidance
  Core Principles
  Ambiguities
  Preference Rules
  Uncodable Material
  Grammatical Anomalies

Specific Coding Guidance
  Apposition
  Auxiliary Verbs
  Clefts
  Comparative Phrases & Clauses
  Complex Words
  Coordinators & Coordination
  Cross-Sentence Dependencies
  Determiners
  Direct Speech (& Reporting Clauses)
  Discontinuous Structures
  Dislocations
  Ellipsis
  Existential Clauses
  Extraposed Clauses
  Fronted Material
  Gapping
  Genitival NPs
  Gerunds
  Independent NPs and SCs
  Negation with “Not”
  Nominal Relatives
  Quotations
  Participle Forms
  Particles
  Particle + Adverb/Preposition Sequences
  Passives
  Prepositions
  Prepositional Verbs
  Proper (vs Common) Nouns
  Subject-Verb Inversions
  Subordinators & Relativizers
  Tag Clauses
  Verbless Clauses

Appendix I: Gig_pos Code List

Appendix II: Gig_dep Code List

References


General Overview

This manual covers the hand-parsing of the Growth in Grammar (GiG) corpus. Supplementary information relating to idiosyncratic annotations can be found in the accompanying document named Supplementary Guidance. All annotations are to be made to .csv versions of the project texts. Each such file will have been pre-parsed by the Stanford NLP suite, and will look like this:

sentence_number  word_number  word     lemma    pos  dep_on  dep    status  gig_pos  gig_dep  gig_dep_on  add_info  notes  corrections
1                1            Chomsky  Chomsky  NNP  2       nsubj
1                2            loves    love     VBZ  0       ROOT
1                3            syntax   syntax   NN   2       dobj
1                4            .        .        .    2       punct

Manual annotation is to be undertaken for the following columns within these files:

Element Status (“status”)
Marks whether the element for that particular row is the head of a noun phrase or a subordinate clause. Where it heads a noun phrase, insert an np; where it heads a subordinate clause, insert an sc.

Part-of-Speech (“gig_pos”)
Marks the part-of-speech code for each word that comprises each np and sc. All annotations should be drawn from the gig_pos code list set out in appendix I. Except for proper nouns and specific instances of complex words (see the relevant section below), all codes should be determined through reference to the online version of the Macmillan English Dictionary: https://www.macmillandictionary.com/

Part-of-Speech Dependencies (“gig_dep”)
Marks the particular np/sc dependencies explicitly recognised here. All annotations should be drawn from the gig_dep code list set out in appendix II and as described below.

Part-of-Speech Dependency Chain (“gig_dep_on”)
Marks the chain of dependencies for each word that comprises each np and each sc. A cardinal number is to be inserted for each such dependency, with this number matching the word_number entry for the element on which the np or sc component depends.

Additional Information (“add_info”)
Records any supplementary coding; in particular, the code identifying the subsidiary elements of a complex lexical item (extra), the code for ambiguous items (ambig_), and the code for an np/sc that you have been unable to code in full (uncodable).

Generic Annotator Notes (“notes”)
Records any comments that might be useful for interpreting the annotations. A few specific codes for this column are specified in the manual below. However, you should feel free to add ad hoc comments as you see fit.

Transcription Corrections (“corrections”)
Records any corrections that you think might need to be made to the original transcriptions. This should always be of the format: <“original material” to “corrected material”>. For example, “happi” to “happy”. Note that annotators should never make any alterations to the pre-parsed material. Such alterations will be done by the permanent members of the project team.
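To make the file layout and column conventions above concrete, the following is a minimal Python sketch that reads one annotated file and flags rows whose manual columns break those conventions. The function name `check_rows`, the in-memory sample, and the regular expression used for the corrections format are assumptions made for illustration; they are not part of any project tooling.

```python
import csv
import io
import re

# Column names as listed in this manual.
COLUMNS = ["sentence_number", "word_number", "word", "lemma", "pos", "dep_on",
           "dep", "status", "gig_pos", "gig_dep", "gig_dep_on", "add_info",
           "notes", "corrections"]

# Assumed rendering of the corrections format: "original" to "corrected",
# with either straight or curly quotation marks.
CORRECTION_RE = re.compile(r'^[“"][^“”"]+[”"] to [“"][^“”"]+[”"]$')


def check_rows(handle):
    """Return a list of (word_number, problem) pairs for one annotated file."""
    problems = []
    for row in csv.DictReader(handle):
        number = row["word_number"]
        if row["status"] not in ("", "np", "sc"):
            problems.append((number, "status must be empty, np or sc"))
        if row["gig_dep_on"] and not row["gig_dep_on"].isdigit():
            problems.append((number, "gig_dep_on must be a cardinal number"))
        if row["corrections"] and not CORRECTION_RE.match(row["corrections"]):
            problems.append((number, "corrections not in the expected format"))
    return problems


# Hypothetical sample: the third row deliberately breaks three conventions.
sample = io.StringIO(
    ",".join(COLUMNS) + "\n"
    + "1,1,Chomsky,Chomsky,NNP,2,nsubj,np,noun_prop,subj,2,,,\n"
    + "1,2,loves,love,VBZ,0,ROOT" + "," * 7 + "\n"
    + "1,3,syntax,syntax,NN,2,dobj,NP,noun_com,dobj,two,,,happi to happy\n"
)
problems = check_rows(sample)
for number, problem in problems:
    print(number, problem)
```

A check of this kind only catches formal slips; it cannot judge whether a given code is the right analysis of the text.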


Noun Phrase Overview

The first structures to be hand-annotated are all the noun phrases that are present within a text. All such noun phrases, together with their constitutive material, will require explicit coding according to the conventions detailed here and in the remainder of the manual.

For example,

word #  word          status  gig_pos       gig_dep      gig_dep_on  add_info
1       An                    det                        6
2       unsurprising          part          premod       6
3       but                   conj_coord                 5
4       truly                 adv                        5
5       grotesque             adj           premod       6
6       lie           np      noun_com      subj         15
7       that                  conj_sub                   10
8       Chomsky       np      noun_prop     subj         10
9       had                   verb_aux                   10
10      lost          sc      verb_lex_act  postmod_fin  6
11      all                   det                        12
12      reason        np      noun_com      dobj         10
13      was
14      being
15      whispered
16      to
17      all           np      pro           prepobj      16
18      of                    prep          postmod      17
19      the                   det                        21
20      gullible              adj           premod       21
21      journalists   np      noun_com      prepobj      18

This should be done as follows.


Status Column

Insert an np for each word that instantiates a distinct noun phrase within a text. In the default case, you should count each of the following as instantiating a distinct noun phrase:

a. Any common noun (e.g. “Cats love dogs”)

b. Any proper noun (e.g. “Uncle Chomsky loves Lord Quirk”)

c. Any pronoun, including relative pronouns (e.g. “We love anyone that teaches themselves”)

d. Any adjective that heads a phrase which functions as a subject or object

e. Any other adjective that has either a numeral or a determiner dependent on it [cf. LGSWE, p.259] (e.g. “It is always the rich against the poor”; “Five blue beats three green”)

f. Any numeral that heads a phrase functioning as subject, object, or prepositional object (e.g. “They will be here in three”)

g. Any other numeral that is either referential (i.e. a date or an age) or which has a determiner dependent on it (e.g. “We are those five”)

There are two exceptions to the above.

1) Complex nominals, such as “yard arm”, “The Royal Academy”, or “forty three” in “I bought forty three”. Each such complex item should only be assigned one np marker. The precise rules regarding such cases are set out in the section on complex words below. For example,

word #  word     status  gig_pos            gig_dep  gig_dep_on  add_info
1       We       np      pro                subj     2
2       love
3       Noam     np      noun_prop_complex  dobj     2
4       Chomsky          noun_prop_complex           3           extra

2) Any cases of (a)-(e) above that are coordinated such that another element is effectively dependent on both coordinates (e.g. “the cats, dogs, and mice”, “blue cats and dogs”, “Cats and dogs from Paris”).

In such cases, all gig_pos codes should be assigned as normal. However, only a single np should be marked, in the Status column of the row for the first of the coordinates.

Furthermore, each dependency that applies to the coordinated nominals should be marked such that their gig_dep_on code links back to this first np. Finally, note that only this first np should receive a distinct gig_dep code reflecting the function of the whole np. The remaining coordinated nominals should receive the generic gig_dep code of coord and have their gig_dep_on code marked as linking to the head np.


For example,

word #  word   status  gig_pos       gig_dep          gig_dep_on
1       I      np      pro           subj             2
2       love
3       all            det                            4
4       cats   np      noun_com      dobj             2
5       and            conj_coord                     7
6       happy          adj           premod           7
7       dogs           noun_com      coord            4
8       that   np      pro_rel       subj             9
9       run    sc      verb_lex_act  postmod_fin_rel  4
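The coordination exception can also be expressed as a mechanical check. The sketch below, written in Python, encodes the rows of the example above and verifies that every element coded coord carries no marker of its own and links back to a row that does. The `Row` type and the function name `check_coordinates` are hypothetical names introduced for this sketch, and the check reflects only the rule as stated in this section.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    number: int
    word: str
    status: str
    gig_pos: str
    gig_dep: str
    gig_dep_on: Optional[int]


# The rows of the example above ("I love all cats and happy dogs that run").
ROWS = [
    Row(1, "I", "np", "pro", "subj", 2),
    Row(2, "love", "", "", "", None),
    Row(3, "all", "", "det", "", 4),
    Row(4, "cats", "np", "noun_com", "dobj", 2),
    Row(5, "and", "", "conj_coord", "", 7),
    Row(6, "happy", "", "adj", "premod", 7),
    Row(7, "dogs", "", "noun_com", "coord", 4),
    Row(8, "that", "np", "pro_rel", "subj", 9),
    Row(9, "run", "sc", "verb_lex_act", "postmod_fin_rel", 4),
]


def check_coordinates(rows):
    """Return the word numbers of coord rows that break the rule."""
    by_number = {row.number: row for row in rows}
    bad = []
    for row in rows:
        if row.gig_dep != "coord":
            continue
        head = by_number.get(row.gig_dep_on)
        # A non-initial coordinate has an empty Status cell and must point
        # at the row holding the marker (np here; sc for coordinated verbs).
        if row.status or head is None or head.status not in ("np", "sc"):
            bad.append(row.number)
    return bad


print(check_coordinates(ROWS))
```

If “dogs” were mistakenly given its own np marker, the function would report word number 7.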

Gig_pos Column

Code each element that comprises each np for its particular part-of-speech, using only elements from the gig_pos code list set out in appendix I and taking account of any complex parts-of-speech that should be coded as such (see the section on complex words below).

Note that any element instantiating an np should receive its normal gig_pos code, regardless of the np marker it carries in the Status column. Thus, for example, a numeral functioning as in (f) or (g) above should still receive the gig_pos code of num, even though it is marked with an np in the Status column.

Gig_dep Column

A gig_dep code should be assigned for two kinds of np dependency: the external dependency of the np, and certain internal dependencies within the np.

All of the relevant codes are listed in appendix II. Note, however, that you may also need to add some General Secondary Classifiers as appropriate. These should always be appended to the end of the gig_dep code, and in the order set out in appendix II.

Note, also, that we make no distinction between restrictive and non-restrictive dependencies, nor between “complementation” and “modification” structures. All such elements should simply be treated as premodifiers or postmodifiers according to the relevant gig_dep codes detailed here.

Furthermore, you should note that not every dependent of the np will be part of a contiguous sequence. This does not affect their status as a dependent of the np, meaning that they will still require a gig_dep_on that links directly to the np marker. For example, the sequence “that Chomsky was lying” in “The rumour spread that Chomsky was lying” should still be treated as a postmodifier of “rumour”, even though there is an intervening verb.


Finally, note that in the case of coordinated dependencies, each coordinate should be marked as exerting its own dependency. Thus, in the phrase “The great and noble Chomsky”, each adjective will require a separate internal gig_dep code of premod; and each np in “Chomsky and Pullum laughed maniacally” would require the external gig_dep code of subj.

With this in mind, you will need to code for the relevant internal and external dependencies of each np, the general rules for which are as follows:

1) NP External Dependencies

§ Assign a gig_dep code to each element marked with an np, with this coded according to the wider function of the np itself.

§ All codes should be drawn from the gig_dep list in appendix II, using the supplementary information provided in the rest of the manual. In particular, you should make sure to read the following sections before coding the external dependencies of an np:

o Appositions

o Determiners

o Dislocations

o Fronting

o Genitival NPs

o Independent NPs and SCs

o Prepositional Verbs

2) NP Internal Dependencies

§ Internal direct dependents of an np should be identified using the following template:

{peripheral modifiers} {determiners + numerals} {premodifiers} np head {postmodifiers}

Hence, for example, the following, with each direct dependent in bold:

{even} {this} {very minor and relatively trivial} analysis {which Chomsky loved}

§ Within this template, only two types of internal dependency require an explicit gig_dep code: premodifiers and postmodifiers.

§ Note, however, the specific cases where a genitival np has been used within a wider noun phrase. For our purposes, this effectively counts as a distinct np that functions as a determiner of a wider noun phrase.

§ A premodifier is any np-level part-of-speech which both precedes the np marker and comes after any determiners or numerals that also depend on this marker.

Premodifiers will generally be:

o Adjectives (e.g. “A very special linguist”)

o Adverbs (e.g. “The nearby institutions”)

o Clauses (e.g. “The I don’t know what he thought it was analysis”)

o Nouns (e.g. “That wonderful Chomsky book”)

o Participials (e.g. “A collapsing paradigm”)

§ A postmodifier is any np-level part-of-speech that both depends on the noun phrase and comes after the np marker that instantiates the noun phrase.


Again, this postmodificatory relationship holds whether or not the dependent is separated from the np marker by another piece of material. For example, the sequence “that Chomsky was wrong” in “Word spread that Chomsky was wrong” would still count as a postmodifier of “Word”, even though a verb intervenes:

word #  word     status  gig_pos       gig_dep      gig_dep_on
1       Word     np      noun_com      subj         2
2       spread
3       that             conj_sub                   5
4       Chomsky  np      noun_prop     subj         5
5       was      sc      verb_lex_act  postmod_fin  1
6       wrong            adj                        5

Note, also, that the default coding is for sequences of the form NP1 + of + SC/NP2 to be coded such that the SC/NP2 receives the gig_dep code of prepobj with its gig_dep_on marked as linking back to the “of”. This “of” should then receive the gig_dep code of postmod with its gig_dep_on code marked as linking back to NP1.

word #  word       status  gig_pos   gig_dep  gig_dep_on
1       All        np      pro       subj     5
2       of                 prep      postmod  1
3       the                det                4
4       linguists  np      noun_com  prepobj  2
5       cackled

Postmodifiers will generally be:

o Adjectives (e.g. “An analysis of Chomsky awful and vain”)

o Adverbs (e.g. “The way out”)

o Complement Clauses (e.g. “The idea that people are dismissing Chomsky”)

o Discontinuous Comparatives (e.g. “Chomsky is a better linguist than me”)

o Ed-Clauses (e.g. “That amazing book written by Chomsky in his sleep”)

o Infinitive Clauses (e.g. “The book to read as if your life depended on it”)

o Ing-Clauses (e.g. “That wonderful book crying out for greater attention”)

o Prepositional Phrases (e.g. “That amazing book on noun phrases”)

o Relative Clauses (e.g. “That great book that Chomsky wrote in his sleep”)


Gig_dep_on Column

This should be coded in full for each element that comprises each np, with each element that is a direct dependent of the noun phrase coded as linking to the phrase’s np marker.

Note, again, however, that not every dependent of the np will be part of a contiguous linear sequence. This does not affect their status as a direct dependent of the np, meaning that they will still require a gig_dep_on that links directly to the np marker.
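The linking convention can be illustrated by following gig_dep_on values back to an np marker. The Python sketch below uses the rows of the earlier “All of the linguists cackled” example and collects every word that depends, directly or through a chain, on a given marker. The function name `phrase_members` and the dictionary layout are assumptions made for this sketch.

```python
def phrase_members(rows, head):
    """Return the sorted word numbers of `head` and everything that
    depends on it, directly or through a chain of gig_dep_on links."""
    members = {head}
    changed = True
    while changed:
        changed = False
        for number, (_word, dep_on) in rows.items():
            if dep_on in members and number not in members:
                members.add(number)
                changed = True
    return sorted(members)


# word number -> (word, gig_dep_on); None marks an uncoded row.
ROWS = {
    1: ("All", 5),
    2: ("of", 1),
    3: ("the", 4),
    4: ("linguists", 2),
    5: ("cackled", None),
}

span = phrase_members(ROWS, 1)
print(" ".join(ROWS[n][0] for n in span))  # All of the linguists
```

Because membership is computed from the links rather than from adjacency, the same function would also pick up a dependent that is separated from its head by intervening material.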


Final NP Example

word #  word          status  gig_pos       gig_dep      gig_dep_on
1       A                     det                        4
2       pathetically          adv                        3
3       vile                  adj           premod       4
4       rumour        np      noun_com      subj         5
5       spread
6       through
7       the                   det                        8
8       clique        np      noun_com      prepobj      6
9       that                  conj_sub                   12
10      Chomsky       np      noun_prop     subj         12
11      was                   verb_aux                   12
12      losing        sc      verb_lex_act  postmod_fin  4
13      the                   det                        14
14      argument      np      noun_com      dobj         12
15      because               conj_sub                   17
16      he            np      pro           subj         17
17      was           sc      verb_lex_act  adv_fin      12
18      a                     det                        19
19      zealot        np      noun_com      predsubj     17


Subordinate Clause Overview

The other structures requiring annotation are those subordinate clauses explicitly recognized as such within the present manual.

A subordinate clause essentially comprises any material that would be classed as a dependent clause by the Longman Grammar of Spoken and Written English. All such clauses, together with the material that constitutes them, should be coded according to the conventions set out here and further detailed in the remainder of the manual.

For example,

word #  word     status  gig_pos       gig_dep  gig_dep_on
1       Chomsky  np      noun_prop     subj     2
2       won
3       because          conj_sub               5
4       Skinner  np      noun_prop     subj     5
5       sucks    sc      verb_lex_act  adv_fin  2

Generally, this means that we recognise the following core clause types, both finite and non-finite [cf. LGSWE:192-201]:

o Adverbial Clauses. Any clause functioning adverbially with respect to a surrounding clause. We include here standard adverbials, sentence adverbials, comment clauses, and tag clauses (e.g. “I love Chomsky because he speaks the truth”).

o Postmodifying Clauses. Any clause that is dependent on a preceding noun phrase (e.g. “I love people who love Chomsky”; “I hate the idea that people would forget Chomsky”), including comparative postmodifiers (e.g. “Chomsky is a better linguist than we are”).

o Non-Postmodifying Comparative Clauses. Any clause functioning as the dependent of a graded adjective or adverb (e.g. “Chomsky is better than I am”; “Chomsky moved so fast that we were in awe”).

o Complement Clauses. Any clause fulfilling one of the following dependencies:

subject (e.g. “That Chomsky should fail makes me sad”)

object (e.g. “I want Chomsky to succeed”)

predicative complement (e.g. “I consider Chomsky to be the greatest linguist who ever lived”)

extraposition (e.g. “It is unlikely that Chomsky is wrong”)

prepositional complement (e.g. “I rely on Chomsky being right”)

adjectival complement (e.g. “I am confident that Chomsky is undefeatable”)


However, you should note that we do not recognise the following as subordinate clauses per se: verbless clauses (e.g. “He left with his hands on his head”) and reporting clauses (e.g. “‘Syntax!’, Chomsky shouted”). These are not subordinate clauses for our purposes and should not be coded as such. For further information on coding the specific components of such material, see the sections entitled verbless clauses and direct speech and reporting respectively.

Such cases aside, the columns for the other subordinate clause types should be coded as follows.

Status Column

Insert an sc for each part-of-speech that instantiates a distinct subordinate clause. In the default case, this will be the lexical verb that heads the clause. Note that this will be so even where the clause is explicitly introduced by a subordinator or relativizer. All such parts-of-speech should instead be coded as internal dependents of the clause.

All subordinate clauses are to be identified using a subject + verb phrase operationalisation. In the default case, this means that an sc should only be identified where the instantiating element is accompanied by its own explicit subject.

Thus, the following bracketed sequences would each be counted as a distinct subordinate clause. Hence, the verb that heads each bracketed sequence should be marked with a distinct sc:

“I saw [Chomsky destroying Skinner] [and Skinner crying like a baby]”;

“I love Chomsky because [he makes grammar beautiful] [and he destroyed Skinner]”.

Conversely, the following bracketed sequences would each be counted as a single subordinate clause, with a single sc inserted next to the first lexical verb and the second verb receiving the special gig_dep code of coord and marked as linking back to the verb that heads the sc:

“I saw [Chomsky destroying Skinner and vanquishing Piaget]”;

“I wished [for Chomsky to destroy Skinner and to vanquish Piaget]”.

The exception concerns material where no explicit subject is present. For example, the bracketed sequence in “I want [to read Chomsky]”. All such cases should be treated as if such a subject were in fact present.

Furthermore, where this implied subject is followed by a coordinated set of verb phrases, then a distinct sc should be marked for each coordinate verb phrase only where the phrase’s implied subject is not co-referential with the preceding subject.

Thus, the following bracketed examples would be counted as only a single subordinate clause, with a single sc inserted next to the first lexical verb and the second verb receiving the gig_dep code of coord and marked as linking back to the first verb:

“I want [to go home and read Chomsky’s greatest hits]”;

“Chomsky is the guy [to vanquish and to destroy]”.

Conversely, the following bracketed sequences would each be counted as a distinct subordinate clause. Hence, each lexical verb should be marked with a distinct sc:

“I want [him to go home] [and to read Chomsky’s books myself]”.

Example


word #  word      status  gig_pos       gig_dep   gig_dep_on
1       If                conj_sub                3
2       you       np      pro           subj      3
3       want      sc      verb_lex_act  adv_fin   10
4       to                conj_sub                5
5       destroy   sc      verb_lex_act  obj_nfin  3
6       Skinner   np      noun_prop     dobj      5
7       and               conj_coord              8
8       vanquish          verb_lex_act  coord     5
9       Saussure  np      noun_prop     dobj      8
10      read
11      more              det                     12
12      books     np      noun_com      dobj      10
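The rules for coordinated verb phrases in the Status column reduce to a two-question decision, sketched below in Python. This is one reading of the rules above, not an official procedure; the function name `second_coordinate_marking` and its two boolean parameters are hypothetical.

```python
def second_coordinate_marking(has_own_explicit_subject: bool,
                              subject_coreferential: bool) -> str:
    """Decide how the verb of a non-initial coordinate is marked.

    Returns "sc" when the verb heads a distinct subordinate clause, and
    "coord" when it is folded into the clause headed by the first verb.
    """
    if has_own_explicit_subject:
        # "[he makes grammar beautiful] [and he destroyed Skinner]"
        return "sc"
    # No subject of its own: the coordinate is folded into the first clause
    # only if its understood subject is the same as the preceding subject.
    return "coord" if subject_coreferential else "sc"


# "I saw [Chomsky destroying Skinner and vanquishing Piaget]"
print(second_coordinate_marking(False, True))
# "I want [him to go home] [and to read Chomsky's books myself]"
print(second_coordinate_marking(False, False))
```

In the table above, “vanquish” falls under the first call: it has no subject of its own and shares the understood subject of “destroy”, so it receives coord and no sc marker.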

Gig_pos Column

Code each element that comprises each sc for its particular part-of-speech, using only elements from the gig_pos code list set out in appendix I and taking account of any complex parts-of-speech that should be coded as such (see the section on complex words below).

Gig_dep Column

A gig_dep code should only be assigned to three elements within each identified sc.

1) Firstly, a gig_dep code should be assigned to the part-of-speech that marks the sc, with the specific code assigned depending on the type of subordinate clause it is.

Each such code will be a composite, built out of the sc dependencies specified in the gig_dep code list set out in appendix II.

Each such code should be built in the order set out in the gig_dep code lists, beginning with the primary sc classifiers and cycling through the secondary sc classifiers, and the tertiary sc classifiers as applicable.

Note, finally, that you may need to add some general secondary classifiers as appropriate. These should always be appended to the end of the gig_dep code in the order set out.

For example, a finite comparative clause dependent on an adjective will have the code adjmod_fin_comp (e.g. “Chomsky is as beautiful [as you can imagine]”), a fronted non-finite adverbial clause will be adv_nfin_front (e.g. “[To succeed] we must read every Chomsky book there is”), a standard noun-modifying relative clause will be postmod_fin_rel (e.g. “I hate grammars that dismiss Chomsky”), and a standard non-finite noun-modifying clause will be postmod_nfin (e.g. “I love grammars written by Chomsky”).

Note, however, that no distinction is made here between predicative and object complement clause functions. In both cases, these should be coded using the Primary SC Classifier of obj_.

For example,

word #  word     status  gig_pos       gig_dep   gig_dep_on
1       He       np      pro           subj      2
2       thought
3       Chomsky  np      noun_prop     subj      4
4       seemed   sc      verb_lex_act  obj_fin   2
5       to               conj_sub                7
6       be               verb_aux                7
7       winning  sc      verb_lex_act  obj_nfin  4

2) Secondly, each np within the sc should be assigned a dependency according to its function within that sc (e.g. subj if in subject function, obj if in object function, predsubj if a predicative subject, prepobj if the complement of a preposition, etc.).

Note, however, the special sc_np code. This should be used to code the gig_dep for any post-matrix np that both precedes an obj_nfin clause and can function as the subject of this non-finite subordinate clause. The gig_dep_on for this sc_np should then be marked as linking to the clause’s sc marker.

For example,

word #  word     status  gig_pos       gig_dep   gig_dep_on
1       I        np      pro           subj      2
2       want
3       Chomsky  np      noun_prop     sc_np     5
4       to               conj_sub                5
5       destroy  sc      verb_lex_act  obj_nfin  2
6       Skinner  np      noun_prop     dobj      5

3) Finally, where a non-finite clause contains a prepositional phrase subject (i.e. “for np”), then the preposition head of this phrase should receive the gig_dep code of sc_pp, with the gig_dep_on for this part-of-speech coded as linking to the clause’s sc marker. The gig_dep and gig_dep_on code for the np should then be coded as normal.


For example,

word #  word     status  gig_pos       gig_dep           gig_dep_on
1       For              prep          sc_pp             4
2       us       np      pro           prepobj           1
3       to               conj_sub                        4
4       succeed  sc      verb_lex_act  adv_nfin_fronted  6
5       Skinner  np      noun_prop     subj              6
6       must
7       fail
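The composite codes described under point 1) above can be thought of as classifier parts joined by underscores. The Python sketch below assembles the codes quoted in this section from their parts. The ordering it assumes (primary classifier, then fin or nfin, then any further classifiers) is inferred only from the examples given here; the authoritative ordering is the one set out in appendix II, and the function name `build_gig_dep` is hypothetical.

```python
def build_gig_dep(primary, finiteness, *classifiers):
    """Join classifier parts into one composite gig_dep code.

    `primary` is the primary sc classifier (e.g. "postmod"), `finiteness`
    is "fin" or "nfin", and `classifiers` are any further classifiers,
    already given in the order required by the code list.
    """
    if finiteness not in ("fin", "nfin"):
        raise ValueError("finiteness must be 'fin' or 'nfin'")
    return "_".join([primary, finiteness, *classifiers])


print(build_gig_dep("adjmod", "fin", "comp"))   # adjmod_fin_comp
print(build_gig_dep("postmod", "fin", "rel"))   # postmod_fin_rel
print(build_gig_dep("postmod", "nfin"))         # postmod_nfin
```

Building codes from parts in this way makes it harder to produce variants that differ only in the order of their classifiers.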

Gig_dep_on Column

This should be coded in full for each element that comprises each sc, with each element that is a direct dependent of the clause coded as linking to the clause’s sc marker. Note, again, however, that neither subordinators nor relativizers are to be counted as sc heads. Instead, they should have their gig_dep_on code treated as also linking to the clause’s sc marker, since they are here treated as dependents of this clause.

Where a particular subordinate clause is dependent on two or more coordinated elements, then the gig_dep_on code for the clause’s sc marker should be marked as linking back to the first of the coordinated elements. For example, the subordinate clause “that Chomsky is good” in “I hope and believe and think that Chomsky is good” should be marked as linking back to “hope”, not “believe” or “think”.

The gig_dep_on code for the element that has been assigned the sc marker should then be coded as follows:

o For postmodifying clauses, the default is for the sc marker to be coded as linking to the np marker for the noun phrase that the clause is taken to modify.

o For complement & adverbial clauses, the default is for the sc marker to be coded as linking to the surrounding verb on which the complement/adverbial clause depends. For the specific details of nominal relative clauses, see the section on nominal relatives below.

o For the details of appositional clauses, clefts, comparative clauses, dislocation clauses, existential clauses, and extraposed clauses, see the relevant sections in the remainder of the manual below.

o For the details of so-called “independent” subordinate clauses, see the section on independent noun phrases and subordinate clauses below.


Final SC Example

word #  word        status  gig_pos       gig_dep          gig_dep_on
1       Because             conj_sub                       3
2       I           np      pro           subj             3
3       am          sc      verb_lex_act  adv_fin_fronted  7
4       completely          adv                            5
5       obsessed            adj                            3
6       I           np      pro           subj             7
7       wanted
8       Chomsky     np      noun_prop     sc_np            10
9       to                  conj_sub                       10
10      crush       sc      verb_lex_act  obj_nfin         7
11      those       np      pro           dobj             10
12      who         np      pro_rel       subj             13
13      love        sc      verb_lex_act  postmod_fin_rel  11
14      Skinner     np      noun_prop     dobj             13
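The sc_np convention used in this example lends itself to a simple consistency check. The Python sketch below takes rows 6 to 10 of the table above and confirms that each sc_np element precedes, and links to, an sc marker coded obj_nfin. The tuple layout and the function name `check_sc_np` are assumptions made for this sketch.

```python
# (word_number, word, status, gig_dep, gig_dep_on) for words 6-10 above.
ROWS = [
    (6, "I", "np", "subj", 7),
    (7, "wanted", "", "", None),
    (8, "Chomsky", "np", "sc_np", 10),
    (9, "to", "", "", 10),
    (10, "crush", "sc", "obj_nfin", 7),
]


def check_sc_np(rows):
    """Return the word numbers of sc_np rows that are linked incorrectly."""
    by_number = {row[0]: row for row in rows}
    bad = []
    for number, _word, _status, gig_dep, dep_on in rows:
        if gig_dep != "sc_np":
            continue
        target = by_number.get(dep_on)
        ok = (
            target is not None
            and target[2] == "sc"                  # links to an sc marker
            and target[3].startswith("obj_nfin")   # of an obj_nfin clause
            and number < target[0]                 # and precedes that clause
        )
        if not ok:
            bad.append(number)
    return bad


print(check_sc_np(ROWS))
```

Linking “Chomsky” to the subordinator “to” (word 9) instead of the sc marker “crush” would cause word number 8 to be reported.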


General Coding Guidance

Core Principles

The core principle is to give a good faith annotation of each text, crediting the student with having produced the maximum possible amount of grammatical material relative to the conventions of Standard British English.

Furthermore, you should aim to code only the material as it has actually been written, assigning a reading that is both grammatically correct and semantically coherent. You should never change or edit any material in order to improve it. Where you identify a possible transcription error, you should simply note this error in the Corrections column using the format specified above.

Finally, your first point of reference should always be this manual, which details the specific codes and general rules for annotating your texts. That said, we realise that the nature of annotation means that any manual can be insufficient for making particular decisions. As such, you should make subsidiary reference to the following sources:

1. Supplementary Guidance. This is a subsidiary document provided by the project, designed to encompass tricky cases not otherwise covered by the manual or the external websites referenced below. This document should be checked before turning to additional source (2). It is also a “live” document that will be continuously updated by the main project team on the basis of the information you provide. Accordingly, you should not hesitate to let us know if you think something should be included here.

2. Longman Grammar of Spoken and Written English [LGSWE]. Where the classification of a particular dependency is not explicitly determined by either this manual or the Supplementary Guidance document, you should turn to the Longman Grammar of Spoken and Written English.

Ambiguities

On occasion, there will be material which you could in principle code in more than one way, but where you feel genuinely unable to decide which option makes the most sense in the context. In such cases you should use the following preference rules. Note, however, that we expect such cases to be rare, and that you will make minimal reference to these preference rules, since you should normally have a clear intuition based on the context.

Furthermore, wherever you do decide to make use of a preference rule, we ask that you provide the complete alternative analysis for each element in the Add_info column as appropriate, prefixing this analysis with the code ambig_. For example, an alternative adjective reading that links to word 4 would be recorded as ambig_adj_4.

Preference Rules

Ø If a part-of-speech is ambiguous between a verb and a predicative adjective, then code it as an adjective (adj) where (a) it can be modified by the adverbs “too” or “very”, (b) it can serve as the predicative complement of a copular verb such as “seem” or “remain”, or (c) it can be given both a stative and a dynamic reading. Otherwise, code it as a verb (i.e. verb_lex_act or verb_lex_pass).

Ø If an element is ambiguous between a verb and a common noun (i.e. noun_com), then code it as a verb (i.e. verb_lex_act or verb_lex_pass).

Ø If a dependency is ambiguous between an adverbial & a postmodifier coding, then select the adverbial coding (i.e. adv_).

Ø If a dependency is ambiguous between an adverbial & a complement clause coding, then select the adverbial coding (i.e. adv_).

Ø If a dependency is ambiguous between a postmodifier & a complement clause coding, then select the postmodifier coding (i.e. postmod_).

Ø If a dependency is ambiguous between a coordination account and either an ellipsis or a gapping account, then code it as a coordination.
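The preference rules above amount to a small lookup from a pair of competing analyses to the preferred one. The sketch below is illustrative only: the labels and function names are hypothetical shorthand, and the rules are meant as a last resort once context has failed to decide the matter.

```python
# Preferred analysis for each pair of competing codings (labels are hypothetical shorthand).
PREFERENCES = {
    frozenset({"verb", "noun_com"}): "verb",
    frozenset({"adverbial", "postmodifier"}): "adverbial",
    frozenset({"adverbial", "complement"}): "adverbial",
    frozenset({"postmodifier", "complement"}): "postmodifier",
    frozenset({"coordination", "ellipsis"}): "coordination",
    frozenset({"coordination", "gapping"}): "coordination",
}


def prefer(first: str, second: str) -> str:
    """Return the analysis the preference rules select for an ambiguous pair."""
    try:
        return PREFERENCES[frozenset({first, second})]
    except KeyError:
        raise ValueError(f"no preference rule covers {first!r} vs {second!r}") from None


def verb_or_adjective(takes_too_or_very: bool,
                      complements_copular_verb: bool,
                      stative_and_dynamic: bool) -> str:
    """First rule: code as adj if any of the three tests holds, otherwise as a verb."""
    tests = (takes_too_or_very, complements_copular_verb, stative_and_dynamic)
    return "adj" if any(tests) else "verb"
```

The order of the two arguments to `prefer` does not matter, since each pair is stored as an unordered set.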

Example of an alternative analysis recorded with the ambig_ prefix:

word # word status gig_pos gig_dep gig_dep_on add_info

1 I np pro subj 2

2 think

3 they np pro subj 5 ambig_np_noun_pro_subj_4

4 are verb_aux 5 ambig_sc_verb_lex_act_obj_fin_2

5 revolting sc verb_lex_act obj_fin 2 ambig_adj_4

Uncodable Material

Where you cannot annotate a particular np or sc for all of its component elements, you should simply code as much of that particular np/sc as possible and then make sure that the gig_dep_on numbers for any uncodable elements link back to the head of that particular np/sc. Finally, you should insert the code uncodable in the Add_info cell of the row that carries the np/sc marker for that unit in the Status column.

Grammatical Anomalies

Grammatical anomalies refer to sequences that do not neatly conform to the conventions of Standard British English, such as morphosyntactic errors (e.g. “People think Chomsky are great”) or the dialectal use of “what” as a relative pronoun (e.g. “A book what I love”).

All such cases should be marked by appending the code _gram to the gig_dep code for the cell(s) marking where the “anomaly” occurs. Note that you should append this code to the relevant cells even where no other coding is otherwise required, such that a gig_dep cell may contain only the _gram code itself. This is to enable a fuller understanding of where the relevant anomalies occur within a text.
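As a rough illustration of the _gram convention, the sketch below appends the code to a gig_dep cell (including an empty one) and then recovers the positions of the marked cells. The function names are hypothetical, and returning an already-marked cell unchanged is an assumption rather than something the manual specifies.

```python
GRAM = "_gram"


def mark_gram(gig_dep: str) -> str:
    """Append _gram to a gig_dep cell; an empty cell becomes just '_gram'.

    Assumption: a cell that already ends in _gram is returned unchanged.
    """
    return gig_dep if gig_dep.endswith(GRAM) else gig_dep + GRAM


def anomaly_positions(cells):
    """Word numbers whose gig_dep cell carries the _gram code.

    `cells` is an iterable of (word_number, gig_dep) pairs.
    """
    return [num for num, dep in cells if dep.endswith(GRAM)]
```

For instance, `mark_gram("obj_fin")` gives `obj_fin_gram`, and `mark_gram("")` gives a cell containing only `_gram`.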

In addition, you should apply the following rules according to the specific case at hand.

A. Semantic Oddities

These occur where the grammatically correct coding of the material as written is such that it nevertheless yields a semantically strange reading. This is fine for our purposes, and you should simply assign the grammatically correct coding to the material. For example, the underlined sequence in “He pushed his way through the rocks and the seaweed to find a shoal of small sunfish brushed past him” would be coded as follows:

word # word status gig_pos gig_dep gig_dep_on

11 to conj_sub 12

12 find sc verb_lex_act adv_nfin 2

13 a det 14

14 shoal np noun_com subj 18

15 of prep postmod 14

16 small adj premod 17

17 sunfish np noun_com prepobj 15

18 brushed sc verb_lex_act obj_fin_gram 12

19 past prep 18

20 him np pro prepobj 19

B. “Forgotten” Material

So-called forgotten material comprises material where an element is missing which cannot be attributed to a wider grammatical phenomenon such as ellipsis or gapping (see below), and where the remaining material is such that it would otherwise require explicit coding as normal; for example, the underlined material in the sentence “He looked me.” All elements present within these sequences should be given a good faith interpretation based on what is actually written, with any ungrammaticality treated as arising due to the missing elements.

This should happen as follows:

1. Determine whether any remaining material is dependent on a missing element that would normally be marked with an np or an sc in the Status column.

Thus, the underlined material in “He looked me” would require no such marking, since the missing preposition would not otherwise require either an np or an sc. On the other hand, “A came” would require such a marker, since the missing noun would receive an np marker if it were present.

2. Where no such np/sc marker is necessary, you should assign all the elements their standard Status, gig_pos and gig_dep codes, remembering to append the _gram code as appropriate to mark the gig_dep cell(s) where the anomaly occurs.

The gig_dep_on codes should also be assigned as normal, except where an element would normally be linked to part of the forgotten material. Here, the gig_dep_on code should be coded as “skipping” over the missing element, and linking to the word on which the missing element would itself depend.

For example, the “me” in “He looked me” would be coded as linking back to “looked”, since this “me” would normally link to the missing preposition and this preposition would otherwise link to “looked”.

word # word status gig_pos gig_dep gig_dep_on

1 I np pro subj 2

2 saw

3 Chomsky np noun_prop sc_np 4

4 looking sc verb_lex_act obj_nfin 2

5 me np pro prepobj_gram 4

3. Where such an np/sc marker would normally be required, assign an appropriate replacement np or sc marker to the closest word which both depends on and immediately precedes the element that would normally instantiate the np/sc.

For example, in “A came”, you would assign an np to the Status column for “a”, since this both depends on and immediately precedes the noun that would otherwise be marked with the np. However, in “A broken came”, you would assign an np to “broken”, since it is this element that now depends on and immediately precedes the “forgotten” noun.

4. Next, you should assign all the elements present their standard gig_pos codes as normal. For example, in “A came”, the “a” would still be coded as det.

5. Next, you should assign all the elements present their standard gig_dep codes. For example, in “A came”, neither “a” nor “came” would receive an initial coding, since neither would normally require such a coding.

6. Next, you will need to go back and add any additional dependencies that would otherwise have been marked were the missing np/sc present. For example, in “A came”, “a” would now receive an additional gig_dep code of subj, since this is the code that would otherwise be instantiated by the missing np.

Note, of course, this may mean that a gig_dep code receives two distinct codes, such as the word “broken” in “A broken came”. This would be coded as premod_subj, since it would normally be marked as premodifying a noun (premod) and since we also need to capture the gig_dep code for the missing np, which here would be subj. Where such a dual coding is required, the additional gig_dep code should be directly linked to the initial gig_dep code via an underscore. Thus, premod + subj becomes coded as premod_subj.

7. Next, you will need to assign the appropriate gig_dep_on codes.

This should be done as normal, the only exception being those codes which would otherwise have been assigned to the missing np/sc. Here, you should simply treat these codes as linking back to whichever element has been assigned the replacement np/sc marker. In turn, this element should be coded as linking through to whichever element the missing np/sc would have linked to.

For example in “A came”, the gig_dep_on for “a” would be coded as linking through to “came”, since this is where the missing noun would have linked to.

8. Finally, you should append the required _gram code in the gig_dep column to mark the point(s) at which the anomaly occurs.

Thus, in the present example, the _gram code should be appended to the gig_dep code for “a”, since this is the point at which the anomaly occurs. Note that this may mean that a _gram code is assigned to an otherwise empty gig_dep cell. This is absolutely fine. It may also mean that more than one gig_dep cell is marked with the _gram code. Again, this is fine.

Overall Example

word # word status gig_pos gig_dep gig_dep_on

1 I np pro subj 2

2 liked

3 a np det dobj_gram 2

4 which np pro_rel subj 5

5 elucidated sc verb_lex_act postmod_fin_rel_gram 3

6 Chomsky np noun_prop dobj 5
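Two mechanical parts of the procedure for forgotten material can be sketched in code: joining an element's own gig_dep code with the code inherited from the missing np/sc, and making a gig_dep_on link “skip” over a missing element. This is a hypothetical illustration; the function names and the way missing elements are labelled are assumptions.

```python
def combine_dep(initial: str, inherited: str) -> str:
    """Join an element's own gig_dep with the code inherited from the missing np/sc.

    Either part may be empty; non-empty parts are linked by an underscore.
    """
    return "_".join(code for code in (initial, inherited) if code)


def skip_missing(head, missing_heads):
    """Follow links through missing elements until a written word is reached.

    `missing_heads` maps a label for each missing element (hypothetical labels
    such as "PREP") to the head that the missing element would itself depend on.
    """
    seen = set()
    while head in missing_heads:
        if head in seen:
            raise ValueError("missing elements depend on each other in a cycle")
        seen.add(head)
        head = missing_heads[head]
    return head
```

In “A broken came”, `combine_dep("premod", "subj")` yields the dual code `premod_subj`. In “He looked me”, where “looked” is word 2 and the missing preposition would depend on it, `skip_missing("PREP", {"PREP": 2})` returns 2, so “me” links to “looked”.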

C. Anomalous Combinations

Anomalous combinations comprise sequences where nothing is obviously “forgotten” but the specific sequencing remains ungrammatical with respect to the conventions of Standard British English. For example, “7 new more sails”, rather than “7 more new sails”.

Such sequences should be coded by, first, establishing direct dependencies between elements where possible, and, second, seeing whether additional material can be added on the basis of the units thereby established.

Thus, in “Only one way was to try harder”, “one way” is a locally grammatical noun phrase, which then allows for the attachment of “only” on the basis that “only” can modify a noun phrase (e.g. “only this way”).

For example,  

word # word status gig_pos gig_dep gig_dep_on

1 Only adv _gram 3

2 one num _gram 3

3 way np noun_com subj 4

4 was

5 7 num _gram 7

6 new adj premod_gram 8

7 more det _gram 8

8 sails np noun_com predsubj 4

D. Morphosyntactic Errors

Such cases should simply be coded as if the correct Standard English form had, in fact, been written.

For example,

word # word status gig_pos gig_dep gig_dep_on

1 I np pro subj 2

2 think

3 Chomsky np noun_prop subj 4

4 are sc verb_lex_act obj_fin_gram 2

5 amazing adj 4

E. Dialectal/Non-Standard Forms

These should also be coded as if they had been correctly produced according to the conventions of Standard British English.

For example,

word # word status gig_pos gig_dep gig_dep_on

1 I np pro subj 2

2 love

3 books np noun_com dobj 2

4 what np pro_rel subj_gram 5

5 praise sc verb_lex_act postmod_fin_rel 3

6 Chomsky np noun_prop dobj 5

Specific Coding Guidance

Apposition

Apposition in the present sense covers two specific kinds of dependency - an adverbial clause dependency, and a post-modificatory one.

§ Adverbial Apposition

The first kind comprises a relationship between a subordinate clause and the verbal predicate of a preceding clause. More specifically, a subordinate clause should be treated as a case of adverbial apposition where (a) it has the explicit form of a subordinate clause (i.e. it is either non-finite or headed by a preposition/subordinating element), and (b) it is such that it effectively re-formulates the preceding verbal predicate.

A good test for this dependency will be the possibility of inserting the phrase “that is” or “namely” in front of the subordinate clause under consideration. Thus, for example, the underlined portion of each of the following would count as instances of apposition:

(a) He wanted to help - to save her from destruction

↓

He wanted to help - that is, to save her from destruction

(b) Although she was reluctant, although she was truly hesitant, she still jumped

↓

Although she was reluctant, that is, although she was truly hesitant, she still jumped

Such subordinate clauses should be coded as if they were adverbial clauses, with the gig_dep_on number for the sc marker coded as linking back to the head of the preceding verbal predicate. The only exception here is that the gig_dep code for the subordinate clause should begin with the specific Primary SC Classifier of adv_app, rather than the standard adv_. The Secondary and Tertiary SC Classifiers should then be assigned as normal.

word # word status gig_pos gig_dep gig_dep_on

1 I np pro subj 2

2 wanted

3 to conj_sub 4

4 help sc verb_lex_act obj_nfin 2

5 -

6 to conj_sub 7

7 see sc verb_lex_act adv_app_nfin 4

8 Chomsky’s np noun_prop_gen det 9

9 enemies np noun_com sc_np 10

10 perish sc verb_lex_act obj_nfin 7

11 horrifically adv 10

§ Postmodificatory Apposition

The second kind of apposition comprises a relationship between an np and a subsequent np/sc that appears to be detached from and yet still serves to further specify this np. Note, however, that this is a distinct relationship from dislocation and extraposition, both of which are specified in separate sections below. Note, also, both dislocation and extraposition are logically “superior” to apposition, and should be chosen over an apposition coding where appropriate.

To count as postmodificatory apposition,

o The subsequent np/sc must be co-referential with or included in the reference of the preceding np.

o In the default case, this will mean that the subsequent np/sc could take the place of the preceding np without affecting the overall grammaticality of the clause at hand. Thus, for example, the following underlined material would count as an instance of apposition since it could replace the preceding np:

Linguists love one thing: stealing Chomsky’s ideas

↓

Linguists love stealing Chomsky’s ideas

o Another useful test is that the subsequent np/sc can often be rephrased in terms of a non-restrictive relative clause (e.g. “Chomsky, (who was) my best friend”) or by prefacing the np/sc with the words “namely” or “that is”. For example,

That is the guy: the fool who Chomsky made cry

↓

That is the guy: that is, the fool who Chomsky made cry

All such appositional structures should be coded as if they were postmodifiers of the initial np, using the special gig_dep code detailed in the next paragraph.

For the gig_dep code itself, in the case of an appositional noun phrase, the np marker should be assigned the code of postmod_app. In the case of a subordinate clause, the sc marker should be assigned the specific Primary SC Classifier of postmod_app, rather than the standard postmod_. The Secondary and Tertiary SC Classifiers should then be assigned as normal.

For example,

word # word status gig_pos gig_dep gig_dep_on

1 Linguistics np noun_com subj 2

2 means

3 two num 4

4 things np noun_com dobj 2

5 stealing sc verb_lex_act postmod_app_nfin 4

6 Chomsky’s np noun_prop_gen det 7

7 ideas np noun_com dobj 5

8 trashing verb_lex_act coord 5

9 Chomsky’s np noun_prop_gen det 10

10 ideas np noun_com dobj 8
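Both kinds of apposition are coded by swapping the standard Primary SC Classifier (adv_ or postmod_) for its appositional counterpart while leaving the remaining classifiers untouched. The sketch below shows that substitution on gig_dep strings; the function name is hypothetical, and returning an already appositional code unchanged is an assumption.

```python
def to_appositive(gig_dep: str) -> str:
    """Turn a standard adverbial or postmodifier gig_dep into its appositional form.

    'adv_nfin' -> 'adv_app_nfin'; 'postmod' -> 'postmod_app'. Codes that already
    carry the _app classifier are returned unchanged (an assumption).
    """
    for primary in ("adv", "postmod"):
        if gig_dep == primary or gig_dep.startswith(primary + "_"):
            rest = gig_dep[len(primary):]  # "" or e.g. "_nfin", "_app_nfin"
            if rest == "_app" or rest.startswith("_app_"):
                return gig_dep
            return primary + "_app" + rest
    raise ValueError(f"{gig_dep!r} is not an adverbial or postmodifier coding")
```

Applied to the examples in this section, `to_appositive("adv_nfin")` gives the code used for “to see” (`adv_app_nfin`), and `to_appositive("postmod_nfin")` gives the code used for “stealing” (`postmod_app_nfin`).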

Auxiliary Verbs

Auxiliary verbs are those explicitly classed as either an auxiliary or modal verb by the Macmillan Online English Dictionary. The only exceptions here comprise those word combinations identified here as complex auxiliaries (see the relevant section on complex words below).

Remember, auxiliary verbs only require coding where they are part of an sc. All such auxiliaries should generally receive a gig_pos code of verb_aux and have their gig_dep_on code marked as linking to the lexical verb that instantiates the sc. In the default case, they will also require no gig_dep code.

   

word # word status gig_pos gig_dep gig_dep_on

1 Chomsky np noun_prop subj 2

2 left

3 because conj_sub 6

4 we np pro subj 6

5 were verb_aux 6

6 laughing sc verb_lex_act adv_fin 2

Clefts

Clefts comprise one of two structures: It-Clefts and Wh-clefts, as set out in the Longman Grammar of Spoken and Written English (cf. LGSWE:958-964).

§ It-Clefts

The identifying components of an it-cleft clause are (a) a non-referential “it” in subject position, (b) a “be” form as the clause head, (c) a focused element which may be a noun phrase, a prepositional phrase, an adverb phrase, or an adverbial clause, and (d) a relative-clause-like sc introduced by “that”, “who”/“which”, or no subordinator at all (e.g. “It is Chomsky that we love”). Such clauses should be coded as follows.

o Assign the “it” pronoun the gig_pos code of pro_dum, the gig_dep code of subj, and mark its gig_dep_on as linking to the “be” that heads the clause.

o Assign the “be” whatever codes would normally be required according to the overall status of the it-cleft within the wider text.

o Where the focused element is not an np or sc, then assign the element its normal gig_pos code and no gig_dep code. Finally, where a gig_dep_on code is required, mark this as linking to the “be” that heads the it-cleft clause itself.

o Where the focused element is an np, then assign the np marker its normal gig_pos code, assign it the gig_dep code of foc, and mark its gig_dep_on code as linking to the “be” that heads the it-cleft clause itself. All other components of the np should then be coded as normal.

o Where the focused element is an sc, then assign the sc marker its normal gig_pos code, assign it the gig_dep code of foc_fin or foc_nfin as appropriate, and mark its gig_dep_on code as linking to the “be” that heads the it-cleft clause itself. All other components of this sc should then be coded as normal.

o Assign to the relative-like subordinate clause the Status code of sc, the gig_pos code of verb_lex_act, the gig_dep code of obj_fin_cleft or obj_nfin_cleft as appropriate, and mark its gig_dep_on code as linking back to the “be” that heads the it-cleft clause.

For example,

word # word status gig_pos gig_dep gig_dep_on

1 It np pro_dum subj 2

2 was

3 Chomsky np noun_prop foc 2

4 that np pro_rel subj 5

5 destroyed sc verb_lex_act obj_fin_cleft 2

6 Skinner np noun_prop dobj 5
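The choice of gig_dep codes within an it-cleft depends only on the type of the focused element and on finiteness, so it can be summarised in two small functions. This is a sketch under that reading of the rules above; the function and parameter names are hypothetical.

```python
from typing import Optional


def focus_dep(status: str, finite: Optional[bool] = None) -> str:
    """gig_dep for the focused element of an it-cleft.

    `status` is the Status code of the focused element: "np", "sc", or ""
    for anything else (e.g. a prepositional or adverb phrase).
    """
    if status == "np":
        return "foc"
    if status == "sc":
        if finite is None:
            raise ValueError("finiteness is needed for a focused sc")
        return "foc_fin" if finite else "foc_nfin"
    return ""  # other focused elements receive no gig_dep code


def cleft_clause_dep(finite: bool) -> str:
    """gig_dep for the relative-like subordinate clause of an it-cleft."""
    return "obj_fin_cleft" if finite else "obj_nfin_cleft"
```

In the example table, “Chomsky” is a focused np and so receives `foc`, while the finite clause headed by “destroyed” receives `obj_fin_cleft`.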

§ Wh-Clefts/Pseudo-Clefts

The main identifying components of a wh-cleft clause are a subordinate clause in subject function that is introduced by a wh-word, a form of “be” as the wh-cleft clause head, and a focused element that takes the form of an np or an sc (e.g. “What Chomsky did was rock my world”). Such clauses should be coded as follows:

o Code the “wh”-sc as a standard nominal relative clause according to its wider function in the surrounding clause.

o Assign the “be” whatever gig_pos, gig_dep, and gig_dep_on codes would normally be required according to the clause’s status within the wider text.

o Finally, code the focused sc according to its wider function in the surrounding clause as normal.

word # word status gig_pos gig_dep gig_dep_on

1 What np pro dobj 3

2 Chomsky np noun_prop subj 3

3 loved sc verb_lex_act subj_fin_rel 4

4 was

5 rocking sc verb_lex_act obj_nfin 4

6 my det 7

7 world np noun_com dobj 5

Comparative Phrases & Clauses

Comparative phrases/clauses are those explicitly recognised as such by The Longman Grammar of Spoken and Written English. Where a coding is required, the phrasal/clausal head should be coded according to whether LGSWE classifies it as complementing an adjective (LGSWE:526-9) or an adverb (LGSWE:549-51). If it would be classified as complementing an adjective, code the gig_dep_on number as linking back to the adjective; if it would be classified as complementing an adverb, code the gig_dep_on number as linking back to the adverb.

For comparative clauses, you will also need to assign an appropriate gig_dep code to the sc marker. Again, you should do so using the LGSWE classification. Thus, if the clause complements an adjective, you should assign the Primary SC Classifier of adjmod_; if an adverb, you should code it as advmod_. To this code, you should then add the Secondary SC Classifier of fin or nfin as appropriate. Finally, you should add the Tertiary SC Classifier of _comp.

The exception here comprises cases where the comparative phrase/clause is separated from the element that licenses the comparative by an np head which that element also modifies (e.g. “Chomsky is a better linguist than people think”). In such cases, we treat the comparative phrase/clause as dependent on the np marker, such that it should be coded as a postmodifier of the np. Accordingly, where the comparative comprises a preposition phrase, this should be assigned the gig_dep code of postmod_comp; where it is a clause, this should be assigned the gig_dep of either postmod_fin_comp or postmod_nfin_comp as appropriate.

Note, finally, that you do not need to assign any coding to the preceding adjective or adverb that licenses the clause, except where it forms part of a wider np or sc.

word # word status gig_pos gig_dep gig_dep_on

1 Chomsky np noun_prop subj 2

2 is

3 as

4 good

5 as conj_sub 7

6 you np pro subj 7

7 get sc verb_lex_act adjmod_fin_comp 4

word # word status gig_pos gig_dep gig_dep_on

1 Chomsky np noun_prop subj 2

2 probed

3 so adv 4

4 deeply adv 2

5 Skinner np noun_prop subj 6

6 cried sc verb_lex_act advmod_fin_comp 4
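The gig_dep code for a comparative clause is built from three parts: a primary classifier chosen by what licenses the comparative, a finiteness classifier, and the final comp classifier. A minimal sketch of that composition follows; the function and parameter names are hypothetical, and comparative phrases (as opposed to clauses) are not covered.

```python
def comparative_clause_dep(licensor: str, finite: bool,
                           np_head_intervenes: bool = False) -> str:
    """Compose the gig_dep code for the sc marker of a comparative clause.

    `licensor` is "adj" or "adv". Where an np head modified by the licensing
    element separates it from the clause, the clause is coded as a
    postmodifier of that np instead.
    """
    if np_head_intervenes:
        primary = "postmod"
    elif licensor == "adj":
        primary = "adjmod"
    elif licensor == "adv":
        primary = "advmod"
    else:
        raise ValueError(f"unexpected licensor: {licensor!r}")
    secondary = "fin" if finite else "nfin"
    return f"{primary}_{secondary}_comp"
```

For the two example tables, `comparative_clause_dep("adj", True)` gives `adjmod_fin_comp` and `comparative_clause_dep("adv", True)` gives `advmod_fin_comp`. For “Chomsky is a better linguist than people think”, passing `np_head_intervenes=True` gives `postmod_fin_comp`.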

Complex Words

A complex word will be any of the following:

• Compounds (e.g. “living room”; “time worn”);

• Complex auxiliaries (i.e. “dare (to)”, “ought (to)”, “used to”, “(had) better”, “have to”, “(have) got to”, “be supposed to”, “be going to”);

• Complex determiners (i.e. “a good/great deal of”, “a lot of”, “lots (and lots and lots) of”, “a few (of)” “a good/great many”, “a little”, “plenty of”);

• Complex numerals (e.g. “two-hundred-and-thirty-four”);

• Complex proper nouns (e.g. “The Royal Society for the Protection of Birds”);

• Complex prepositions (e.g. “because of”);

• Complex pronouns (e.g. “each other”);

• Complex subordinators (e.g. “in case”).

All such items are to be treated as single lexical units that are assigned a single part-of-speech. To do so,

1. Assign the overall gig_pos code for that item to each word that makes up the complex; then append the code _complex to each of these codes.

2. Where a gig_dep is required, assign this to the first word in the complex; this is what we treat as the “head” of the complex word.

3. Assign the gig_dep_on number for the “head” word as normal. Then code the gig_dep_on for each subsidiary item within the complex so that the numbers assigned roll back up from the last word in the complex to the first word of the complex.

4. Insert the code extra in the Add_info column for each of the subsidiary elements.

word # word status gig_pos gig_dep gig_dep_on add_info

1 I np pro subj 2

2 ran

3 in conj_sub_complex 8

4 case conj_sub_complex 3 extra

5 you np pro subj 8

6 had verb_aux_complex 8

7 to verb_aux_complex 6 extra

8 leave sc verb_lex_act adv_fin 2
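The four-step procedure for complex words can be sketched as a function that generates the rows for one complex item. This is an illustration only: the function name and the tuple layout (word #, word, status, gig_pos, gig_dep, gig_dep_on, add_info) are assumptions, and the words of the complex are assumed to be adjacent.

```python
from typing import List, Optional, Tuple

# (word #, word, status, gig_pos, gig_dep, gig_dep_on, add_info)
ComplexRow = Tuple[int, str, str, str, str, Optional[int], str]


def code_complex(first_num: int, words: List[str], gig_pos: str,
                 head_dep_on: Optional[int], gig_dep: str = "",
                 status: str = "") -> List[ComplexRow]:
    """Code a complex word whose first word has the number `first_num`.

    Every word gets `gig_pos` + "_complex". The first word is the "head": it
    carries any status and gig_dep and links to `head_dep_on`. Each later word
    links to the word before it and is flagged "extra".
    """
    pos = gig_pos + "_complex"
    rows: List[ComplexRow] = []
    for offset, word in enumerate(words):
        num = first_num + offset
        if offset == 0:
            rows.append((num, word, status, pos, gig_dep, head_dep_on, ""))
        else:
            rows.append((num, word, "", pos, "", num - 1, "extra"))
    return rows
```

Calling `code_complex(3, ["in", "case"], "conj_sub", 8)` reproduces rows 3 and 4 of the table above, and `code_complex(6, ["had", "to"], "verb_aux", 8)` reproduces rows 6 and 7.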

Furthermore, you will need to take account of the following information when it comes to the specific subtype of complex word.

§ Compound Common Nouns

Noun-noun compounds will be those common nouns EITHER (a) normally written as one word or hyphenated, OR (b) listed as a single entry within the online Oxford English Dictionary (e.g. “wet suit”, “double-decker”). Such compounds should be coded using the general procedure for complex words, and assigned the overall gig_pos code of noun_com_complex.

Species names (e.g. “black panther”) should also be treated as noun-noun compounds. Such names are all those listed as a known species within Wikipedia. All such names should be coded using the general procedure for complex words above, here being assigned the overall gig_pos code of noun_com_complex.

Finally, a common noun sequence should also be treated as a compound where it represents what would normally be a complex proper noun but which is being specifically used in this case as a common noun. As before, only one np should be noted here, being inserted next to the first element of the complex. For example,

word # word status gig_pos gig_dep gig_dep_on add_info

1 I np pro subj 2

2 love

3 tree np noun_com_complex dobj 2

4 tops noun_com_complex 3 extra

word # word status gig_pos gig_dep gig_dep_on add_info

1 Blue np noun_com_complex subj 3

2 tits noun_com_complex 1 extra

3 hunt

4 killer np noun_com_complex dobj 3

5 whales noun_com_complex 4 extra

word # word status gig_pos gig_dep gig_dep_on add_info

1 Noam np noun_com_complex subj 3

2 Chomskys noun_com_complex 1 extra

3 are

4 gods

§ Other Compounds

All other compounds should be coded using the standard conventions for complex words. In this case, however, note that any nouns which form a subsidiary part of these compounds should not be coded with an np in the Status column. Instead, they should be left unmarked here.

word # word status gig_pos gig_dep gig_dep_on add_info

1 I np pro subj 2

2 love

3 the det 6

4 mist part_complex premod 6

5 veiled part_complex 4 extra

6 sea np noun_com_complex 2

§ Complex Auxiliaries

Complex auxiliaries comprise instances that are sometimes referred to as marginal auxiliaries and semi-modals.

The specific marginal auxiliaries recognised are: “ought (to)”, “dare (to)”, and “used to”. The specific semi-modals recognised are “(had) better”, “have to”, “(have) got to”, “be supposed to”, and “be going to”.

No other forms are to be treated as complex auxiliaries, even if idiomatically it might feel as if they should. For example, the sequence “be able to” should be coded as a productive form with all its components individually contributing to the gig_pos and gig_dep relations within a sentence. Particularly tricky cases will be given individual codings as listed within the Supplementary Guidance document that accompanies this manual.

All such verb sequences should be coded using the conventions for complex units defined above, with the gig_pos code for the complex being verb_aux_complex. Accordingly, the following verb will then need to be treated as if it were the clause head that would normally result from the combination of an auxiliary and a lexical verb, with any argument relations then directly depending on this element as normal. This may mean that this verb head would not be given the status code of sc or given the gig_dep code of a non-finite clause as it otherwise might.

For example,

word # word status gig_pos gig_dep gig_dep_on add_info

1 I np pro subj 2

2 think

3 I np pro subj 6

4 have verb_aux_complex 6

5 to verb_aux_complex 4 extra

6 follow sc verb_lex_act obj_fin 2

7 Chomsky np noun_prop dobj 6

§ Complex Determiners

A complex determiner is any of the following when accompanied by an np on which they depend: “a good/great deal of”, “a lot of”, “lots (and lots and lots) of”, “a few (of)”, “a good/great many”, “a little”, “plenty of”. All should be coded using the general procedure for complex words outlined above, assigning the overall gig_pos code of det_complex.

word # word status gig_pos gig_dep gig_dep_on add_info

1 A det_complex 4

2 great det_complex 1 extra

3 many det_complex 2 extra

4 grammarians np noun_com subj 5

5 ate

6 all det 8

7 the det 8

8 vowels np noun_com_complex dobj 5

§ Complex Numerals

A complex numeral is simply a complex number such as “twenty-three” or “four hundred and eighty-two”. All such numbers should be treated using the general procedure for complex words. Note, furthermore, that where this number itself instantiates a noun phrase, then it should be treated as a single noun phrase, with the np marker assigned to the first element of the complex.

For example,

word # word status gig_pos gig_dep gig_dep_on add_info

1 I np pro subj 2

2 bought

3 two num_complex 7

4 hundred num_complex 3 extra

5 and num_complex 4 extra

6 thirty num_complex 5 extra

7 books np noun_com dobj 2

§ Complex Prepositions

Complex prepositions are all those listed as such within The Longman Grammar of Spoken and Written English (LGSWE: 75-6). All such items should be coded using the standard procedure for complex words, and assigned the overall gig_pos code of prep_complex. Furthermore, the gig_dep_on for the element which depends on the complex preposition should be coded as linking back to the first word of the complex.

word # word status gig_pos gig_dep gig_dep_on add_info

1 Linguists noun_com subj 5

2 such prep_complex postmod 1

3 as prep_complex 2 extra

4 Chomsky np noun_prop prepobj 2

5 rule

§ Complex Pronouns

A complex pronoun will be one of two things. Firstly, it will be any element explicitly recognised above as a complex determiner, but which is not used within the current piece of text as dependent on a following np (e.g. “a few” in “I bought a few”). Secondly, it will be any element which is both counted as a pronoun by the Macmillan online dictionary and which is spelled as two (or more) distinct words.

All such items should be coded using the standard procedure for complex words, being assigned here the overall gig_pos code of pro_complex. Note that, as with all complex nominal elements, only one np should be assigned to the compound as a whole, with this added to the first element of the complex pronoun.

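| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | Noam | np | noun_prop | subj | 4 |
| 2 | and | | conj_coord | | 3 |
| 3 | Carol | np | noun_prop | subj | 4 |
| 4 | love | | | | |
| 5 | each | np | pro_complex | dobj | 4 |
| 6 | other | | pro_complex | | 5 |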

§ Complex Proper Nouns

Complex proper nouns, which include titles alongside the full names of people, places, and organizations, should be treated using the general procedure for complex words, being assigned the overall gig_pos code of noun_prop_complex. Furthermore, you should only assign a single np to the first element of the complex.

Note that we include any definite articles that are inseparable from the proper noun at hand (e.g. “The Royal Society for the Protection of Birds”).

Note that we also count addresses (e.g. “Exeter University, Exeter, EX1 2LU”), dates (e.g. “4th July 2020”), honorifics (e.g. “Lady Macbeth”, “Professor Chomsky”), and full personal names (e.g. “Uncle Dave”; “Aunt Anna”) as complex proper nouns.

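| word # | word | status | gig_pos | gig_dep | gig_dep_on | add_info |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Noam | np | noun_prop_complex | subj | 3 | |
| 2 | Chomsky | | noun_prop_complex | | 1 | extra |
| 3 | rules | | | | | |
| 4 | The | np | noun_prop_complex | dobj | 3 | |
| 5 | Royal | | noun_prop_complex | | 4 | extra |
| 6 | Academy | | noun_prop_complex | | 5 | extra |
| 7 | of | | noun_prop_complex | | 6 | extra |
| 8 | Linguists | | noun_prop_complex | | 7 | extra |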


§ Complex Subordinators

Complex subordinators are those word combinations recognised as subordinators within LGSWE (cf. LGSWE: 85-6). All such items should be coded using the standard procedure for complex words, being assigned the overall gig_pos code of conj_sub_complex.

For example,

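| word # | word | status | gig_pos | gig_dep | gig_dep_on | add_info |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | We | np | pro | subj | 2 | |
| 2 | cringed | | | | | |
| 3 | in | | conj_sub_complex | | 6 | |
| 4 | case | | conj_sub_complex | | 3 | extra |
| 5 | Chomsky | np | noun_prop | subj | 6 | |
| 6 | was | sc | verb_lex_act | adv_fin | 2 | |
| 7 | mad | | adj | | 6 | |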


Coordinators & Coordination

Within the present manual, coordination is treated as distinct from two other functions: ellipsis and gapping. It is also logically prior to these functions, and should be selected over either.

Note, also, that coordination need not be explicitly marked by a coordinator, with “list” structures also counted as coordinations for present purposes. Thus, the underlined noun phrases in “I bought a book and a pen” and “I bought a book, a pen” each represent instances of coordination. The only difference is that the latter has no explicit coordinator that requires explicit marking.

§ Treatment of Simple Coordinators

Simple coordinators are all those conjunctions which establish a coordinative relationship between two units, such as “and”, “but”, “or”, “so”. Note that coordinators should only receive a gig_pos code (i.e. conj_coord) and a gig_dep_on code where they either connect an np/sc or are part of a wider np/sc. In the case of correlative coordination (e.g. “neither Chomsky nor Ross”), both coordinators should be assigned the gig_pos code of conj_coord.

In the case of simple coordinators, these should always have their gig_dep_on code marked as linking to the head word of the coordinate they introduce.

Note, furthermore, that coordinators do not normally require a gig_dep code.

For the special cases where the word “not” is used as a coordinator, see the subsection entitled coordinative “not” within the section on negation with “not” below.

§ Treatment of Correlative Coordination

Where such cases occur, the initial correlative element should simply have its gig_dep_on code marked as linking to the first coordinate, whilst the following coordinator should have it coded as linking to the subsequent coordinate. That is, each should be coded as linking to the specific coordinate they introduce. For example, in “Chomsky loves either Piaget or Skinner”, “either” would be coded as linking to “Piaget”, and “or” to “Skinner”.

Note that in cases of correlative coordination, both elements should be assigned the standard coordinator gig_pos code of conj_coord.

As with their “simple” counterparts, correlative coordinators will also not normally require a gig_dep code.

For the special cases where the word not is used as a correlative coordinator, see the separate section of negation with “not” below entitled coordinative “not”.

§ Treatment of Coordinates

Each coordinated element should be assigned its usual gig_pos code. Thus, in the examples above (“I bought a book and a pen” and “I bought a book, a pen”), both “book” and “pen” would receive their usual gig_pos code of noun_com.

Each coordinated element should also be assigned the same gig_dep code. This will be whatever dependency applies to the coordinates considered independently, and only where such a code would normally be assigned. Thus, if three premodifiers are coordinated (“The big and blue and beautiful linguist”), each coordinated element should be coded as premod, since all cases of premodification require a gig_dep code. Conversely, if two determiners are coordinated, neither would receive a gig_dep code, since determiners do not generally require one. Note, however, the case of coordinated verbs and nominals where only the first element in the sequence is marked with an sc/np in the Status column. Here, only the element marked with an sc/np should receive a distinct gig_dep function, with the remaining coordinated elements coded as coord.

Finally, the gig_dep_on code for each coordinated element should generally be marked as directly linking to the item from which their gig_dep code has been determined. Thus, both “book” and “pen” in “I bought a book and a pen” would have their gig_dep_on code marked as linking them back to “bought” since both would be coded as direct objects of this verb.

The only exceptions to this gig_dep_on rule are cases where (a) two or more verb phrases have been coordinated under one subject, such that only the first has been marked with an sc, and (b) two or more nouns have been coordinated such that only the first is marked with an np. These cases are described in the Noun Phrase Overview & Subordinate Clause Overview sections above. For all such cases, the coordinated nouns and verbs should simply be coded as linking to the first nouns and verbs in their respective sequences, remembering that no status code is required since we do not count the subsequent nouns and verbs as marking a separate np and sc. Thus, in “I saw him laugh and cry”, “cry” would receive no status code, and have its gig_dep_on marked as linking back to “laugh”. Meanwhile, in “I like these cats and dogs and mice”, neither “dogs” nor “mice” would receive a status code, and both would have their gig_dep_on codes marked as linking back to “cats”.

General Coordination Example

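| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | All | | det | | 5 |
| 2 | good | | adj | premod | 5 |
| 3 | and | | conj_coord | | 4 |
| 4 | sensible | | adj | premod | 5 |
| 5 | linguists | np | noun_com | subj | 6 |
| 6 | love | | | | |
| 7 | grammar | np | noun_com | dobj | 6 |
| 8 | and | | conj_coord | | 9 |
| 9 | Chomsky | np | noun_prop | dobj | 6 |
| 10 | passionately | | | | |
| 11 | and | | | | |
| 12 | intensely | | | | |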
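
The general rule for coordinates and simple coordinators can be summarised as a short sketch. This is an illustrative Python fragment under stated assumptions, not a project tool: `code_coordination` and its dictionary layout are hypothetical names, and the sketch covers only the general case, not the exception for coordinated nouns and verbs that share a single np or sc marker.

```python
def code_coordination(coordinates, coordinators, gig_dep, depends_on):
    """Return gig_dep / gig_dep_on codes for a simple coordination.

    coordinates  -- word numbers of the coordinated elements
    coordinators -- maps each coordinator's word number to the word
                    number of the coordinate it introduces
    gig_dep      -- the dependency shared by every coordinate ("" if none)
    depends_on   -- word number of the item all coordinates depend on
    """
    rows = {}
    for word_number in coordinates:
        # Every coordinate gets the same gig_dep and links to the same item.
        rows[word_number] = {"gig_dep": gig_dep, "gig_dep_on": depends_on}
    for coordinator, introduced in coordinators.items():
        # A coordinator takes no gig_dep and links to the coordinate it introduces.
        rows[coordinator] = {"gig_pos": "conj_coord", "gig_dep": "",
                             "gig_dep_on": introduced}
    return rows


# "... love grammar and Chomsky": words 7 and 9 are direct objects of
# "love" (word 6); "and" (word 8) introduces "Chomsky".
rows = code_coordination(coordinates=[7, 9], coordinators={8: 9},
                         gig_dep="dobj", depends_on=6)
print(rows)
```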


Cross-Sentence Dependencies

As you will discover, student punctuation is often inconsistent with the norms of formal written English. This is entirely to be expected, of course, since (a) it is precisely these norms which they are developing and (b) the writing context does not require such English.

However, it also means that there will be instances of dependencies that extend across sentence boundaries. Such cases should basically be coded as if there was no intervening sentence marker. The one exception here is that the gig_dep_on cell for that element should be coded by combining the Sentence and Word_number entries for the element on which the current part-of-speech depends. For example, if the first sentence was “Chomsky died” and the second “Collapsing my world”, then the gig_dep_on code for “Collapsing” would be “1_2”.

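| sentence # | word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | Chomsky | np | noun_prop | subj | 2 |
| 1 | 2 | died | | | | |
| 1 | 3 | . | | | | |
| 2 | 1 | Collapsing | sc | verb_lex_act | adv_nfin | 1_2 |
| 2 | 2 | my | | det | | 3 |
| 2 | 3 | world | np | noun_com | dobj | 1 |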
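
Anyone processing the finished csv files will need to read both kinds of gig_dep_on value. The sketch below shows one way this might be done in Python. The function names are hypothetical, and the sketch assumes that a plain number always refers to a word in the current sentence and that a combined value always takes the form sentence_word.

```python
def parse_gig_dep_on(value, current_sentence):
    """Return (sentence_number, word_number) for a gig_dep_on cell.

    A plain value such as "3" points within the current sentence; a
    combined value such as "1_2" points to word 2 of sentence 1.
    """
    value = value.strip()
    if "_" in value:
        sentence, word = value.split("_", 1)
        return int(sentence), int(word)
    return current_sentence, int(value)


def format_gig_dep_on(target_sentence, target_word, current_sentence):
    """Write a gig_dep_on cell, adding the sentence prefix only when needed."""
    if target_sentence == current_sentence:
        return str(target_word)
    return f"{target_sentence}_{target_word}"


# "Collapsing" (sentence 2) depends on "died" (sentence 1, word 2).
print(parse_gig_dep_on("1_2", current_sentence=2))
print(format_gig_dep_on(1, 2, current_sentence=2))
```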


Determiners

Determiners are essentially all those elements explicitly classed as either a determiner or a predeterminer by the Macmillan Online Dictionary. However, you should also refer to the rules for complex determiners, as specified in the relevant section on complex words above.

Note that we make no distinction here between determiner types. Moreover, for the present manual, any part-of-speech identified by Macmillan as a determiner or a predeterminer is simply a determiner for our purpose, receiving the gig_pos prefix of det and each having their own gig_dep_on coded as directly linking to the head of the noun phrase.

Finally, note that whilst determiners should not be assigned a gig_dep code, they should be assigned a gig_dep_on code. This should be marked as linking to the Word_number for the np on which they depend.

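| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | The | | det | | 2 |
| 2 | linguist | np | noun_com | subj | 3 |
| 3 | ate | | | | |
| 4 | all | | det | | 6 |
| 5 | my | | det | | 6 |
| 6 | homework | np | noun_com | dobj | 3 |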
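
The two determiner rules (no gig_dep code, and a gig_dep_on that points at an np) lend themselves to a simple consistency check. The following Python sketch is purely illustrative: `check_determiners` and the row layout are hypothetical, and the check deliberately ignores complex determiners, which follow the separate procedure for complex words.

```python
def check_determiners(rows):
    """Return a list of problems found in simple determiner rows.

    Each row is a dict with the keys word_number, status, gig_pos,
    gig_dep and gig_dep_on (an int). Only rows coded as "det" are checked.
    """
    by_number = {row["word_number"]: row for row in rows}
    problems = []
    for row in rows:
        if row["gig_pos"] != "det":
            continue
        if row["gig_dep"]:
            problems.append(f"word {row['word_number']}: determiner has a gig_dep code")
        head = by_number.get(row["gig_dep_on"])
        if head is None or head["status"] != "np":
            problems.append(f"word {row['word_number']}: gig_dep_on does not point to an np")
    return problems


def make_row(word_number, status="", gig_pos="", gig_dep="", gig_dep_on=None):
    return {"word_number": word_number, "status": status, "gig_pos": gig_pos,
            "gig_dep": gig_dep, "gig_dep_on": gig_dep_on}


# "The linguist ate all my homework", coded as in the table above.
rows = [
    make_row(1, gig_pos="det", gig_dep_on=2),
    make_row(2, "np", "noun_com", "subj", 3),
    make_row(3),
    make_row(4, gig_pos="det", gig_dep_on=6),
    make_row(5, gig_pos="det", gig_dep_on=6),
    make_row(6, "np", "noun_com", "dobj", 3),
]
print(check_determiners(rows))
```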


Direct Speech (& Reporting Clauses)

The treatment of direct speech differs according to whether the particular instance takes the form of a noun phrase, a subordinate clause, or some other piece of linguistic material.

§ Direct Speech and Reporting Clauses

Before considering these specific cases, however, it should again be noted that reporting clauses are not counted as instances of a subordinate clause within the present manual. As such, they should not generally receive any explicit sc coding.

Reporting clauses are all those clauses explicitly classed as such by The Longman Grammar of Spoken and Written English (cf. LGSWE:196-7). These clauses serve to both accompany and signal instances of direct speech, and are specifically distinguished by their capacity to be displaced around this speech but without affecting its overall grammaticality. Thus, in the example below, “Chomsky shouted” would constitute a reporting clause since it can be displaced so that it appears after the speech:

Chomsky shouted “Run!” → “Run!” Chomsky shouted

However, it would not constitute a reporting clause in the following example, since displacing it affects the material’s overall grammaticality:

He heard Chomsky shouted “run!” → *He heard “run!” Chomsky shouted

Direct speech should then be coded according to the more specific rules below.

For example,

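| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | Chomsky | np | noun_prop | subj | 2 |
| 2 | shouted | | | | |
| 3 | “legends!” | np | noun_com | speech | 0 |
| 4 | but | | | | |
| 5 | everyone | np | pro | subj | 6 |
| 6 | swore | | | | |
| 7 | that | | conj_sub | | 9 |
| 8 | he | np | pro | subj | 9 |
| 9 | shouted | sc | verb_lex_act | dobj_fin | 6 |
| 10 | “idiots” | np | noun_com | speech | 9 |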


§ Noun Phrases as Direct Speech

Where the material constitutes a noun phrase, the head of this phrase should (a) be marked as normal with an np in the Status column, (b) receive its normal gig_pos coding, and (c) receive the special gig_dep code of speech. All other parts of the noun phrase should then be coded as usual.

Furthermore, where the noun phrase is not dependent on a wider structure, then the np marker should receive the gig_dep_on code of 0. Where it is introduced as part of a wider structure, the np marker should have its gig_dep_on code marked as linking to the element that instantiates that structure.

For example,

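| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | I | np | pro | subj | 2 |
| 2 | laughed | | | | |
| 3 | because | | conj_sub | | 5 |
| 4 | Chomsky | np | noun_prop | subj | 5 |
| 5 | shouted | sc | verb_lex_act | adv_fin | 2 |
| 6 | “the | | det | | 7 |
| 7 | fools!” | np | noun_com | speech | 5 |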


§ Clauses as Direct Speech

Direct speech in the form of a clause should be treated as a subordinate clause in only two cases. Firstly, where it takes the form of a subordinate clause that is not itself dependent on a wider structure (“oh that I should die happy, Chomsky shouted”). Secondly, where it constitutes a clause that is introduced by material that cannot be construed as a reporting clause (“I heard him shout I want Skinner to die miserably”).

In the first case, the head of the speech clause should (a) be coded as normal with an sc marker, (b) receive its normal gig_pos code, and (c) receive the special gig_dep code of speech_fin or speech_nfin according to whether it is finite or non-finite. It should also be assigned the gig_dep_on code of 0. All other parts of the sc should be coded as normal.

In the second case, the head of the speech clause should (a) be marked with an sc in the Status column, (b) receive its normal gig_pos code, and (c) receive the special gig_dep code of speech_fin or speech_nfin according to whether it is finite or non-finite. Furthermore, it should have its gig_dep_on marked as linking to the element on which it is dependent. All other parts of this sc should then be coded as if it were a normal subordinate clause.

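| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | I | np | pro | subj | 2 |
| 2 | adore | | | | |
| 3 | people | np | noun_com | dobj | 2 |
| 4 | who | | pro_rel | subj | 5 |
| 5 | shout | sc | verb_lex_act | postmod_fin_rel | 3 |
| 6 | “Chomsky | np | noun_prop | subj | 7 |
| 7 | wins!” | sc | verb_lex_act | speech_fin | 5 |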


§ Other Material as Direct Speech

No other direct speech material will require coding except where it also forms part of a wider np or sc. Where it is so dependent, then this material should (a) receive its normal gig_pos code, (b) receive its usual gig_dep coding, and (c) have its gig_dep_on code marked as linking to whichever element the direct speech is dependent on.

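| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | I | np | pro | subj | 2 |
| 2 | laughed | | | | |
| 3 | because | | conj_sub | | 5 |
| 4 | Chomsky | np | noun_prop | subj | 5 |
| 5 | shouted | sc | verb_lex_act | adv_fin | 2 |
| 6 | “Agghh!” | | int | | 5 |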


Discontinuous Structures

Occasionally you will encounter instances of discontinuous dependents of an np or sc. That is, material which overall constitutes one unit, but which has been separated from the np/sc in the construction of the sentence. For example, although separated in terms of their sequencing, the underlined material in “Rumours spread through the clique that Chomsky was losing” would otherwise constitute a single np: “Rumours that Chomsky was losing”.

Such material should be treated as if it were one contiguous structure. To do so, you should simply code each element as normal, making sure that the gig_dep_on code instantiates the appropriate dependency relationship; in effect, reaching over the intervening material.

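| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | Rumours | np | noun_com | subj | 2 |
| 2 | spread | | | | |
| 3 | through | | | | |
| 4 | the | | det | | 5 |
| 5 | clique | np | noun_com | prepobj | 3 |
| 6 | that | | conj_sub | | 8 |
| 7 | Chomsky | np | noun_prop | subj | 8 |
| 8 | lost | sc | verb_lex_act | postmod_fin | 1 |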


Dislocations

Dislocated material refers to any np that appears in a peripheral position at the edge of a clause and which is co-referential with a pronoun that appears elsewhere in this same clause. For example, “That book, I hate it”.

This material should be coded as normal, except that the gig_dep code for the head of the np will be disl and the gig_dep_on will be coded as being identical to that of the np with which it co-refers.

For example,

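| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | Chomsky | np | noun_prop | disl | 3 |
| 2 | he | np | pro | subj | 3 |
| 3 | plays | | | | |
| 4 | dirty | | | | |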


Ellipsis

Ellipsis covers instances where otherwise obligatory linguistic material has been deliberately omitted without making the remaining material ungrammatical.

Note that ellipsis is both distinct from coordination, and treated such that a coordination analysis should always take precedence over an ellipsis analysis. Thus, for example, you should code the underlined material in “I thought Chomsky would and could love me” as a case of two coordinated auxiliary verbs, rather than two ellipted verb phrases. That is, it is not an ellipted version of “I thought Chomsky would love me and could love me”.

It is also distinct from instances of gapping, as specified in the relevant section below. Thus, whilst the underlined material in “John bought a book but I didn’t” constitutes a case of ellipsis, the underlined material in “John bought a book and James a pen” constitutes a case of gapping.

Most importantly, words within a sentence will only require an explicit ellipsis coding where (a) they are dependent on a piece of ellipted material, and (b) the ellipted material would itself require explicit coding were it actually present. Thus, the underlined material in “I know you loved Chomsky and I didn’t” would require an explicit ellipsis coding since it depends on a missing verb that would otherwise be coded as instantiating a subordinate clause.

All such cases should be coded on the model of “forgotten” material, as specified in the grammatical anomalies section above. There are only two exceptions here:

a. You should not use the _gram code. Instead, cases of ellipsis should be marked by appending the code _ellipsis to the gig_dep cell for the particular element which marks the ellipsis at hand.

b. You should take special note of cases where an ellipted subordinate clause is marked by a residual auxiliary verb. For example, the underlined “is” in “I think Skinner isn’t laughing but Chomsky is”. Where such material occurs, the default rule is to assign the replacement sc coding to this auxiliary verb, even where it is followed by some residual material that might otherwise be taken as marking the ellipsis.

Thus, the underlined “is” in “I think Chomsky is laughing but Skinner is not”, would receive the Status code of sc, the gig_pos code of verb_aux, the gig_dep code of obj_fin_ellipsis, and have its gig_dep_on marked as linking to “think”. In turn, the “not” would receive no Status or gig_dep code, but be assigned the gig_pos code of adv and have its gig_dep_on marked as linking back to the “is” that has been assigned the replacement sc marker.

Finally, we do not count the isolated pro-form “so” as marking a case of ellipsis, as in “No, I don’t think so” (see LGSWE: 72). This should simply be coded as if it were a normal adverb. This is in contrast to the pro-form “do so”, as in “I don’t think he did so”. Here, however, it is the “do” auxiliary that marks the ellipsis, with the “so” still being coded as a normal adverb.


Example

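| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | I | np | pro | subj | 2 |
| 2 | know | | | | |
| 3 | he | np | pro | subj | 4 |
| 4 | loves | sc | verb_lex_act | obj_fin | 2 |
| 5 | Chomsky | np | noun_prop | dobj | 4 |
| 6 | and | | conj_coord | | 9 |
| 7 | that | | conj_sub | | 9 |
| 8 | you | np | pro | subj | 9 |
| 9 | do | sc | verb_aux | obj_fin_ellipsis | 2 |
| 10 | too | | adv | | 9 |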

 


Existential Clauses

Existential clauses are all those classified as such within The Longman Grammar of Spoken and Written English (cf. LGSWE: 943-956); in other words, any clause which has existential “there” as its subject (e.g. “There was a man in the pub”; “there is a linguist sitting on my chair”). Such material should be coded as follows.

Firstly, regarding the “there”, assign an np marker in the Status column for this word, assign it the gig_pos code of pro_dum, and assign both its gig_dep and gig_dep_on codes as normal.

Secondly, assign an np marker to the noun phrase that follows the head of the clause. Then assign this np the special gig_dep code of subj_not, and code its gig_dep_on as linking back to the clause head.

Finally, code any remaining parts of the clause as normal.

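| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | There | np | pro_dum | subj | 2 |
| 2 | was | | | | |
| 3 | clear | | adj | premod | 4 |
| 4 | sky | np | noun_com | subj_not | 2 |
| 5 | above | | | | |
| 6 | us | np | pro | prepobj | 5 |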


Extraposed Clauses

Extraposition comprises instances where a subordinate clause appears outside of a core argument position but is also exchangeable with a non-referential “it” pronoun that occupies one of these positions in the surrounding material [cf. LGSWE: 155].

The sc marker for such clauses should be coded using the extr_ code specified in the gig_dep code list below, combining this code with the relevant Secondary and Tertiary SC Subclassifiers as normal (e.g. extr_fin). Furthermore, the gig_dep_on for the extraposed sc should be coded as linking back to the sc marker for the clause. Finally, this “it” pronoun with which the sc is exchangeable should receive the special gig_pos code of pro_dum.

For example,

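| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | It | np | pro_dum | subj | 2 |
| 2 | is | | | | |
| 3 | obvious | | | | |
| 4 | that | | conj_sub | | 7 |
| 5 | Chomsky | np | noun_prop | subj | 7 |
| 6 | was | | verb_aux | | 7 |
| 7 | manipulated | sc | verb_lex_pass | extr_fin | 2 |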


Fronted Material

Fronted material refers to material which has EITHER been displaced from its canonical position so that it appears before the clause subject OR which is simply an adverbial that appears before the clause subject rather than in a later position. For example, the underlined material in “Now, Chomsky I do love” and “If you ask me a million times, the answer will always be Chomsky”.

Such material should only be coded as being “fronted” where it would normally be assigned a gig_dep code. Where this is the case, all codes should be assigned as normal, except that the suffix _fronted should be appended to the gig_dep code for the element that heads the material.

Note, however, just as is the case with subject-verb inversions, a fronted material coding only applies to declarative structures. Interrogative or exclamative structures require no special _fronted coding.

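| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | This | | det | | 2 |
| 2 | morning | np | noun_com | adv_fronted | 6 |
| 3 | without | | | | |
| 4 | hesitation | np | noun_com | prepobj | 3 |
| 5 | Chomsky | np | noun_prop | subj | 6 |
| 6 | prevailed | | | | |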


Gapping

Gapping comprises material that would otherwise be coded as arguments of a coordinated verb phrase, except for the fact that the required verbal material has been deliberately “gapped” in order to make the sentence more cohesive. For example, the underlined material in “I think Chomsky likes apples and Skinner oranges”.

All such material should be coded as follows.

o Firstly, do not treat the gapped material as itself requiring an sc marker, even where this would otherwise be the case. As such, you will never need to assign an additional sc marker to account for this material. Moreover, you will not need to assign the gig_dep code that would normally be required to mark the external dependency of the gapped clause.

o Secondly, assign the standard Status markers and gig_pos codes to all of the remaining material, exactly as if the omitted verb were present. This includes any material that would otherwise receive an np or sc marking due to its direct dependence on the omitted verb. Thus, in the preceding example, you would still assign an np marker to “Skinner” and “oranges”, since that is what you would do were the gapped verb “likes” actually present.

o Thirdly, assign the standard gig_dep codes to any remaining material exactly as if the omitted verbal material were present. The only exception here comprises any np/sc markers which would otherwise be coded as directly dependent on the missing verb. Whilst these should again be assigned their standard gig_dep codes, you will also need to append to this annotation the special code _gap. Thus, in the current example, “Skinner” would receive the gig_dep code of subj_gap and “oranges” the gig_dep code of dobj_gap, to reflect the fact that these remain the subj and dobj of the gapped verb “likes”.

o Finally, assign the various gig_dep_on codes to the remaining material exactly as you would if the gapped verb were actually present. The only exceptions here are (a) all the elements that would otherwise directly depend on the omitted verb, and (b) the coordinator used to introduce the gapped material.

o In the case of any direct dependents, these should simply be coded as linking back to the head of the clause with which the gapped material has been coordinated. Thus, for example, in the preceding, both “Skinner” and “oranges” would each have their gig_dep_on code marked as linking back to “likes”.

o In the case of the coordinator, where present, this should simply be linked to the first word within the gapped material that functions as an argument of the omitted verbal material. Thus, in the preceding example, the “and” would be marked as linking to “Skinner”, since this is the head of the np that would otherwise function as the direct argument of the omitted verb “likes”.

For example,


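| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | I | np | pro_pers | subj | 2 |
| 2 | think | | | | |
| 3 | Chomsky | np | noun_prop | subj | 4 |
| 4 | likes | sc | verb_lex_act | obj_fin | 2 |
| 5 | apples | np | noun_com | dobj | 4 |
| 6 | and | | conj_coord | | 7 |
| 7 | Skinner | np | noun_prop | subj_gap | 4 |
| 8 | oranges | np | noun_com | dobj_gap | 4 |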
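
The gig_dep and gig_dep_on steps for gapped material can be condensed into a short sketch. The Python below is an illustration only, with hypothetical names (`code_gapped_arguments` and its parameters). It assumes that the arguments are supplied in sentence order, so that the first one is the word the coordinator should link to.

```python
def code_gapped_arguments(arguments, coordinated_head, coordinator=None):
    """Return gig_dep / gig_dep_on codes for the remnants of a gapped clause.

    arguments        -- (word_number, gig_dep) pairs, in sentence order, for
                        the np/sc heads that depend on the omitted verb
    coordinated_head -- word number of the head of the clause with which the
                        gapped material is coordinated
    coordinator      -- word number of the introducing coordinator, if any
    """
    rows = {}
    for word_number, gig_dep in arguments:
        # Keep the usual dependency, append _gap, and link to the overt verb.
        rows[word_number] = {"gig_dep": f"{gig_dep}_gap",
                             "gig_dep_on": coordinated_head}
    if coordinator is not None and arguments:
        # The coordinator links to the first argument of the omitted verb.
        rows[coordinator] = {"gig_pos": "conj_coord", "gig_dep": "",
                             "gig_dep_on": arguments[0][0]}
    return rows


# "I think Chomsky likes apples and Skinner oranges"
rows = code_gapped_arguments([(7, "subj"), (8, "dobj")],
                             coordinated_head=4, coordinator=6)
print(rows)
```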


Genitival NPs

Genitival noun phrases comprise any instance of a noun phrase instantiated by a common/proper noun that either has or should have been marked with a possessive apostrophe (e.g. “Chomsky’s book”, “a linguist’s obligation”; “the students’ dilemma”).

There are two types of genitival np as detailed in the Longman Grammar of Spoken and Written English: specifying genitives, and classifying genitives (cf. LGSWE: 294-295).

In most cases, the genitival noun phrases you encounter will be specifying genitives. Such cases function as the determiner of the subsequent np, with the whole genitival noun phrase thereby replacing any determiners that this np would normally require.

However, you may also encounter so-called classifying genitives. Such instances instead function as modifiers of the subsequent np. They thereby come between this np and any determiners that it would normally require.

Whilst distinguishing between specifying and classifying genitives can often be difficult, a core marker is whether the genitival noun is followed by another modifier. If it is, as the word “linguist’s” is in “the beautiful linguist’s disgusting foot”, then you should treat the genitival noun as a specifying genitive.

Another useful test is to consider which nouns an adjective would apply to, either where such an adjective is actually present or where you imagine such an adjective to be present. If the adjective can be treated as modifying the non-genitival noun, then you should treat the genitival noun as classifying. If instead, the adjective can be treated as modifying the genitival noun, then you should treat this noun as specifying.

For example, in “the nasty ship’s cat”, the word “nasty” seems to modify “cat”; that is, it is the cat that is nasty, not the ship. This makes “ship’s” a classifying genitive. Accordingly, you would treat “ship’s” as a single word np that modifies cat, with both “the” and “nasty” also depending on “cat” as normal.

However, we acknowledge that such cases can be very difficult to decide. Accordingly, where in doubt, you should simply code genitival noun phrases as if they were specifying genitives, noting the ambiguity in the add_info column as appropriate.

Having decided which type of genitive is present, you should code as follows. For both types,

o Firstly, assign this noun the gig_pos code of noun_com_gen or noun_prop_gen if it represents a simple common or proper noun; or noun_com_complex_gen or noun_prop_complex_gen if a complex noun.

o Secondly, assign the gig_dep_on code for this element as linking to the subsequent np on which it depends.

Next, for specifying genitives,

o Assign the genitival noun the status code of np

o Assign the genitival noun the special gig_dep code of det.

o Identify any preceding dependents of the genitival noun, and code them as if they were normal dependents, with their gig_dep_on codes all linking to this genitival noun.


Conversely, for classifying genitives,

o Assign the genitival noun the standard gig_dep code of premod.

o Code any preceding dependents as normal, with their gig_dep_on codes instead “hopping” over the genitival noun, so that they instead link to the np on which the genitival noun itself depends.

Note, however, that your csv files will have arranged the possessive apostrophe so that it receives its own line as part of the automatic parsing process. You should simply ignore this line and leave it blank. All the relevant information will have been coded in the other rows.

For example,

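| word # | word | status | gig_pos | gig_dep | gig_dep_on | add_info |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | The | | det | | 2 | |
| 2 | cat | np | noun_com | subj | 3 | |
| 3 | ate | | | | | |
| 4 | the | | det | | 6 | |
| 5 | greedy | | adj | premod | 6 | |
| 6 | captain | np | noun_com_gen | det | 8 | |
| 7 | 's | | | | | |
| 8 | food | np | noun_com | dobj | 3 | |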
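
Once the type of genitive has been decided, the gig_pos and gig_dep codes follow mechanically from the rules above. The Python sketch below makes that mapping explicit. It is an illustration only: `genitive_codes` is a hypothetical name, and the function covers just the two codes, leaving the Status column and the gig_dep_on links to the annotator.

```python
def genitive_codes(noun_type, is_complex, genitive_type):
    """Return (gig_pos, gig_dep) for a genitival noun.

    noun_type     -- "com" for a common noun, "prop" for a proper noun
    is_complex    -- True if the noun is a complex noun
    genitive_type -- "specifying" or "classifying"
    """
    if noun_type not in ("com", "prop"):
        raise ValueError(f"unknown noun type: {noun_type!r}")
    if genitive_type not in ("specifying", "classifying"):
        raise ValueError(f"unknown genitive type: {genitive_type!r}")
    middle = "_complex" if is_complex else ""
    gig_pos = f"noun_{noun_type}{middle}_gen"
    # Specifying genitives act as determiners; classifying ones as premodifiers.
    gig_dep = "det" if genitive_type == "specifying" else "premod"
    return gig_pos, gig_dep


# "the greedy captain's food": a simple common noun, specifying genitive.
print(genitive_codes("com", False, "specifying"))
```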


Gerunds

There is no specific code required for gerunds. These forms should simply be coded as if they were common nouns. (See also the section on participle forms below).

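| word # | word | status | gig_pos | gig_dep | gig_dep_on |
| --- | --- | --- | --- | --- | --- |
| 1 | The | | det | | 2 |
| 2 | questioning | np | noun_com | subj | 5 |
| 3 | of | | prep | postmod | 2 |
| 4 | Chomsky | np | noun_prop | prepobj | 3 |
| 5 | is | | | | |
| 6 | a | | det | | 8 |
| 7 | pointless | | adj | premod | 8 |
| 8 | game | np | noun_com | predsubj | 5 |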


Independent NPs and SCs

This material comprises instances of noun phrases and subordinate clauses which cannot be construed as being dependent on any wider sequence, such that they cannot clearly be assigned any of the normal clause functions. For example, the underlined phrases in: “Everything is gone. No hope. No love. Some truth.”

All such cases should be assigned their normal gig_pos code, be given a gig_dep code of head, and have their gig_dep_on marked as 0.

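| sentence # | word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|---|
| 1 | 1 | Everything | np | pro | subj | 2 |
| 1 | 2 | is | | | | |
| 1 | 3 | gone | | | | |
| 1 | 4 | . | | | | |
| 2 | 1 | No | | det | | 2 |
| 2 | 2 | hope | np | noun_com | head | 0 |
| 2 | 3 | . | | | | |
| 3 | 1 | No | | det | | 2 |
| 3 | 2 | love | np | noun_com | head | 0 |
| 3 | 3 | . | | | | |
| 4 | 1 | A | | det | | 3 |
| 4 | 2 | brutal | | adj | premod | 3 |
| 4 | 3 | truth | np | noun_com | head | 0 |
| 4 | 4 | . | | | | |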
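The head/0 convention lends itself to a mechanical check. The following is a minimal sketch in Python, assuming the coded rows are held as tuples in the column order of the table above. The name `check_links` is hypothetical, and the rule that every non-head link must point to a different word in the same sentence is an illustrative assumption (cross-sentence dependencies, covered elsewhere in the manual, would need separate handling).

```python
# Rows follow the table's column order:
# (sentence #, word #, word, status, gig_pos, gig_dep, gig_dep_on).
# None marks a blank gig_dep_on cell. Punctuation rows are omitted.
ROWS = [
    (1, 1, "Everything", "np", "pro", "subj", 2),
    (1, 2, "is", "", "", "", None),
    (1, 3, "gone", "", "", "", None),
    (2, 1, "No", "", "det", "", 2),
    (2, 2, "hope", "np", "noun_com", "head", 0),
    (3, 1, "No", "", "det", "", 2),
    (3, 2, "love", "np", "noun_com", "head", 0),
    (4, 1, "A", "", "det", "", 3),
    (4, 2, "brutal", "", "adj", "premod", 3),
    (4, 3, "truth", "np", "noun_com", "head", 0),
]


def check_links(rows):
    """Return a list of human-readable problems found in gig_dep_on."""
    # Highest word number seen in each sentence.
    last_word = {}
    for sentence, word_no, *_ in rows:
        last_word[sentence] = max(last_word.get(sentence, 0), word_no)

    problems = []
    for sentence, word_no, word, _status, _pos, dep, dep_on in rows:
        if dep_on is None:
            continue  # uncoded word, nothing to check
        if dep == "head":
            # Independent nps/scs must have gig_dep_on marked as 0.
            if dep_on != 0:
                problems.append(f"{sentence}.{word_no} {word}: head must link to 0")
        elif dep_on == word_no or not 1 <= dep_on <= last_word[sentence]:
            # Assumed rule: other links point to another word in the sentence.
            problems.append(f"{sentence}.{word_no} {word}: bad link {dep_on}")
    return problems


print(check_links(ROWS))
```

For the rows above the function reports no problems; a row coded as head but linked to a non-zero word number would be flagged.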


Negation with “Not”

There is no single gig_pos code for “not”. Instead, we code it in one of two ways: as an adverb, which receives the standard gig_pos code of adv; or as a special coordinative use, which receives the special gig_pos code of adv_coord. Both cases are described below.

§ “Not” as Adverb

As an adverb, “not” functions in the normal ways an adverb does. It can also depend on an np in cases where it directly precedes the determiner of this np and this np is not coordinated (e.g. “Not a single linguist believes Skinner”, “He left with not a moment’s hesitation”).

In all such cases where “not” works as an adverb, it should simply be coded with the standard gig_pos code of adv. Moreover, as an adverb, it generally receives no gig_dep coding, and should simply have its gig_dep_on code marked as linking to the most appropriate word in that context. This same coding should also be applied to the <n&apos;t> which represents our parser’s automated “translation” of the contracted form n’t.

Example

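| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Not | | adv | | 4 |
| 2 | a | | det | | 4 |
| 3 | single | | adj | premod | 4 |
| 4 | linguist | np | noun_com | subj | 5 |
| 5 | laughed | | | | |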

§ Adverb “Not” as Clause Negator

The most common function of the adverb form of “not” is to negate a clause. This will normally be indicated either by its appearing in the contracted form n’t (e.g. “I don’t believe you”; “I saw he wasn’t ready”) or by its appearing in a context such that you can replace both the “not” and the preceding verb with a contracted form of the verb (e.g. “I saw he was not ready” --> “I saw he wasn’t ready”).

In such cases of clausal negation, the default case is to code the gig_dep_on for the “not” as linking to the verb with which it can contract. However, you will occasionally come across cases where the “not” should link to the following verb instead. This happens when linking it to the auxiliary verb would change the meaning of the clause. Take, for example, the sentence “You can always not read Chomsky”. This does not mean that the person “cannot” read Chomsky, but that they have the option of “not reading” Chomsky. Such cases will be unusual, but where they do occur, you should assign the gig_dep_on code as linking to the following verb, rather than the auxiliary verb with which it can contract.

For example,


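| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | I | np | pro | subj | 2 |
| 2 | believe | | | | |
| 3 | Chomsky | np | noun_prop | subj | 6 |
| 4 | does | | verb_aux | | 6 |
| 5 | not/ n&apos;t | | adv | | 4 |
| 6 | love | sc | verb_lex_act | obj_fin | 2 |
| 7 | me | np | pro | dobj | 6 |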
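The attachment rule for a clause-negating “not” can be summarised as a two-way choice. The sketch below is illustrative only: the function name `not_gig_dep_on` and its parameters are hypothetical, and the judgement of whether contraction preserves the meaning still has to be made by the annotator and passed in as a flag.

```python
def not_gig_dep_on(aux_index: int, following_verb_index: int,
                   contraction_keeps_meaning: bool = True) -> int:
    """Pick the word # that a clause-negating "not" should link to."""
    if contraction_keeps_meaning:
        # Default case: link to the verb with which "not" can contract.
        return aux_index
    # Unusual case: contracting would change the meaning of the clause.
    return following_verb_index


# "I believe Chomsky does not love me": "doesn't" keeps the meaning,
# so "not" links to "does" (word 4) rather than "love" (word 6).
print(not_gig_dep_on(aux_index=4, following_verb_index=6))

# "You can always not read Chomsky": "can't" would change the meaning,
# so "not" links to "read" (word 5) rather than "can" (word 2).
print(not_gig_dep_on(aux_index=2, following_verb_index=5,
                     contraction_keeps_meaning=False))
```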

§ Coordinative “Not”

A special use of “not” is where it effectively functions as a phrase-level coordinator. In such cases, this “not” will be (a) a correlative coordinator that introduces the first element of a coordination (e.g. “He wanted not fruit but meat for dinner”), (b) a word that is itself directly introduced by a core coordinator (e.g. “He wanted meat for dinner, but not fruit”), or (c) a word that appears by itself yet functions as if it could have been directly introduced by a core coordinator (e.g. “He wanted meat, not fruit for dinner”).

Where it appears as part of a wider np/sc, all such cases should be marked with the special gig_pos code of adv_coord, should receive no gig_dep coding, and have their gig_dep_on coded as linking to the coordinated element it introduces. Thus, in the following np coordinations, the first example would have the “not” linking to “burgers”, whilst the second would have it linking to “chips”:

(1) “I wanted not burgers but bread”

(2) “I wanted water not chips”

Similarly, where accompanied by a core coordinator, such as “but” or “and”, this coordinator should also be coded with its gig_dep_on independently linking to whichever coordinate it introduces.

Furthermore, where a coordinative “not” is followed by another adverb such as “only” or “merely”, this accompanying adverb should be coded with the gig_pos of adv, receive no gig_dep code, and have its gig_dep_on marked as linking back to the coordinative “not”.

Finally, you should code all the remaining elements of each coordinate as if they were a normal coordinated sequence as described in the section on coordinators and coordination above.


General Coordinative “Not” Example

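| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Chomsky | np | noun_prop | subj | 2 |
| 2 | was | | | | |
| 3 | a | | det | | 4 |
| 4 | god | np | noun_com | predsubj | 2 |
| 5 | and | | conj_coord | | 9 |
| 6 | not | | adv_coord | | 9 |
| 7 | merely | | adv | | 6 |
| 8 | a | | det | | 9 |
| 9 | linguist | np | noun_com | predsubj | 2 |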


Nominal Relatives

Although some grammars treat such material as simultaneously instantiating both a noun phrase and a subordinate clause, we count it as merely instantiating a subordinate clause. Accordingly, nominal relatives should be coded using the basic subordinate clause template, with an sc assigned to the head of the clause, and all remaining gig_pos, gig_dep, and gig_dep_on codes assigned as normal.

The only exception here concerns the gig_dep code for the sc head of the clause, which should be coded as follows:

o Firstly, this head should receive the Primary SC classifier that matches its overall function within the wider structure.

o Secondly, it should receive the appropriate Secondary SC Classifier according to whether it is finite or non-finite.

o Finally, it should receive the Tertiary SC Classifier of _rel.

o Thus, the following underlined clause would receive the gig_dep code of prepobj_fin_rel since it is a finite nominal relative acting as the complement of a preposition: “I am fine with what Chomsky did”.

o On the other hand, the following underlined clause would receive the gig_dep code of subj_fin_rel since it is a finite nominal relative acting as the subject of another clause: “Whoever loves Chomsky can do no wrong”.

For example,

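| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Chomsky | np | noun_prop | subj | 3 |
| 2 | can | | | | |
| 3 | eat | | | | |
| 4 | what | np | pro_rel | dobj | 6 |
| 5 | he | np | pro | subj | 6 |
| 6 | likes | sc | verb_lex_act | obj_fin_rel | 3 |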
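The three-part gig_dep code for the head of a nominal relative can be assembled mechanically from its classifiers. The following is a minimal sketch: the helper name `nominal_relative_gig_dep` is hypothetical, and the non-finite label `nfin` is assumed from codes such as subj_nfin used elsewhere in this manual.

```python
def nominal_relative_gig_dep(primary: str, finite: bool) -> str:
    """Build the gig_dep code for the sc head of a nominal relative.

    primary : Primary SC Classifier, i.e. the clause's function in the
              wider structure (e.g. "subj", "obj", "prepobj").
    finite  : True for a finite clause, False for a non-finite one.
    """
    secondary = "fin" if finite else "nfin"  # Secondary SC Classifier
    return f"{primary}_{secondary}_rel"      # Tertiary classifier is always _rel


# "I am fine with what Chomsky did"
print(nominal_relative_gig_dep("prepobj", finite=True))
# "Whoever loves Chomsky can do no wrong"
print(nominal_relative_gig_dep("subj", finite=True))
```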


Quotations

Quotations comprise any material which you think has been explicitly cited (i.e. “quoted”) from another text, as in the underlined portion of “Martin Luther King said I have a dream”.

Such passages should not be coded as normal. Instead, you should insert the code quotation in the Notes column for each word that comprises the quotation.

Furthermore, where this material is not dependent on a wider np/sc, then all other cells within the Status, Gig_pos, Gig_dep, and Gig_dep_on columns should simply be left blank.

Where this material is dependent on a wider np/sc, however, all of these cells should be left blank except for the head word of the quotation. This should receive no Status or Gig_dep code, but should receive the Gig_pos code of quot, and have its Gig_dep_on marked as linking to the relevant part of the wider np/sc. You should also delete the quotation code for this cell from the Notes column.

Finally, where other material external to the quotation is present, such that it directly depends on the head word as if it were an independent np/sc, then you should code this head word as follows. Firstly, code any of this external material as normal. Secondly, the head word of the quotation should receive the Status code of np or sc as appropriate, the Gig_pos of quot, and have its Gig_dep and Gig_dep_on codes as normal according to the textual context at hand. All other cells within these columns should be left blank, and the quotation code for the head word should also be deleted from the Notes column.

For example,

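| word # | word | status | gig_pos | gig_dep | gig_dep_on | add_info | notes |
|---|---|---|---|---|---|---|---|
| 1 | The | | det | | 2 | | |
| 2 | father | np | noun_com | subj | 3 | | |
| 3 | calls | | | | | | |
| 4 | him | np | pro | dobj | 3 | | |
| 5 | a | | det | | 8 | | quotation |
| 6 | `` | | | | | | quotation |
| 7 | good | | | | | | quotation |
| 8 | man | np | quot | predobj | 3 | | |
| 9 | &apos;&apos; | | | | | | quotation |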


Participle Forms

All participles that instantiate a subordinate clause should be treated as normal lexical verbs with respect to their gig_pos code. That is, they should be coded as either verb_lex_act or verb_lex_pass, according to whether they are used in their active or passive sense.

Conversely, all participle forms that premodify a noun should receive the special gig_pos code of part.

Note that we do not count gerunds as participle forms; these are simply common nouns that receive the gig_pos code of noun_com. See the section on gerunds above.

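| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Questioning | sc | verb_lex_act | subj_nfin | 3 |
| 2 | Chomsky | np | noun_prop | dobj | 1 |
| 3 | is | | | | |
| 4 | a | | det | | 8 |
| 5 | broken | | part | | 8 |
| 6 | but | | conj_coord | | 7 |
| 7 | unending | | part | | 8 |
| 8 | game | np | noun_com | predsubj | 3 |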


Particles

Particles are a special class of adverbs, whose forms are identical with prepositions. They can be identified as word forms which Macmillan lists as both adverbs and prepositions, but which it specifically classes as adverbs relative to the context at hand. For example, “up” is a particle in “I looked the answer up” and “I looked up the answer”, since Macmillan identifies the word as an adverb here. However, “up” is a preposition in “He climbed up the steps”, since Macmillan identifies such words as prepositions where they combine with a noun to form a single integrated phrase.

All such instances should be coded as if they were adverbs, but using the special gig_pos code of prt.

For example,

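| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | The | | det | | 2 |
| 2 | man | np | noun_com | subj | 8 |
| 3 | who | np | pro_rel | subj | 4 |
| 4 | looked | sc | verb_lex_act | postmod_fin_rel | 2 |
| 5 | up | | prt | | 4 |
| 6 | the | | det | | 7 |
| 7 | answer | np | noun_com | dobj | 4 |
| 8 | laughed | | | | |
| 9 | at | | | | |
| 10 | the | | det | | 11 |
| 11 | linguist | np | noun_com | prepobj | 9 |
| 12 | down | | prep | postmod | 11 |
| 13 | the | | det | | 14 |
| 14 | corridor | np | noun_com | prepobj | 12 |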

   


Particle + Adverb/Preposition Sequences

This covers cases where an np/sc contains a sequence comprising a particle form that is directly followed by either an adverb or a preposition phrase (e.g. “up there”, “down through the rain”).

Where such material occurs, check to see if the particle form can be treated as distinct from the adverb/preposition phrase. This will be the case where it is either (a) listed by Macmillan as part of a phrasal verb, or (b) it can move around independently of the adverb/preposition phrase. If so, then assign this particle form the gig_pos code of prt, assign it no gig_dep code, and mark its gig_dep_on as linking back to the np/sc on which it depends.

Conversely, if the particle form cannot be treated as distinct, proceed as follows. Firstly, where the particle form is followed by a preposition phrase, then:

o If the particle form + preposition is listed as a complex preposition by LGSWE (pp.75-76), code both words as a complex preposition using the standard rules for such words.

o If the particle form + preposition is not listed as a complex preposition, assign the particle form the gig_pos code of prt, assign it no gig_dep code, and code its gig_dep_on as linking to the preposition. You should then code the preposition as appropriate according to its function within the context at hand.

Secondly, where the particle form is followed by an adverb, then:

o If the adverb can be exchanged for a noun phrase such that the particle form + noun phrase becomes a preposition phrase, then assign the particle form the gig_pos code of prep, mark its gig_dep_on as linking back to the relevant element, and code its gig_dep as appropriate to the context at hand. You should then assign the adverb the gig_pos code of adv, assign it no gig_dep code, and mark its gig_dep_on as linking back to the particle form.

o If the adverb cannot be so exchanged, assign the particle form the gig_pos code of prt, assign it no gig_dep code, and mark its gig_dep_on as linking to the adverb that follows. You should then assign this adverb the gig_pos code of adv, assign it no gig_dep code, and mark its gig_dep_on as appropriate according to the wider context.

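| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Chomsky | np | noun_prop | subj | 2 |
| 2 | shot | | | | |
| 3 | the | | det | | 4 |
| 4 | man | np | noun_com | dobj | 2 |
| 5 | who | np | pro_rel | subj | 6 |
| 6 | was | sc | verb_lex_act | postmod_fin_rel | 4 |
| 7 | down | | prt | | 8 |
| 8 | in | | prep | | 6 |
| 9 | the | | det | | 10 |
| 10 | basement | np | noun_com | prepobj | 8 |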
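The decision procedure for particle + adverb/preposition sequences can be laid out as a small decision tree. This is only a sketch under stated assumptions: the function name `code_particle_form`, its parameters, and the descriptive strings it returns are all hypothetical, and each flag stands for a judgement the annotator makes by consulting Macmillan or LGSWE.

```python
def code_particle_form(next_unit: str, is_distinct: bool,
                       is_listed_complex_prep: bool = False,
                       adverb_swappable_for_np: bool = False) -> tuple:
    """Return (gig_pos, gig_dep_on target) for the particle form.

    next_unit: "prep" or "adv", i.e. what directly follows the form.
    is_distinct: the form is part of a phrasal verb or can move independently.
    """
    if next_unit not in ("prep", "adv"):
        raise ValueError("next_unit must be 'prep' or 'adv'")
    if is_distinct:
        return ("prt", "np/sc on which it depends")
    if next_unit == "prep":
        if is_listed_complex_prep:
            # Handled by the standard rules for complex prepositions.
            return ("(complex preposition)", "(per complex word rules)")
        return ("prt", "following preposition")
    # The form is followed by an adverb.
    if adverb_swappable_for_np:
        return ("prep", "relevant element")
    return ("prt", "following adverb")


# "down in the basement": "down" is not distinct and "down in" is not
# treated as a complex preposition, so "down" is a prt linking to "in".
print(code_particle_form("prep", is_distinct=False))
```

The printed result matches the example table above, where “down” is coded prt with its gig_dep_on pointing to the preposition “in”.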


Passives

Any subordinate clause headed by a passive verb should simply be coded as if it were a normal clause. The only exception concerns the gig_pos code for the verbal head of the clause. This should receive the gig_pos code of verb_lex_pass, in order to distinguish it from the active form of the verb (which should receive the gig_pos code of verb_lex_act).

Furthermore, where a “by” preposition phrase has been appended to the clause, with the np complement of this preposition corresponding to the subject of the clause’s active voice equivalent, then assign the special suffix of _agent to the complement’s standard gig_pos code.

Finally, note that we also recognise so-called “get” passives as instances of passivization (e.g. “He got killed”). Accordingly, you should count the “get” that introduces such verbs as an auxiliary verb, thereby receiving the gig_pos code of verb_aux but no gig_dep code.

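| word # | word | status | gig_pos | gig_dep | gig_dep_on | add_info |
|---|---|---|---|---|---|---|
| 1 | Everyone | np | pro | subj | 2 | |
| 2 | thought | | | | | |
| 3 | Skinner | np | noun_prop | subj | 5 | |
| 4 | was/got | | verb_aux | | 5 | |
| 5 | vanquished | sc | verb_lex_pass | obj_fin | 2 | |
| 6 | by | | prep | | 5 | |
| 7 | Professor | np | noun_prop_complex_agent | prepobj | 6 | |
| 8 | Chomsky | | noun_prop_complex_agent | | 7 | extra |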


Prepositions

Prepositions are essentially all units explicitly classed as such by the Macmillan online dictionary. The only exceptions are those word combinations identified as complex prepositions, as described in the section on complex words above.

Note, also, that prepositions should only receive a gig_pos code where they are part of a wider np or sc. Furthermore, whilst each such preposition should also receive a gig_dep_on code, they should only receive a gig_dep code where they postmodify an np.

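| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Linguists | np | noun_com | subj | 4 |
| 2 | from | | prep | postmod | 1 |
| 3 | MIT | np | noun_prop | prepobj | 2 |
| 4 | write | | | | |
| 5 | with | | | | |
| 6 | gusto | np | noun_com | prepobj | 5 |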
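The two conditions governing preposition coding can be sketched as a small helper. The name `code_preposition` and its parameters are hypothetical, and the two boolean flags represent judgements the annotator makes about the context.

```python
def code_preposition(in_wider_np_or_sc: bool, postmodifies_np: bool,
                     head_word_no: int) -> dict:
    """Return the cells to fill in for a (simple) preposition."""
    if not in_wider_np_or_sc:
        # Prepositions outside any wider np or sc receive no coding.
        return {"gig_pos": "", "gig_dep": "", "gig_dep_on": ""}
    return {
        "gig_pos": "prep",
        # A gig_dep code is only given where the preposition postmodifies an np.
        "gig_dep": "postmod" if postmodifies_np else "",
        "gig_dep_on": head_word_no,
    }


# "Linguists from MIT write with gusto"
print(code_preposition(True, True, head_word_no=1))    # "from"
print(code_preposition(False, False, head_word_no=4))  # "with"
```

A preposition that is part of a wider sc without postmodifying an np, such as “on” in “everyone relies on Chomsky’s ideas”, would receive prep and a gig_dep_on link but no gig_dep code.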


Prepositional Verbs

Prepositional verbs comprise a particular subset of verbs; namely, those that take a prepositional phrase as one of their arguments (e.g. “I rely on him”, “I thought about everything Chomsky said”). Such preposition phrases are often given competing accounts, with some grammars treating the verb and preposition as forming a complex word. We adopt no such analysis.

Accordingly, you should treat all such preposition phrases as distinct units that simply function as adverbial dependents of the verb. This means that the preposition should receive its normal gig_pos code, receive no gig_dep code, and have its gig_dep_on code marked as linking back to the relevant verb. Meanwhile the remainder of the preposition phrase should be treated as normal, with the head(s) of this material having its gig_dep_on code marked as linking back to the preposition.

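| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | I | np | pro | subj | 2 |
| 2 | think | | | | |
| 3 | everyone | np | pro | subj | 4 |
| 4 | relies | sc | verb_lex_act | obj_fin | 2 |
| 5 | on | | prep | | 4 |
| 6 | Chomsky | np | noun_prop_gen | det | 8 |
| 7 | 's | | | | |
| 8 | ideas | np | noun_com | prepobj | 5 |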


Proper (vs Common) Nouns

You should code a word as a proper noun where it is used within the specific context as the name of a specific entity, such that it normally cannot vary for number or definiteness (cf. LGSWE: 247). Otherwise, you should code it as a common noun. The gig_pos code for a proper noun is noun_prop, and the gig_pos code for a common noun is noun_com. For example,

o I like America = Proper Noun

o There should be an America on every continent = Common Noun

o I saw Lord Quirk this morning = Proper Noun

o I saw three wonderful Lord Quirks this morning = Common Noun

For further information relating to complex nouns, whether common or proper, see the relevant section on complex words above.


Subject-Verb Inversions

Subject-Verb Inversions comprise material where an np or sc functioning as the normal subject of a clause has been “inverted”, such that it now appears after the verb. Such material should simply be coded as normal, with the exception that you should append the suffix _inv to the standard subj gig_dep code, giving subj_inv.

Note, however, that, just as is the case with fronted material, a subj_inv coding applies only to declarative structures. Interrogative or exclamative structures require no special _inv coding.

| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Here | | | | |
| 2 | was | | | | |
| 3 | Chomsky | np | noun_prop | subj_inv | 2 |


Subordinators & Relativizers

§ Subordinators

Subordinators are all those words which are identified by the Macmillan Online Dictionary as conjunctions and which establish a subordinate relationship between a clause and a surrounding unit, such as “because”, “if” and “although”.

Note that they include the word “that” as used to introduce a complement or comparative clause (e.g. “I think that Chomsky is right”, “The idea that Chomsky is right is unquestionable”, “Chomsky is so good that he is never wrong”).

They also include the word “to” as used to introduce an infinitive clause (e.g. “I want to read more Chomsky”).

In addition, we also recognize complex subordinators as set out in the relevant section on complex words, and which are all those word combinations recognised as such in LGSWE:824-4.

Unless serving to introduce a verbless clause (see below), all such items should receive a gig_pos code of conj_sub, and have their gig_dep_on code marked as linking to that part-of-speech which has been explicitly marked with an sc in the Status column. In the general case, subordinators should not receive any gig_dep coding.

§ Relativizers

Relativizers are the special “relative” words that both introduce a relative clause and mark the omitted element within this clause (e.g. “the linguist that cried”, “the time when Chomsky was wrong”). All such words should receive the appropriate gig_pos code as identified by the Macmillan Online Dictionary, with the Secondary Gig_pos Classifier of _rel then appended to this code (e.g. pro_rel, det_rel, or adv_rel).

Furthermore, the relativizer should have its gig_dep code marked according to whatever coding the omitted element would receive were it present (e.g. subj, dobj, adv). You should also note that, where the relativizer is a relative pronoun, this word should also have an np assigned to it in the Status column.

Finally, unless they constitute a determiner, relativizers should have their gig_dep_on code marked as linking back to the sc that instantiates the relative clause. Where they do constitute such a determiner, their gig_dep_on code should be marked as linking to the np on which they depend.

For example,


 

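| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Chomsky | np | noun_prop | subj | 2 |
| 2 | destroys | | | | |
| 3 | books | np | noun_com | dobj | 2 |
| 4 | which | np | pro_rel | dobj | 6 |
| 5 | he | np | pro | subj | 6 |
| 6 | criticised | sc | verb_lex_act | postmod_fin_rel | 3 |
| 7 | because | | conj_sub | | 9 |
| 8 | he | np | pro | subj | 9 |
| 9 | is | sc | verb_lex_act | adv_fin | 2 |
| 10 | god | np | noun_com | predsubj | 9 |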


Tag Clauses

Tag clauses constitute a “truncated” clause headed by an auxiliary verb + subject pronoun, which is appended to the edge of a surrounding clause, and which has either an interrogative sense (e.g. “I look terrible, don’t I?”) or a declarative sense (e.g. “He is alright, he is”), with both senses seeking to affirm the content of the clause to which they are attached.

All such clauses should be treated as a subordinate clause, with the auxiliary verb marked as sc, given a gig_pos code of verb_aux, a gig_dep code of adv_fin_tag, and have its gig_dep_on code marked as linking to the head of the surrounding clause to which it is attached. The pronoun should then be treated as if it were the subject of the auxiliary verb.

Note that such tag clauses never count as instances of either ellipsis or gapping.

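| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Chomsky | np | noun_prop | disl | 2 |
| 2 | he | np | pro | subj | 3 |
| 3 | is | | | | |
| 4 | amazing | | | | |
| 5 | isn’t | sc | verb_aux | adv_fin_tag | 3 |
| 6 | he | np | pro | subj | 5 |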


Verbless Clauses

Although often treated as instances of subordination, we do not explicitly recognise verbless clauses as subordinate clauses. Accordingly, you should not assign an sc marker to this material nor assign them an explicit sc coding.

However, since they often contain an np and since this material will also need to be linked to a wider clause where present, you should annotate this material according to the following two types of so-called “verbless” clauses.

Note, moreover, that only the following two types are to be treated as part of a “verbless clause”. All other noun phrases which appear in a peripheral position, but which you still take to be dependent on a surrounding clause, should be coded according to the general information provided in the rest of the manual above, using the general gig_dep codes set out in appendix II. In the default case, this will mean simply treating the dependent np as an adverbial, and coding it as such (i.e. with the gig_dep code of adv).

It also means that you should not code an isolated np as part of a verbless clause unless it appears in one of the two types of clause specified below. If it does not, but you still feel the np effectively has the function of a subject or predicate, you should simply make use of the coding rules for independent nps and scs as outlined in the relevant section above. In other words, you should assign the np the gig_dep code of head, and mark its gig_dep_on code as 0.

§ Type I Verbless Clauses

Type I verbless clauses cover cases where a subordinating conjunction directly introduces material that essentially functions as the predicative complement of a “be” verb but where this “be” is absent (e.g. “when happy”, “although a great linguist”).

Where present, and where the component material would otherwise require coding as part of a wider sc/np, then this material should be assigned its standard gig_pos, gig_dep, and gig_dep_on codes as normal. The only exceptions are as follows:

o Firstly, you should treat the subordinator as if it were the head of this material. As such, all elements that would normally be treated as dependents of the missing “be” should now have their gig_dep_on marked as linking back to this subordinator.

o In turn, where this subordinator is also part of an sc, then you should assign it the standard gig_pos code of conj_sub, assign it no gig_dep code, and mark its gig_dep_on code as linking to the relevant sc marker. Where it is not part of an sc, on the other hand, no coding is required for the subordinator itself.

o Where an np is present that would otherwise function as the predicative complement of the missing “be”, then this np should be assigned its normal gig_pos code, receive the special gig_dep code of pred_vless, and have its gig_dep_on code marked as linking to the subordinator that introduces the material.

For example,

| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | Although | | | | |
| 2 | a | | det | | 4 |
| 3 | great | | adj | premod | 4 |
| 4 | linguist | np | noun_com | pred_vless | 1 |
| 5 | Chomsky’s | np | noun_prop | det | 7 |
| 6 | latest | | adj | premod | 7 |
| 7 | musings | np | noun_com | subj | 8 |
| 8 | are | | | | |
| 9 | baffling | | | | |

§ Type II Verbless Clauses

Type II verbless clauses cover cases of two units which have the semantics of a “subject” + “be” + “adverbial/predicative complement” clause, but which are not explicitly connected by the verb required to instantiate these dependencies. For example, “his house amok with Behaviorists, Chomsky was unable to finish his novel” and “with Skinner a fool Chomsky easily won the argument”.

Where encountered, all of the elements that comprise this material should be assigned their standard gig_pos, gig_dep, and gig_dep_on codes as normal and as required according to the general rules set out in the rest of the manual. The only exceptions here concern the material which would otherwise function as direct dependents of the missing “be”. This material should be coded as follows:

o Firstly, you should assign all of this material its normal gig_pos coding.

o Secondly, where the material is introduced by a subordinator/preposition, then mark the gig_dep_on codes for the two elements that fulfil the subject + adverbial/predicative complement function as linking back to this subordinator/preposition. Thus, for example, “Skinner” and “fool” would both be coded as linking back to “with” in “With Skinner a fool, Chomsky easily won the argument”.

o Conversely, where it is not introduced by a subordinator/preposition, you should instead assign the gig_dep_on for each element as linking to the wider clause on which the verbless clause itself depends. Thus, for example, “Skinner” and “fool” would both be coded as linking back to “won” in “Skinner a fool, Chomsky easily won the argument”.

o Thirdly, where a subordinator/preposition for the verbless clause is present as part of a wider sc, then assign the subordinator/preposition its standard gig_pos code, assign it no gig_dep code, and mark its gig_dep_on as linking to the wider sc on which the verbless clause as a whole depends. Thus, for example, “with” in “I laughed because I saw how, with his analysis in ruins, Chomsky easily won the argument” would be coded as linking back to “won”.

o Fourthly, where this material constitutes a noun phrase that would otherwise function as the subject of the missing “be”, you should assign its np marker the special gig_dep code of subj_vless.

o Fifthly, where this material constitutes a noun phrase that would otherwise function as the adverbial of the missing “be”, you should assign its np marker the special gig_dep code of adv_vless.

o Finally, where this material constitutes a noun phrase that would otherwise function as a predicative complement of the missing “be”, you should assign its np marker the special gig_dep code of pred_vless.

For example,

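| word # | word | status | gig_pos | gig_dep | gig_dep_on |
|---|---|---|---|---|---|
| 1 | His | | det | | 2 |
| 2 | analysis | np | noun_com | subj_vless | 6 |
| 3 | in | | | | |
| 4 | ruins | np | noun_com | prepobj | 3 |
| 5 | Skinner | np | noun_prop | subj | 6 |
| 6 | wept | | | | |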
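The Type II rules for noun phrases that depend on the missing “be” reduce to two choices: which _vless code to use, and which word to link to. The sketch below illustrates this under stated assumptions; the names `VLESS_GIG_DEP` and `code_type2_np`, and the function labels used as dictionary keys, are hypothetical.

```python
# gig_dep code by the function the np would have had relative to the missing "be".
VLESS_GIG_DEP = {
    "subject": "subj_vless",
    "adverbial": "adv_vless",
    "predicative": "pred_vless",
}


def code_type2_np(function, wider_clause_head, introducer=None):
    """Return (gig_dep, gig_dep_on) for an np in a Type II verbless clause.

    wider_clause_head: word # of the head of the clause the verbless
                       clause depends on.
    introducer: word # of the introducing subordinator/preposition, if any.
    """
    # Link to the introducer where present, otherwise to the wider clause.
    target = introducer if introducer is not None else wider_clause_head
    return VLESS_GIG_DEP[function], target


# "His analysis in ruins, Skinner wept": no introducer, "wept" is word 6.
print(code_type2_np("subject", wider_clause_head=6))

# "With Skinner a fool, Chomsky easily won the argument":
# "With" is word 1 and "won" is word 7, so "fool" links back to "With".
print(code_type2_np("predicative", wider_clause_head=7, introducer=1))
```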


Appendix I: Gig_pos Code List

Adjectives

o adj : Generic Adjective (e.g. I read his brilliant analysis; He was ecstatic)

o part : Participial form acting as premodifier, whether adjectival or verbal (e.g. his broken bike; the approaching train)

Adverbs

o adv : Adverb (e.g. Honestly, that time when Skinner cried just made me very angry)

o adv_coord : Coordinative “Not” (e.g. He wanted a behaviourist, not a linguist)

Conjunctions

o conj_coord : Coordinating conjunction, including the correlative element in cases of correlative coordination (e.g. Chomsky loves neither cats nor dogs; Skinner loves cats and dogs)

o conj_sub : Subordinating conjunction (e.g. He was disqualified because he tried to cheat in case he wasn’t quick enough)

Determiners

o det : Determiner (e.g. This Chomsky destroys any linguist that loves a Behaviorist)

Inserts

o int : Interjection (e.g. I heard him SHOUT “ah amazing!”)

Nouns

o noun_com : Common noun (e.g. Linguists love grammar)

o noun_prop : Proper noun (e.g. John loves Noam Chomsky)

Numbers

o num : Cardinal/Ordinal number (e.g. We need five-hundred-and-fifty lemons)

Particles

o prt : Particle (e.g. I looked up the answer)

Prepositions

o prep : Preposition (e.g. I was in the garden; I was chased by a cat because of you)

Pronouns


o pro : Generic pronoun (e.g. These are the linguists that love each other and both of you)

o pro_dum : Special, non-referential uses of “It” + “There” as detailed above in the three sections titled Clefts, Existential Clauses, and Extraposed Clauses (e.g. It is Skinner who Chomsky hated; There is no easy solution; It was hard to beat Chomsky)

Verbs

o part : Participial form acting as premodifier, whether adjectival or verbal (e.g. his broken bike; the approaching train)

o verb_aux : Auxiliary verb, including modals (e.g. I was wondering what you would think)

o verb_lex_act : Lexical verb marking active voice (e.g. Thieves stole your bike)

o verb_lex_pass : Lexical verb marking passive voice (e.g. Your bike was stolen by thieves)

Secondary Gig_pos Classifiers (to be appended to the above, as appropriate and in the following order)

o _rel : Identifies the relative word that introduces a relative clause (e.g. I remember the time when Chomsky gave every student who studies at MIT a beautiful flower)

o _complex : Identifies the element(s) of a complex word (e.g. “a FEW”, “because OF”, “as long AS”, “The Royal SOCIETY for the Protection of Birds”, “yard ARM”, “living ROOM”, “home MADE”), and marginal auxiliaries/semi-modals (“HAVE to”).

o _gen : identifies a genitival common or proper noun, whether simple or complex (e.g. “the cat’s whiskers”, “Skinner’s mistake”, “Noam Chomsky’s greatest achievement”).

o _agent : Identifies the head of an np that combines with the preposition “by” to form a passive by-phrase, as described in the section above (e.g. Chomsky was defeated by him; Skinner was slain by Professor Chomsky)
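Because the secondary classifiers must be appended in the order listed, a full gig_pos code can be assembled mechanically. The following is a minimal sketch, assuming the order _rel, _complex, _gen, _agent given above; the names `SECONDARY_ORDER` and `build_gig_pos` are hypothetical.

```python
# Secondary Gig_pos Classifiers in the order in which they are appended.
SECONDARY_ORDER = ("_rel", "_complex", "_gen", "_agent")


def build_gig_pos(base: str, *classifiers: str) -> str:
    """Append the given secondary classifiers to a base gig_pos code."""
    unknown = set(classifiers) - set(SECONDARY_ORDER)
    if unknown:
        raise ValueError(f"unknown classifier(s): {sorted(unknown)}")
    # Emit classifiers in the prescribed order, whatever order was passed in.
    ordered = [c for c in SECONDARY_ORDER if c in classifiers]
    return base + "".join(ordered)


# "Skinner was slain by Professor Chomsky": each part of the complex
# proper noun in the by-phrase carries both _complex and _agent.
print(build_gig_pos("noun_prop", "_agent", "_complex"))

# "the cat's whiskers": genitival common noun.
print(build_gig_pos("noun_com", "_gen"))
```

The first call yields noun_prop_complex_agent, the same code used for “Professor Chomsky” in the passives example.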


Appendix II: Gig_dep Code List

In the examples below, the italicized portion identifies the part-of-speech that exhibits the dependency, the underlined portion identifies the unit in which the part-of-speech appears, and the capitalised word indicates the word on which the part-of-speech depends.

Thus, in the following example, the noun_com “man” appears in the noun phrase “the man” and manifests the dependency of subj with respect to the main verb “WAS”:

“The man WAS happy”

NP External Dependencies

Adjective Modifier

o adjmod : np that depends on an adjective (e.g. The book was three inches LONG)

Adverbial

o adv : np functioning as an adverbial (e.g. We FOUGHT this morning)

Adverb Modifier

o advmod : np that depends on an adverb (e.g. I handed it in a few days LATE)

Determiner

o det : Genitival np that depends on another np (e.g. The great linguist’s ground-breaking BOOK)

Direct Speech NP

o speech : np functioning as an instance of direct speech (e.g. He shouted CHOMSKY! )

Dislocated NP

o disl : np functioning as a piece of dislocation (e.g. This grammar book, I HATE it)

“Independent” NP

o head : Any np that cannot be treated as dependent on another linguistic unit, making it effectively a sentence unto itself; the stereotypical case here being section headings (e.g. CONCLUSION)

NP Modifiers

o premod : np that comes between another np & its determiners/numerals, and which is directly dependent on this noun phrase (e.g. Oh yes I love that Chomsky BOOK)

o postmod : np that follows and is directly dependent on another np, but without being in apposition (e.g. I turned and saw the CHOMSKY the great)

o postmod_app : np that follows and which is appositionally dependent on another np (e.g. I turned and saw the CHOMSKY, the greatest linguist of them all)

Preposition Modifier

o prepmod : np that modifies a preposition phrase (e.g. I found him a whole mile DOWN the road)

Prepositional Object

o prepobj : np that combines with a preposition to form a preposition phrase (e.g. I discovered Chomsky ON this day)

Verb Arguments

o sc_np : np that comes between the matrix verb and a non-finite complement clause (e.g. I want Chomsky to SUCCEED)

o dobj : np as direct object of a wider clause (e.g. I GAVE him a book)

o iobj : np as indirect object (e.g. I GAVE him a book)

o foc : np as the focused element of a cleft clause (e.g. It IS Chomsky we love; What Chomsky hates IS bad grammar)

o predsubj : np as subject predicative (e.g. He WAS a grammarian)

o predobj : np as object predicative (e.g. I CONSIDER him a grammarian)

o pred_vless : np as the predicative argument of a verbless clause (e.g. “Skinner WEPT, his analysis the most ridiculous piece of analysis ever proposed”)

o subj : np as subject (e.g. The lazy grammarian MOCKED Chomsky’s work)

o subj_inv : np as subject in a declarative subject-inversion structure (e.g. Here WAS Chomsky)

o subj_not : np as notional subject of existential clause (e.g. There WAS one solution)

o subj_vless : np as the subject of a verbless clause (e.g. “Skinner WEPT, his analysis in ruins”).

Vocative NP

o voc : np functioning as a vocative (e.g. IS that you, Noam?)

NP Internal Dependencies (Phrasal only)

o premod : Any element which comes between an np & its determiners/numerals, and which is directly dependent on this np (e.g. I saw the three lazy GRAMMARIANS from Paris)

o postmod : Any element which is directly dependent on a preceding np (e.g. I saw the lazy GRAMMARIAN from Paris)

Preposition Phrases

o sc_pp : Preposition phrase acting as the subject of an infinitival subordinate clause (e.g. The chance for us to DO something special)

SC Dependencies

Primary SC Classifiers

o adjmod : Any sc that is dependent on a wider adjective, whether as a complement clause (e.g. I am ABLE to do it) or a comparative clause (e.g. “Chomsky is as BEAUTIFUL as you can imagine”).

o adv : Any sc functioning as an adverbial, including any clauses classified by LGSWE as a comment clause or a tag clause (e.g. I READ it because I hate it; I READ it to impress you; He SHOUTED vociferously, disrupting the conversation; It WAS, you know, not his best idea; You will MARRY me, won’t you?)

o adv_app : Any sc functioning in an appositional relationship to the verb of a preceding clause (e.g. Chomsky wanted to WIN - that is, to vanquish every functionalist that crossed his path)

o advmod : Any sc functioning as a comparative clause that is dependent on a wider adverb according to the LGSWE (e.g. I ran as FAST as I could)

o disl : Any sc functioning as a dislocation structure (e.g. reading Chomsky - THAT is my idea of heaven)

o extr : Any sc functioning as the extraposed complement of a preceding verb or adjective (e.g. IT is always wonderful to read Chomsky again)

o foc : Any sc functioning as the focused element of a cleft clause (e.g. It IS because we love Chomsky that you matter; What Chomsky did WAS rock my world)

o head : Any sc that cannot be treated as directly dependent on another element, making it effectively a sentence unto itself

o obj : Any sc that is not functioning as the focused element of an it-cleft clause but which is functioning as the object or predicative complement of a surrounding verb (e.g. I THINK grammar is great; It SEEMS we have no choice; I TOLD him to go home; I CONSIDER him to be the best of the best)

o premod : Any sc which both comes between an np & its determiners/numerals, and which is directly dependent on this np (e.g. I love those I told you so MOMENTS)

o prepobj : Any sc functioning as the dependent of a preposition (e.g. “Chomsky wins BY decimating his opponents”; “I’ve never even thought ABOUT Chomsky”)

o postmod : Any sc that is dependent on a preceding np but which is not functioning appositionally (e.g. the FACT that you read it; the GRAMMARIAN who reads)

o postmod_app : Any sc that is appositionally dependent on a preceding np (e.g. Academia is many THINGS: running grammar scams, stealing money from phonologists)

o speech : Any sc functioning as a piece of direct speech according to the conventions set out above (e.g. “that Chomsky should ever hate me,” I shouted).

o subj : Any sc in subject function (e.g. reading grammar books MAKES me sad ; that you like it MAKES me sad)

Secondary SC Classifiers (Finititude)

o _fin : Any finite sc (e.g. I KNOW that Chomsky loves me)

o _nfin : Any non-finite sc (e.g. I WANT to make Chomsky happy)

Tertiary SC Classifiers (Major Structural Types)

o _cleft : The relative clause-like sc that marks a cleft structure (e.g. It IS Chomsky that we love; What Chomsky did WAS rock my world)

o _comment : Any sc functioning as a comment clause (e.g. We could DO that, I think)

o _comp : Any sc functioning as a comparative clause (e.g. They are as GOOD as you can get; They ran as FAST as they could)

o _rel : Any relative clause in whatever function (e.g. Any GRAMMARIAN who reads Chomsky cannot fail; Chomsky LAUGHED, which was strange)

o _tag : Any sc explicitly classified by LGSWE as a tag clause (e.g. I PARSED it, didn’t I?)

General Secondary Classifiers (to be applied in this order, after all other codes have been marked)

o _fronted : Marks the head of a piece of fronted material, as specified in the relevant section above (e.g. when you’ve read it we can TALK)

o _ellipsis : Marks a piece of ellipted material, as specified in the relevant section above (e.g. I thought you LOVED him and I didn’t)

o _gap : Marks a piece of gapped material, as specified in the relevant section above (e.g. I thought James BOUGHT pens and Jim paper)

o _gram : Marks a piece of ungrammatical material, as specified in the relevant section above (e.g. I THINK Chomsky are revolting).
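The layering of the SC dependency codes (primary classifier, then finiteness, then structural type, then general secondary classifiers) can be sketched as follows. This is an illustrative sketch only: it assumes that every sc code carries exactly one finiteness classifier and at most one tertiary classifier, and that codes are formed by plain concatenation. The function name `build_sc_dep` is hypothetical.

```python
# Code inventories copied from the lists in this appendix.
PRIMARY = {
    "adjmod", "adv", "adv_app", "advmod", "disl", "extr", "foc", "head",
    "obj", "premod", "prepobj", "postmod", "postmod_app", "speech", "subj",
}
FINITENESS = {"_fin", "_nfin"}
TERTIARY = {"_cleft", "_comment", "_comp", "_rel", "_tag"}
# General secondary classifiers, in the order the manual prescribes.
GENERAL_ORDER = ["_fronted", "_ellipsis", "_gap", "_gram"]


def build_sc_dep(primary, finiteness, tertiary=None, general=()):
    """Compose an sc dependency code from its layers.

    Assumptions (not stated in the manual): exactly one finiteness
    classifier is required, at most one tertiary classifier is allowed,
    and the layers are joined by plain string concatenation.
    """
    if primary not in PRIMARY:
        raise ValueError(f"Unknown primary sc classifier: {primary!r}")
    if finiteness not in FINITENESS:
        raise ValueError(f"Unknown finiteness classifier: {finiteness!r}")
    if tertiary is not None and tertiary not in TERTIARY:
        raise ValueError(f"Unknown tertiary classifier: {tertiary!r}")
    unknown = set(general) - set(GENERAL_ORDER)
    if unknown:
        raise ValueError(f"Unknown general classifier(s): {sorted(unknown)}")

    parts = [primary, finiteness]
    if tertiary is not None:
        parts.append(tertiary)
    # General classifiers always come last, in the prescribed order.
    parts.extend(g for g in GENERAL_ORDER if g in general)
    return "".join(parts)


# "the GRAMMARIAN who reads": a finite relative clause postmodifying an np.
print(build_sc_dep("postmod", "_fin", "_rel"))
# "when you've read it we can TALK": a fronted finite adverbial clause.
print(build_sc_dep("adv", "_fin", general=["_fronted"]))
```

Under these assumptions, the two calls yield `postmod_fin_rel` and `adv_fin_fronted` respectively.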

References

Primary References

Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999) Longman Grammar of Spoken and Written English. London: Longman.

Rundell, M. (2007) Macmillan English Dictionary for Advanced Learners (2nd edition). Oxford: Macmillan Education. Available at http://www.macmillandictionary.com/

OED Online. Oxford University Press. Available at http://www.oed.com/

Secondary References

Huddleston, R. & Pullum, G. (2002) The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.

Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985) A Comprehensive Grammar of the English Language. London: Longman.

Stanford CoreNLP Software Suite. Available at https://stanfordnlp.github.io/CoreNLP/

Wikipedia. Available at https://en.wikipedia.org/wiki/Main_Page