UNIVERSITY OF CALIFORNIA
Santa Barbara
Computing Representations of the Structure of Written Discourse
A Dissertation submitted in partial satisfaction of the requirements for the degree of
Doctor of Philosophy
in
Linguistics
by
Simon Henderson Corston-Oliver
Committee in charge:
Professor Susanna Cumming, Chairperson
Dr. William Dolan
Professor Carol Genetti
Professor Sandra Thompson
March 1998
The dissertation of Simon Corston-Oliver is approved
Committee Chairperson
June 12, 1998
31 March 1998
Copyright by
Simon Corston-Oliver
1998
To Mo
Simon Henderson Corston-Oliver
Curriculum vitae
6 May, 2023
Date and place of birth
22 March 1969, Christchurch, New Zealand.
Education
1998 Doctor of Philosophy, Linguistics, University of California, Santa Barbara, U.S.A. Thesis: “Computing Representations of the Structure of Written Discourse.”
1993-1994 Education abroad. Study of Mandarin Chinese at Beijing Daxue (Peking University), People’s Republic of China.
1993 Master of Arts with First Class Honors, Linguistics, Auckland University, New Zealand. Thesis: ‘Ergativity in Roviana.’
1991 Bachelor of Arts, Linguistics, Auckland University, New Zealand.
Awards
1994-1996 Special Regents’ Fellowship, University of California, Santa Barbara, U.S.A.
1994 Junior Fellowship, Interdisciplinary Humanities Center, University of California, Santa Barbara, U.S.A.
1993 New Zealand - China Exchange Program Scholarship for study at Beijing Daxue (Peking University), People's Republic of China. Ministry of External Relations and Trade, New Zealand Government.
1993 Auckland University Graduate Scholarship, Auckland University, New Zealand.
1993 Department research grant, Department of Linguistics, Auckland University, New Zealand, for research on Roviana.
1992,1993 Auckland University Graduate Scholarship, Auckland University, New Zealand.
1992 Senior Scholar in Linguistics, Auckland University, New Zealand.
1990 Annual Prize in Linguistics, Auckland University, New Zealand.
Publications
Corston, Simon H. 1993. ‘On the interactive nature of spontaneous oral
narrative.’ Te Reo 36:69-97.
Corston, Simon H. 1993. Ergativity in Roviana. M.A. Thesis. Auckland
University, New Zealand.
Corston, Simon H. 1996. Ergativity in Roviana, Solomon Islands. Pacific
Linguistics, Series B-113. Australian National University Press:
Canberra.
Corston-Oliver, Simon H. To appear. ‘Beyond string matching and cue
phrases: Improving efficiency and coverage in discourse analysis.’
Proceedings of the AAAI Spring Symposium on Intelligent Text
Summarization, March 23-25, 1998.
Corston-Oliver, Simon H. To appear. ‘Roviana.’ In Crowley, Terry, John
Lynch and Malcolm Ross (eds.) Oceanic Languages. Edinburgh:
Edinburgh University Press.
Corston-Oliver, Simon H. To appear. ‘The marking of core arguments and the
inversion of the Nominal Hierarchy in Roviana’. In Proceedings of the
Conference on Preferred Argument Structure: The Next Generation.
Kumpf, Lorraine E. and John W. Dubois (eds.)
Selected professional experience
1996- Computational linguist, Microsoft Research, Redmond, WA, U.S.A.
1995-1996 Computer laboratory technician, Linguistics Department, University of California, Santa Barbara, U.S.A.
1994-1995 Computer programmer, Corpus of Spoken American English, University of California, Santa Barbara, U.S.A.
1990-1994 Computer programmer, Lockie Computing, Auckland, New Zealand.
1993 Teaching assistant to Professor Frank Lichtenberk, Department of Linguistics, Auckland University, Auckland, New Zealand.
ABSTRACT
Computing Representations of the Structure of Written Discourse
Simon Corston-Oliver
RASTA (Rhetorical Structure Theory Analyzer), a discourse analysis
component within the Microsoft English Grammar, efficiently computes
representations of the structure of written discourse using cue phrases and
additional information available in syntactic and logical form analyses of a text.
RASTA heuristically scores the rhetorical relations that it hypothesizes, using
those scores to guide it in producing more plausible discourse representations
before less plausible ones. The heuristic scores also provide a genre-
independent method for evaluating competing discourse analyses: the best
discourse analyses are those constructed from the strongest hypotheses.
This dissertation describes in detail a set of linguistic cues that can be
identified in a text as evidence of discourse relations, and gives complete and
explicit algorithms for identifying the terminal nodes of a discourse analysis
and for efficiently combining those terminal nodes to form hierarchical
representations of discourse structure.
TABLE OF CONTENTS
1. Introduction
2. Data
3. Rhetorical Structure Theory
    3.1 Introduction
    3.2 Overview
    3.3 Conditions on the structure of trees
    3.4 Formalizing the relations
    3.5 The set of relations
    3.6 Underspecified Rhetorical Structure Theory
    3.7 Schemas
    3.8 Conclusion
4. Previous Work on Computing Discourse Representations
    4.1 Introduction
    4.2 Rhetorical Structure Theory
        4.2.1 Mann and Thompson (1986, 1988)
        4.2.2 Sumita et al. (1992) and Ono et al. (1994)
        4.2.3 Kurohashi and Nagao (1994)
        4.2.4 Fukumoto and Tsujii (1994)
        4.2.5 Wu and Lytinen (1990)
        4.2.6 Marcu (1996, 1997a)
    4.3 PISA
    4.4 Hobbs (1979)
    4.5 The Linguistic Discourse Model (LDM)
    4.6 Conclusion
5. The Microsoft English Grammar
    5.1 Introduction
    5.2 Lexicon
    5.3 Syntax
        5.3.1 Sketch
        5.3.2 Portrait
    5.4 Logical Form
    5.5 Word Sense Disambiguation
    5.6 Discourse
    5.7 MINDNET
    5.8 Conclusion
6. Cues to Discourse Structure
    6.1 Introduction
    6.2 Correlations between clausal status and rhetorical status
    6.3 The role of anaphora, deixis and referential continuity
    6.4 Heuristic scores
    6.5 Necessary criteria and cues
    6.6 Dependence on a set of relations
    6.7 Cues to the relations
        6.7.1 ASYMMETRICCONTRAST
        6.7.2 CAUSE
        6.7.3 CIRCUMSTANCE
        6.7.4 CONCESSION
        6.7.5 CONDITION
        6.7.6 CONTRAST
        6.7.7 ELABORATION
        6.7.8 JOINT
        6.7.9 LIST
        6.7.10 MEANS
        6.7.11 PURPOSE
        6.7.12 RESULT
        6.7.13 SEQUENCE
7. Constructing Trees
    7.1 Introduction
    7.2 The need for an improved algorithm
    7.3 Identify terminal nodes
    7.4 Posit hypotheses
    7.5 Construct trees
        7.5.1 Promotion sets
        7.5.2 Group mutually exclusive hypotheses
        7.5.3 Produce and rank binary-branching trees
        7.5.4 Produce n-ary branching trees
        7.5.5 Learning the heuristic scores
    7.6 Worked example
8. RASTA’s contributions to the field
    8.1 Introduction
    8.2 Identifying rhetorical relations
    8.3 Representations of knowledge
    8.4 Constructing and evaluating trees
    8.5 Genre
9. Potential Applications for RASTA
    9.1 Introduction
    9.2 Text summarization
    9.3 The creation of semantic networks
    9.4 Information retrieval
    9.5 Quantitative analysis of discourse patterns
10. Conclusion
xii
TABLE OF FIGURES
Figure 1 Waterloo, Battle of
Figure 2 Pseudepigrapha
Figure 3 Trafalgar, Battle of
Figure 4 Definition of the VOLITIONAL CAUSE relation
Figure 5 Taxonomy of discourse relations
Figure 6 Echidna
Figure 7 RST schemas (Mann and Thompson 1988:247)
Figure 8 Alternative discourse structures
Figure 9 Prince Edward Island
Figure 10 Alternative formulations of the same propositional content
Figure 11 Syntactic sketch produced by MEG
Figure 12 Underlying data structure for the sketch
Figure 13 Syntactic portrait produced by MEG
Figure 14 Logical form produced by MEG
Figure 15 Labels used in the logical form
Figure 16 Resolution of reflexive pronoun
Figure 17 Resolution of personal pronoun
Figure 18 Data structure underlying the node drive1
Figure 19 Echidna
Figure 20 The Subordinate Clause Condition
Figure 21 Necessary criteria for the ASYMMETRICCONTRAST relation
Figure 22 Cue to the ASYMMETRICCONTRAST relation
Figure 23 Aardwolf
Figure 24 Bossuet, Jacques Bénigne
Figure 25 Argon
Figure 26 Textiles
Figure 27 Cues to the CAUSE relation
Figure 28 Syrdarya
Figure 29 Pregnancy and childbirth
Figure 30 Necessary criteria for the CAUSE relation when the Subordinate Clause Condition is not satisfied
Figure 31 Cues to the CAUSE relation
Figure 32 Species and speciation
Figure 33 Segregation in the United States
Figure 34 Soil management
Figure 35 Cues to the CIRCUMSTANCE relation
Figure 36 Abiathar
Figure 37 Africa
Figure 38 Trafalgar, Battle of
Figure 39 Acuff, Roy
Figure 40 Cue to the CONCESSION relation
Figure 41 Renaissance Art and Literature
Figure 42 Aardvark
Figure 43 Adventists
Figure 44 Abolitionists
Figure 45 Cue to the CONDITION relation
Figure 46 Prince Edward Island
Figure 47 Prince Edward Island: Syntactic analysis
Figure 48 Pregnancy and Childbirth
Figure 49 Necessary criteria for the CONTRAST relation work-around
Figure 50 Cue to the CONTRAST relation work-around
Figure 51 Textiles
Figure 52 Necessary criteria for the CONTRAST relation
Figure 53 Cues for the CONTRAST relation
Figure 54 Abbess
Figure 55 Primus, Pearl
Figure 56 Aardwolf
Figure 57 Abrasives
Figure 58 Necessary criteria for the ELABORATION relation
Figure 59 Cues to the ELABORATION relation
Figure 60 Aardwolf
Figure 61 Stem
Figure 62 Necessary criteria for the JOINT relation
Figure 63 Religion
Figure 64 Pregnancy and childbirth
Figure 65 Necessary criteria for the LIST relation
Figure 66 Cues to the LIST relation
Figure 67 Psychotherapy
Figure 68 Echidna
Figure 69 Cue to the MEANS relation
Figure 70 Pre-Columbian Art and Architecture
Figure 71 Cues to the PURPOSE relation
Figure 72 Ransome, Arthur Michell
Figure 73 Cues to the RESULT relation
Figure 74 Misparse of a detached participial clause
Figure 75 Waterloo, Battle of
Figure 76 Ramsey, Norman Foster
Figure 77 God
Figure 78 Necessary criteria for the RESULT relation when the Subordinate Clause Condition is not satisfied
Figure 79 Cues for the RESULT relation when the Subordinate Clause Condition is not satisfied
Figure 80 Speech and Speech Disorders
Figure 81 Propane
Figure 82 Necessary criteria for the SEQUENCE relation
Figure 83 Logical form illustrating negative polarity
Figure 84 Acquired Immune Deficiency Syndrome
Figure 85 Moissan, Ferdinand-Frederic-Henri
Figure 86 Abacha, Sani
Figure 87 Cues for the SEQUENCE relation
Figure 88 Waterloo, Battle of
Figure 89 World War II
Figure 90 Compare dates
Figure 91 Waterloo, Battle of
Figure 92 Criteria for an RST terminal node
Figure 93 Aardvark
Figure 94 Data structure of a hypothesized symmetrical rhetorical relation
Figure 95 Data structure of a hypothesized asymmetric rhetorical relation
Figure 96 Binary-branching tree for Abd-ar-Rahman excerpt
Figure 97 Binary-branching tree for Aardwolf excerpt
Figure 98 Data structure of an underspecified asymmetric rhetorical relation
Figure 99 Pseudo-code for constructing RST trees
Figure 100 Corresponding binary and n-ary branching symmetric RST trees
Figure 101 Corresponding binary and n-ary branching asymmetric RST trees
Figure 102 Corresponding binary and n-ary branching complex RST trees
Figure 103 Pseudo-code for the function BinaryToNaryTree
Figure 104 Rankings of RST trees
Figure 105 Aardwolf
Figure 106 Analysis of the first sentence
Figure 107 Analysis of the second sentence
Figure 108 Analysis of the third sentence
Figure 109 Analysis of the fourth sentence
Figure 110 Bags for the excerpt
Figure 111 Hypothesized relations for the excerpt
Figure 112 Terminal nodes and initial projections
Figure 113 Contents of RSTNODES after applying hypothesis 4
Figure 114 Contents of RSTNODES after applying hypothesis 6
Figure 115 Contents of RSTNODES after applying hypothesis 1
Figure 116 Contents of RSTNODES after further processing
Figure 117 First complete RST tree for Aardwolf excerpt
Figure 118 Aardwolf
Figure 119 Hypertext view of Abd-ar-Rahman text
Figure 120 Conjunctivitis
Figure 121 Hypertext view of conjunctivitis text
Figure 122 Hypertext view of conjunctivitis text
1. Introduction
This dissertation describes a system for computing representations of the
structure of written discourse. This system, RASTA (Rhetorical Structure Theory
Analyzer), takes as its input a written representation of a text and produces as its
output a representation of the structure of that text in the form of an n-ary
branching tree of the kind used within Rhetorical Structure Theory (henceforth
RST) (Mann and Thompson 1986, 1988).
Sanders and van Wijk (1996:91) note that “Existing models for text
structure analysis tend to rely heavily on analysts’ intuitions and world
knowledge, and they are hardly formulated explicitly enough to be applied in an
objective and reliable way”. Computers are of course notorious for their lack of
linguistic intuition. If a computer is to identify discourse structure, we therefore
require maximally explicit algorithms. These algorithms ought also to be efficient
if a computational discourse analyzer is to have any utility. This dissertation
therefore addresses two distinct problems:
1. How can we automatically identify linguistic cues to discourse
structure?
2. How can a discourse module efficiently construct plausible
representations of discourse structure on the basis of those cues?
Due to the emphasis on natural language generation in computational work on
RST, neither of these issues has received much attention in the field of
computational linguistics.
It has been widely assumed, or even asserted, that reasoning beyond
textual form is needed to compute a representation of the structure of a text (see
sections 4 and 8.2). In contrast, the development of RASTA has been guided by a
functionalist approach to analyzing language. Writers employ linguistic resources
—morphology, the lexicon, syntax—to realize their communicative goals. In
employing these linguistic resources, a text is molded, taking on a specific form
from which it is possible to infer the writer’s communicative goals. The first
stage of RASTA’s operation addresses the first problem, “How can we
automatically identify linguistic cues to discourse structure?”, from this
functionalist perspective. RASTA examines the syntactic analysis and logical form
analysis of a text, considering such cues to discourse structure as cue phrases,
tense, aspect, polarity and referential continuity of noun phrases. On the basis of
these cues, RASTA posits discourse relations between clauses, associating a
heuristic score with each relation that reflects a relative confidence in the
plausibility of the discourse relation. When RASTA has finished hypothesizing
discourse relations, it commences the second stage of its analysis, a stage that
addresses the second problem, “How can a discourse module efficiently construct
plausible representations of discourse structure on the basis of those cues?”
During the second stage, RASTA assembles well-formed RST trees that are
compatible with the posited discourse relations. RASTA applies the posited
discourse relations with high heuristic scores before those with lower heuristic
scores in a bottom-up manner, grouping contiguous clauses into a hierarchical
representation. Because RASTA is guided by the heuristic scores, it rapidly
converges on the best discourse analyses for a text.
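The second stage can be pictured with a minimal sketch. The code below is only an illustration of score-guided, bottom-up grouping of contiguous clauses; it is not RASTA's algorithm, which chapter 7 describes. The `Hypothesis` and `Node` types, the function `build_forest`, the choice of relations and the numeric scores are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hypothesis:
    nucleus: int    # index of the clause proposed as nucleus (hypothetical encoding)
    satellite: int  # index of the clause proposed as satellite
    relation: str   # relation label, e.g. "ELABORATION"
    score: int      # heuristic score; higher means more plausible


@dataclass
class Node:
    start: int                       # first clause covered by this span
    end: int                         # last clause covered by this span
    head: int                        # clause that represents the whole span
    relation: Optional[str] = None   # None for terminal nodes
    children: Tuple["Node", ...] = ()


def build_forest(n_clauses: int, hypotheses: List[Hypothesis]) -> List[Node]:
    """Greedily merge adjacent spans, trying high-scoring hypotheses first."""
    forest = [Node(i, i, i) for i in range(n_clauses)]
    pending = sorted(hypotheses, key=lambda h: h.score, reverse=True)
    applied = True
    while applied and len(forest) > 1:
        applied = False
        for hyp in pending:
            wanted = {hyp.nucleus, hyp.satellite}
            for i in range(len(forest) - 1):
                left, right = forest[i], forest[i + 1]
                # A hypothesis applies only to two adjacent spans whose
                # representative clauses are the ones it relates.
                if {left.head, right.head} == wanted:
                    merged = Node(left.start, right.end, hyp.nucleus,
                                  hyp.relation, (left, right))
                    forest[i:i + 2] = [merged]
                    pending.remove(hyp)
                    applied = True
                    break
            if applied:
                break  # restart from the highest-scoring pending hypothesis
    return forest


# Four clauses and four invented hypotheses.
hypotheses = [
    Hypothesis(0, 1, "ELABORATION", 35),
    Hypothesis(2, 3, "CIRCUMSTANCE", 30),
    Hypothesis(0, 2, "RESULT", 20),
    Hypothesis(1, 2, "SEQUENCE", 10),
]
forest = build_forest(4, hypotheses)
root = forest[0]
```

In this invented example all four clauses end up under a single RESULT node whose two children are the ELABORATION and CIRCUMSTANCE spans. The low-scoring SEQUENCE hypothesis is never applied, because clause 1 has already been absorbed as a satellite by the time it is considered. The sketch returns only one analysis, whereas RASTA produces and ranks competing analyses.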
Human readers, no doubt, employ knowledge outside of a text to aid in its
interpretation, drawing on such factors as world knowledge, genre conventions
and plausible inferences. Rather than attempting to model such extrinsic
knowledge and thereby mimic the current understanding of the mental processes
of human readers, RASTA proceeds under the assumption that the text itself
contains sufficient clues to enable a computer to compute a feasible
representation of its discourse structure, and therefore posits discourse relations
solely on the basis of a linguistic analysis of the text. Although I do not wish to
draw unwanted mentalist inferences on the basis of what is computationally
feasible, it would not be surprising if it were to turn out that some of the
superficial cues to discourse structure employed by RASTA were also employed
by human readers. The psychological reality of the cues employed by RASTA is,
however, a matter for separate experimental investigation.
Despite the emphasis on computational considerations in this dissertation,
I hope that the results of this study will be accessible and of interest to researchers
in discourse who do not possess a computational bent.
Representations of the structure of a text are by no means an end in
themselves. Rather, such representations are expected to prove useful for future
work on information retrieval and on the automatic acquisition of knowledge.
Moreover, a reliable automated means of identifying discourse structure opens the
way to large-scale empirical analyses of discourse. For example, it would be
possible to consider, in a given genre, what linguistic devices are most commonly
used to realize particular textual relations.
This dissertation has the following structure. Following a brief description
of the data analyzed (chapter 2), it proceeds to a description of RST and the
modifications made to the standard theory to fit it to the task at hand (chapter 3),
then to a survey of previous work on computing representations of discourse
structure (chapter 4). Chapter 5 presents a brief overview of the Microsoft
English Grammar (MEG), within which RASTA is a component. Chapters 6 and 7
describe how RASTA identifies cues to discourse structure using the resources
available within MEG and then constructs n-ary branching RST trees on the basis
of the cues that it identifies. Chapter 8 brings together the algorithms described in
chapters 6 and 7 and clarifies RASTA’s contribution to the field of computational
discourse processing. Finally, chapters 9 and 10 conclude the dissertation, pointing
to future research directions.
2. Data
For the present study, the data is limited to the text of the articles in
Encarta 96 (Microsoft Corporation 1995, henceforth simply Encarta), an
electronic multimedia encyclopedia of broad coverage, aimed at a general non-
specialist audience. These articles form a corpus of a little over ten million words,
in approximately 576,000 sentences.
Part of the ongoing research in the Microsoft Natural Language
Processing Research Group concerns the acquisition of knowledge from natural
language texts. An extensive semantic network, MINDNET (section 5.7),
consisting of 120,000 head words and approximately seven million labeled arcs
connecting lexical senses of those head words has been constructed automatically
by parsing dictionary definitions (Dolan 1995; Dolan et al. 1993; Richardson
1997; Richardson et al. 1993; Vanderwende 1995a, 1995b). These dictionary
definitions are typically noun phrases or single clauses. While the same
techniques used to acquire information from dictionary definitions could
reasonably be applied to individual sentences in free text, knowledge of text
structure is sure to improve the extraction of information.
The content of Encarta is of a high caliber, with many articles having been
contributed by recognized authorities in their fields. The content is therefore
worth acquiring into a semantic network. Furthermore, the content is non-
controversial. The articles tend to represent views and information whose
interpretation is widely accepted. This is an advantage for the task of
automatically acquiring knowledge, since the philosophically (and
computationally) difficult task of integrating and resolving conflicting
information can be avoided.
The text in Encarta has several pragmatic advantages from the point of
view of computational analysis. The most important advantage is that the articles
are well-edited: sentences are generally free of spelling or grammatical errors,
since an in-house style guide is used to ensure a high degree of consistency in
punctuation, lexical usage, and syntax. Although a computational system for
broad-coverage analysis (such as the Microsoft English Grammar described in
chapter 5) ought to be able to cope with occasional errors in text, a computational
syntactic analysis can be expected to achieve a higher degree of accuracy when the
text is well edited.
Although the articles are written to conform to an in-house style guide, the
discourse structure of the text exhibits great variety. The diversity of the
discourse structure has many causes. Many authors outside of the editorial team
(often specialists in a given field) have contributed articles—the article on
Language, for example, was contributed by Bernard Comrie and the article on
Native American Languages was contributed by Lyle Campbell. The diverse
subject matter of the articles in Encarta also motivates the diverse discourse
structure. For example, there are descriptions of physical objects, accounts of
historical battles, and explanations of religious views. Even within a single
article, however, there can be considerable complexity in the discourse structure.
Facts are not merely listed; they are presented in a coherent manner.
In conclusion, the text of Encarta, although edited to conform to a style
guide, exhibits great variety. The articles in Encarta are intended to be read by
non-specialists, and take the form of coherent texts. In section 8.5, I consider how
the research presented in this dissertation might need to be extended to apply to
other genres.
3. Rhetorical Structure Theory
3.1 Introduction
RST was developed during the 1980s by researchers in natural language
generation, many of whom were then involved with projects at the Information
Sciences Institute in Southern California. Since much of my research is informed
by the theoretical approach taken within RST, it is first necessary to outline the
theory, criticisms of it, and the modifications which I have made to adapt it to my
purposes.
3.2 Overview
RST (Mann and Thompson 1986, 1988) models the discourse structure of
a text by means of a hierarchical tree diagram. The terminal nodes of an RST tree
are propositions encoded in text. (Although RST analysts usually take care to
distinguish contiguous stretches of text, termed text spans, from the propositions
expressed in the text, in the discussion below I will simply refer to text spans.)
Non-terminal nodes represent contiguous text spans, whose daughter spans are
joined by discourse relations. These discourse relations are of two kinds:
symmetric and asymmetric.
A symmetric relation involves two or more text spans, each of which is
equally important in realizing the writer’s goals. By convention, each of these text
spans is labeled a nucleus. Figure 1 illustrates one kind of symmetric relation, the
SEQUENCE relation.1 Straight lines are used to represent the connection between the
child nodes of a symmetric relation and their parent node.
1 Unless otherwise indicated, all examples in this dissertation are taken from Encarta.
1. Napoleon met defeat in 1814 by a coalition of major powers,
notably Prussia, Russia, Great Britain, and Austria.
2. Napoleon was then deposed
3. and exiled to the island of Elba
4. and Louis XVIII was made ruler of France.
Figure 1 Waterloo, Battle of
An asymmetric relation involves exactly two text spans. One text span, the
nucleus, is more important in realizing the writer’s goals. The other text span, the
satellite, is in a dependency relation to the nucleus, modifying it in ways specified
in the definition of the particular relation. Figure 2 illustrates one kind of
asymmetric relation, the ELABORATION relation. A labeled arc is used to represent
the connection between the satellite and the nucleus. The arrowhead on the arc
points to the nucleus.
1. In most cases, Pseudepigrapha are modeled on canonical
books of a particular genre.
2. For example, Judith is inspired by the historical books of the Old
Testament.
Figure 2 Pseudepigrapha
Although units as large as paragraphs, sections or chapters may be used as
terminal nodes for a coarse-grained analysis, the terminal nodes of an RST tree are
usually clauses with “independent functional integrity” (Mann and Thompson
1988:248). Restrictive relative clauses, which by definition serve to modify a
head noun and are therefore not directly in significant discourse relations to other
clauses, do not qualify as minimal textual spans under this criterion. (I have also
chosen to disregard non-restrictive relative clauses on the grounds that they also
serve to modify a head noun.) Similarly, clausal subjects and complements do not
qualify as terminal nodes.
A nucleus or a satellite may be a tree with internal complexity. RST thus
claims that the same structural representation can be used for the relationship
between two adjacent clauses, or for the relationship between any two arbitrarily
large text spans. Figure 3 illustrates a plausible RST tree to represent the structure
of a brief excerpt from Encarta 96. In Figure 3, a RESULT relation connects two
text spans—the span consisting of clauses 1 and 2 and the span consisting of
clauses 3 and 4—where each text span has internal structure, represented as an
RST subtree. (See Mann and Thompson (1988) for definitions of the relations
employed here.)
1. Nelson, however, surprised his adversary
2. by ordering his ships into two groups, each of which assaulted
and cut through the French fleet at right angles, demolishing the
battle line;2
3. this bold strategy created confusion,
4. giving the British fleet an advantage.
Figure 3 Trafalgar, Battle of
2 A few relative clauses in Encarta contain “mini-discourses”. In this example, there is
a SEQUENCE relation between “assaulted” and “cut through the French fleet at right angles”, with a
RESULT relation between this SEQUENCE and the clause “demolishing the battle line”. To avoid
excessive granularity, I do not construct RST analyses within relative clauses, although in
principle the same techniques could be used to construct representations for these mini-
discourses.
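The tree model described in this section can be sketched as a small data structure. The following is a minimal illustration, not a description of how any RST tool stores its trees: the class names `Span` and `Relation`, the `satellite` field, and the helper `leaves` are hypothetical, and the choice of the second span as the satellite in Figure 2 is an assumption based on the ELABORATION relation.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Span:
    """Terminal node: a minimal text span (usually a clause)."""
    text: str


@dataclass
class Relation:
    """Non-terminal node: daughters in text order, joined by one relation."""
    label: str
    children: List["Node"]
    # Position of the satellite among the children; None means the relation
    # is symmetric and every child is a nucleus.
    satellite: Optional[int] = None

    def __post_init__(self) -> None:
        if self.satellite is None:
            if len(self.children) < 2:
                raise ValueError("a symmetric relation needs two or more nuclei")
        elif len(self.children) != 2 or self.satellite not in (0, 1):
            raise ValueError("an asymmetric relation joins exactly two spans")

    @property
    def nuclei(self) -> List["Node"]:
        return [c for i, c in enumerate(self.children) if i != self.satellite]


Node = Union[Span, Relation]


def leaves(node: Node) -> List[str]:
    """Return the terminal text spans under a node, in text order."""
    if isinstance(node, Span):
        return [node.text]
    return [text for child in node.children for text in leaves(child)]


# Figure 1: a symmetric SEQUENCE relation with four nuclei.
figure_1 = Relation("SEQUENCE", [
    Span("Napoleon met defeat in 1814 by a coalition of major powers, "
         "notably Prussia, Russia, Great Britain, and Austria."),
    Span("Napoleon was then deposed"),
    Span("and exiled to the island of Elba"),
    Span("and Louis XVIII was made ruler of France."),
])

# Figure 2: an asymmetric ELABORATION relation; the second span is assumed
# to be the satellite.
figure_2 = Relation("ELABORATION", [
    Span("In most cases, Pseudepigrapha are modeled on canonical books "
         "of a particular genre."),
    Span("For example, Judith is inspired by the historical books of the "
         "Old Testament."),
], satellite=1)
```

Because a child of a `Relation` may itself be a `Relation`, the same representation serves for two adjacent clauses and for arbitrarily large text spans.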
3.3 Conditions on the structure of trees
Four criteria determine the well-formedness of an RST tree (Mann and
Thompson 1988):
1. Completeness: a single tree covers the entire text.
2. Connectedness: each text span in the text, with the exception of the text span
which covers the entire text, is a node in the tree.
3. Uniqueness: text spans have a single parent.
4. Adjacency: only adjacent text spans can be grouped together to form larger
text spans.
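The four conditions above lend themselves to a mechanical check. The sketch below assumes a hypothetical encoding in which each text span is a pair of inclusive, 1-based clause indices and a tree is a mapping from each non-terminal span to its daughter spans. The function name `well_formedness_violations` and the reading of Connectedness as "every span other than the root is attached to a parent" are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

SpanId = Tuple[int, int]  # (first unit, last unit), inclusive, 1-based


def well_formedness_violations(
    groupings: Dict[SpanId, List[SpanId]], n_units: int
) -> List[str]:
    """Check a set of span groupings against the four conditions.

    `groupings` maps each non-terminal span to its daughter spans.
    """
    problems: List[str] = []
    root = (1, n_units)

    # Completeness: one span covers the entire text.
    if root not in groupings:
        problems.append("completeness: no span covers the entire text")

    # Uniqueness: no span is a daughter of more than one parent.
    parent_count: Dict[SpanId, int] = {}
    for daughters in groupings.values():
        for d in daughters:
            parent_count[d] = parent_count.get(d, 0) + 1
    for span, count in sorted(parent_count.items()):
        if count > 1:
            problems.append(f"uniqueness: {span} has {count} parents")

    # Connectedness: every span except the root is attached to a parent.
    all_spans = set(groupings) | set(parent_count)
    all_spans |= {(i, i) for i in range(1, n_units + 1)}
    for span in sorted(all_spans):
        if span != root and span not in parent_count:
            problems.append(f"connectedness: {span} is not attached")

    # Adjacency: daughters are contiguous and together fill the parent span.
    for parent, daughters in sorted(groupings.items()):
        ordered = sorted(daughters)
        contiguous = all(a[1] + 1 == b[0] for a, b in zip(ordered, ordered[1:]))
        if not ordered or not contiguous or (ordered[0][0], ordered[-1][1]) != parent:
            problems.append(f"adjacency: daughters of {parent} are not adjacent")

    return problems


# The shape of Figure 3: clauses 1-2 and clauses 3-4 each form a subtree.
figure_3_shape = {
    (1, 4): [(1, 2), (3, 4)],
    (1, 2): [(1, 1), (2, 2)],
    (3, 4): [(3, 3), (4, 4)],
}
# A malformed grouping that skips clause 2.
skips_clause_2 = {
    (1, 4): [(1, 1), (3, 4)],
    (3, 4): [(3, 3), (4, 4)],
}
```

Under this encoding the shape of Figure 3 yields no violations, whereas the second grouping is reported for both Connectedness and Adjacency.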
3.4 Formalizing the relations
Four parameters are used in describing RST relations (Mann and
Thompson 1988:245):
1. Constraints on the nucleus
2. Constraints on the satellite
3. Constraints on the combination of nucleus and satellite
4. The effect
Figure 4 gives the definition of the VOLITIONAL CAUSE relation (Mann
and Thompson 1988:274-275). N stands for nucleus, S for satellite, W for
writer, and R for reader.
Relation name: VOLITIONAL CAUSE
Constraints on N: presents a volitional action or else a situation that could
have arisen from a volitional action
Constraints on S: none
Constraints on the N+S combination:
S presents a situation that could have caused the agent of
the volitional action in N to perform that action;
without the presentation of S, R might not regard the action
as motivated or know the particular motivation;
N is more central to W’s purposes in putting forth the N-S
combination than is S.
The effect: R recognizes the situation presented in S as a cause for the
volitional action presented in N
Locus of the effect: N and S
Figure 4 Definition of the VOLITIONAL CAUSE relation
As the epistemic modal phrases “could have” and “might not” make clear, the
definition of an RST relation allows for subjective evaluation on the part of the
analyst. The analyst proposes a judgment of the plausibility of suggesting that the
writer intended a certain effect (Mann and Thompson 1988:245). This
subjectivity is mitigated by the fact that different analysts tend to agree in their
analyses, or at least to be able to see the validity of one another’s analyses (Mann
and Thompson 1988:265).
It is important to note that the description of an RST relation does not
include a description of the linguistic forms employed to realize the relation.
Indeed, Mann and Thompson (1986:68, 70-72) note that “relational propositions
arise in a text independently of any specific signals of their existence”. Mann and
Thompson (1986:71-72) even go so far as to suggest that a search for subtle
correlates of discourse relations is futile, a claim to which I return in section 4.2.
3.5 The set of relations
Although there is widespread acceptance by advocates of RST and
advocates of other theories of discourse (among them, Ballard et al. 1971; Grimes
1975; Halliday and Hasan 1976; Longacre 1976; Hobbs 1979) that relations of
the type proposed by RST are useful for describing the structure of discourse,
several questions arise:
1. How many relations are there?
2. How do we justify a particular set of relations?
3. How are the relations organized?
In answer to the question “How many relations are there?”, Hovy (1990)
identifies a total of approximately 350 relations which have been posited in the
linguistics, philosophy, and artificial intelligence literature. Within RST, for
example, Mann and Thompson (1986) propose fifteen relations, Mann and
Thompson (1988) propose twenty-three, and Fox (1987) proposes thirteen. It is
not the case however that Mann and Thompson (1988), in which twenty-three
relations are proposed, contains a superset of the relations in Mann and
Thompson (1986) or Fox (1987).
Hovy distinguishes a Parsimonious Position, advocated by Grosz and
Sidner (1986) in their work on Centering and Focusing, which posits two very
basic relations, Dominance and Satisfaction-Precedence. These two relations are
claimed to be sufficient for describing speaker intentions in discourse. Indeed,
Grosz and Sidner (1986) claim that it is futile to try to identify a larger finite set
of relations, since closer inspection always reveals increasingly subtle semantic
nuances.
Although two broad relations might be sufficient to describe the
intentional structure of discourse, they have been found to be insufficient for the
computational generation of natural language (McKeown 1985; Hovy 1988,
1990). This inadequacy motivates what Hovy labels the Profligate Position,
whose adherents claim that some tens of relations are needed to adequately
describe the structure of discourse.
One such profligate position is that advocated by Mann and Thompson
(1988), who suggest classifying some relations as primarily concerned with
subject matter and others as primarily presentational, although they decline to
impose a single taxonomy on the set of relations which they posit. Others,
however, have attempted to devise taxonomies of discourse relations. Hovy
(1990), for example, proposes a taxonomy which he claims subsumes the
approximately 350 discourse relations posited in the literature he surveys. Hovy’s
three top-level branches, Elaboration, Enhancement and Extension, are taken
from the expansion types of complex clauses within Systemic Functional
Grammar (Halliday 1985), a linguistic theory which has had considerable
influence on RST. Hovy (1990:133) cites the inclusiveness of cue words and
phrases as evidence of the correctness of the taxonomy. Cue words and phrases
associated with a node can be felicitously used in realizing relations occurring as
daughters of that node, but cue words and phrases cannot necessarily be
felicitously used with relations occurring as sister or parent nodes. For example,
the conjunction “then” is associated with the SEQUENCE relation, and can be used
for its daughter relations, as examples (1) and (2) (Hovy 1990:133) show. (The
grammaticality judgments given below are Hovy’s.)
(1) SEQTEMPORAL: First you play the long note, then the short ones.
(2) SEQSPATIAL: On the blue wall I have a red picture, then a blue
one.
The cue words “after” and “beside”, however, are limited to the
SEQTEMPORAL and SEQSPATIAL relations respectively, as examples (3) and (4)
(Hovy 1990:133) show.
(3) SEQTEMPORAL: After/*Beside the long note you play the short
ones.
(4) SEQSPATIAL: Beside/*After the red picture is the blue one.
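The inheritance pattern illustrated by examples (1) to (4) can be sketched as a lookup over a taxonomy in which a relation may use its own cue words and those of its ancestors, but not those of its sisters. The fragment below is a toy: the dictionaries `PARENT` and `CUES` and the function `licensed_cues` are hypothetical names, only the three relations from the examples are included, and SEQUENCE is treated as the root simply because its position in the full taxonomy is not given here.

```python
from typing import Dict, Optional, Set

# Toy fragment of a relation taxonomy: child -> parent (None marks the root
# of this fragment, not necessarily the root of the full taxonomy).
PARENT: Dict[str, Optional[str]] = {
    "SEQUENCE": None,
    "SEQTEMPORAL": "SEQUENCE",
    "SEQSPATIAL": "SEQUENCE",
}

# Cue words attached directly to each node, taken from examples (1) to (4).
CUES: Dict[str, Set[str]] = {
    "SEQUENCE": {"then"},
    "SEQTEMPORAL": {"after"},
    "SEQSPATIAL": {"beside"},
}


def licensed_cues(relation: str) -> Set[str]:
    """Cues attached to the relation itself or to any of its ancestors."""
    cues: Set[str] = set()
    node: Optional[str] = relation
    while node is not None:
        cues |= CUES.get(node, set())
        node = PARENT[node]
    return cues


print(sorted(licensed_cues("SEQTEMPORAL")))  # ['after', 'then']
print(sorted(licensed_cues("SEQSPATIAL")))   # ['beside', 'then']
```

A cue that appears under one sister but not the other, such as “beside” for SEQTEMPORAL, is never returned, which mirrors the starred variants in examples (3) and (4).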
Maier and Hovy (1991) reject the taxonomy of Hovy (1990) on the
grounds that it fails to recognize the communicative differences between the
various relations. Instead, they propose a three-way top-level distinction based on
the three meta-functions of language within Systemic Functional Grammar:
Ideational: reflecting facts about the world
Interpersonal: involving the reader’s attitudes towards the propositional
content
Textual: purely for presentational purposes
Maier and Hovy’s taxonomy is an elaboration of the subject matter versus
presentational distinction in Mann and Thompson (1988), with Ideational
corresponding to Mann and Thompson’s subject matter relations and Textual
corresponding to Mann and Thompson’s presentational relations.
Like Maier and Hovy (1991), Wu and Lytinen (1990) propose a three-way
classification of RST relations for persuasive texts such as advertisements. They
classify the relations according to a semantic analysis into three speech actions:
clarify, make adequate, and remind.
The motivation for Wu and Lytinen’s (1990) taxonomy is unclear. Indeed,
the primary justification for the set of relations in various works on RST is that the
relations posited are descriptively adequate. Mann and Thompson (1988:259), for
example, cite the thousands of clauses that they have successfully analyzed from a
range of genres as evidence for the efficacy of RST. Others (Sanders 1992;
Sanders et al. 1992, 1993; Knott and Dale 1995; Sanders and van Wijk 1996) find
descriptive adequacy to be an unsatisfying primary justification for a set of
rhetorical relations, and instead prefer to view relations as psychological
constructs. Knott and Dale (1995), for example, note that “descriptive adequacy”
is only meaningful if there is a clear purpose for which the descriptions must be
adequate. Moreover, they observe that the sets of relations posited by researchers
within RST are not as diverse as might be expected if descriptive adequacy were
the only criterion, suggesting that analysts are relying on their intuitions in
formulating plausible sets of relations.
Sanders, Spooren and Noordman (1992, 1993) propose a set of cognitive
primitives that can be combined to yield various classes of discourse relations.
Finer distinctions could be made by adding parameters. The primitives concern
the causal nature of the relation, whether the relation is coherent on semantic or
pragmatic grounds, whether a relation has a basic order, with the antecedent on
the left, or a non-basic order, and whether the polarity of the relation is positive
or negative (involving a violation of expectations). Sanders, Spooren and
Noordman tested the validity of their relations by having analysts apply the
relations to texts, and by having non-linguists decide among cue-phrases for texts.
Knott and Dale (1995) criticize the basis of some of Sanders et al.’s
parameters, especially the notion of basic order. Knott and Dale also criticize the
coarse-grained classes of relations described by Sanders et al.’s parameters, and
the notion that adding more parameters would continue to yield neatly divided
relations. Despite these criticisms, Knott and Dale still prefer to view relations of
the type posited in RST as psychologically valid. Whereas Hovy (1990) proposes
using the generalizability of cue words and phrases simply as a test of the validity
of a taxonomy of discourse relations, Knott and Dale use this test as a means to
construct a taxonomy of discourse relations from the ground up, noting that
“linguistic devices (in particular, cue phrases) can be taken as evidence for
relations, provided these are conceived as constructs which people actually use
when creating and interpreting text” (Knott and Dale 1995:46). They comment
that
“Studying the means available for marking relations in a given
language should be able to tell us about the relations which people
actually make use of. The methodology might be described in
Hallidayan terms, as using the cohesive devices a language affords
as evidence for a psychological theory of text coherence.” (Knott
and Dale 1995:45, original emphasis)
Knott and Dale propose a method for isolating cue phrases and a method for
testing the generalizability of those cue phrases, and then construct a taxonomy of
discourse relations. They find analogues of the original RST relations SEQUENCE,
CONTRAST, CIRCUMSTANCE, CAUSE and RESULT. Interestingly, they find no basis
for the distinction in RST between VOLITIONAL-RESULT and NON-VOLITIONAL-
RESULT, no cue phrases associated with EVALUATION or BACKGROUND, and no
single phrase associated with ELABORATION.
In section 3.6 I outline and motivate the set of relations used in the present
study, and suggest how the relations might be organized and applied within what
I will term an “underspecified” view of RST.
3.6 Underspecified Rhetorical Structure Theory
Vander Linden (1993:6) observes that “Instructional text tends to have a
fairly simple intentional structure, and a more complex rhetorical one”. Similarly,
Maier and Hovy observe that
“In most cases, we believe, ideational and textual relations are
subordinated to interpersonal ones (that is, they structure a
discourse that is motivated by and fulfills an interpersonally
related communicative function). … In general, an interpersonal
text plan is pursued until, at some point, it bottoms out into a call
for the presentation of information…in the extreme case it is even
possible that a whole text is governed by a single DESCRIBE, as
with encyclopedia entries.” (Maier and Hovy 1991:6, emphasis
added)
This is certainly true of articles in Encarta 96. Articles are structured almost
exclusively in terms of ideational and textual relations subordinated to a speech
act like DESCRIBE or EXPLAIN.
Of the original set of RST relations (Mann and Thompson 1986, 1988) the
following interpersonal relations do not appear to be needed at all for an adequate
analysis of articles in Encarta 96: ANTITHESIS, ENABLEMENT, EVALUATION,
INTERPRETATION, MOTIVATION, and SOLUTIONHOOD. In fact the only
interpersonal relation which is needed for an analysis of articles in Encarta 96 is
CONCESSION.
One criticism of RST is that a text can simultaneously have both an
intentional and an informational representation (Ford 1986; Moore and Pollack
1992), but that these intentional and informational representations will not
necessarily have the same structure. RST is not able to represent this possible
mismatch between intentional and informational representations, since it requires
that an analyst choose one relation to relate two text spans, necessitating a choice
between either an intentional relation or an informational relation. For the articles
in Encarta 96, since ideational and textual relations predominate, this criticism is
of less concern.
At least the following thirteen relations appear to be needed for the
analysis of Encarta 96 articles: ASYMMETRICCONTRAST, CAUSE, CIRCUMSTANCE,
CONCESSION, CONDITION, CONTRAST, ELABORATION, JOINT, LIST, MEANS,
PURPOSE, RESULT, SEQUENCE. As per Knott and Dale (1995), I do not distinguish
VOLITIONAL RESULT from NON-VOLITIONAL RESULT. I also do not distinguish
VOLITIONAL CAUSE from NON-VOLITIONAL CAUSE. These thirteen relations are a
relatively uncontroversial common subset of all the rhetorical relations that have
been proposed (Hovy 1990 and the references therein). Not only are these
relations uncontroversial, they are also ones that can reliably be identified by
automatic means (chapter 6). The approach and insights outlined in this
dissertation would not be nullified should a different set of relations be used. For
example, if the CAUSE relation were broken down into VOLITIONAL CAUSE and
NON-VOLITIONAL CAUSE (as per Mann and Thompson 1988) and if linguistic
cues could be found which reliably identified each of these two relations, then the
remaining architecture outlined below would remain unchanged. In particular, the
algorithm that constructs RST trees on the basis of a set of hypothesized discourse
relations (chapter 7) is not sensitive to the peculiar attributes of specific relations,
and would therefore still operate if a different set of relations were used.
With the exception of the interpersonal relation CONCESSION, these
thirteen relations cannot be usefully distinguished by appeal to the three meta-
functions of language used by Maier and Hovy (1991). Instead, I propose the
following simple taxonomy which makes a two-way top-level distinction between
symmetric and asymmetric relations:
Figure 5 Taxonomy of discourse relations
In view of the discussion in section 3.5 concerning debates in the literature about
sets of relations and taxonomies of those relations, some discussion of the top-
level distinction between symmetric and asymmetric relations is in order.
Mann and Thompson (1988) suggest that the distinction between a
nucleus and a satellite reflects differences in the organization of text. The nucleus
is “more deserving of response” (Mann and Thompson 1988:270), while the
satellite gains its significance only in relation to a nucleus. In an asymmetric
relation, the decision to encode something as a nucleus reflects the relative
importance of the proposition in expressing the writer’s goals, as opposed to the
ancillary status of satellite material. In a symmetric relation, equal importance is
attached to the propositions expressed in all the daughter nodes. In this sense, the
nodes in a symmetric relation can all be considered to be nucleic.
In Encarta 96, it is usually clear whether a text span stands as a nucleus or
a satellite to another text span, or whether in fact no direct discourse relation
holds between the two spans. It is occasionally less clear exactly what relation
ought to be posited. Marcu (1997a) makes a similar observation concerning the
RST trees constructed by two analysts for five small texts. The analysts tended to
agree about which nodes were nuclei and which were satellites, even if they
differed in the labels they assigned to the relationships linking those nodes. This
suggests that the task of constructing discourse representations can be broken
down into two components: identifying whether a symmetric or asymmetric
relation holds, and labeling that relation. For a computational system, it is
desirable to avoid the construction of great numbers of trees which have the same
shape but differ only in their labeling. Rather than construct many trees with the
same structure, RASTA represents these alternatives as a list of labels on a given
node. The use of a list of labels is intended to represent the indeterminacy among
the various labels, not to suggest that all of the labels apply simultaneously. In
this sense, the trees are underspecified: the overall structure of the tree is given,
but labeling can vary from determinate, i.e. a single label is the most plausible, to
less determinate, i.e. multiple labels are plausible. There is a third, albeit rare,
possibility: no label is plausible, the implications of which I consider now.
In rare instances, it is not clear that any label is appropriate, although a
symmetric versus asymmetric distinction can be made. In these cases, RASTA
labels the relation with a question mark, as illustrated in Figure 6.
1. The legs have powerful claws,
2. adapting the animal for rapid digging into hard ground.
Figure 6 Echidna
This then raises the question of why a simple distinction between symmetric and
asymmetric relations is not sufficient. That is, why does RASTA attempt a finer-
grained analysis? The attempt to identify meaningful labels for relations wherever
possible is motivated by the uses to which the output of a computational system
like RASTA might be put (chapter 9). Text summarization, information retrieval,
and the extraction of information from written text would all benefit from
meaningfully labeled relations. For example, to locate a section of text within a
document which might answer the question “Why…” it is useful to distinguish a
CAUSE relation.
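The underspecified representation discussed above can be sketched as a tree whose relation nodes carry a list of candidate labels rather than a single label. This is an illustration under stated assumptions, not RASTA's actual implementation: the class `Node`, the functions `labeled_tree_count` and `may_answer_why`, and the labels in the example tree are all hypothetical.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class Node:
    """One node of an underspecified tree.

    Terminal nodes have no children and no labels. Relation nodes carry every
    plausible label, or the single label "?" when no label is plausible.
    """
    labels: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def labeled_tree_count(node: Node) -> int:
    """Number of fully labeled trees condensed into this one structure."""
    if not node.children:
        return 1
    return len(node.labels) * prod(labeled_tree_count(c) for c in node.children)


def may_answer_why(node: Node) -> bool:
    """True if some relation in the tree has CAUSE among its candidate labels."""
    return "CAUSE" in node.labels or any(may_answer_why(c) for c in node.children)


# Hypothetical tree: the labels are illustrative, not taken from an Encarta analysis.
inner = Node(["CAUSE", "MEANS"], [Node(), Node()])
unclear = Node(["?"], [Node(), Node()])
root = Node(["ELABORATION", "CIRCUMSTANCE"], [inner, unclear])

print(labeled_tree_count(root))  # 4
print(may_answer_why(root))      # True
```

One structure here stands in for four separately labeled trees, and a query for passages that might answer a “Why…” question can still inspect the candidate labels directly.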
Much of the emphasis in the literature on constructing elaborate
taxonomies is motivated by issues to do with the computational generation of
natural language, in which text planners make ever finer decisions concerning the
organization of material, until they terminate in decisions about specific
grammatical encoding. In contrast to this, discriminating among a relatively small
number of discourse relations can be achieved by simply attempting to recognize
each discourse relation, or at least recognizing whether there is a symmetric or
asymmetric relation. This process can be carried out without reference to
elaborate taxonomies.
3.7 Schemas
RST relations are organized into schemas. Mann and Thompson
(1988:247) give the following five schemas:
Figure 7 RST schemas (Mann and Thompson 1988:247)
Schema (1) represents what I term the asymmetric relations. Schemas (2),
(3) and (5) correspond to what I term the symmetric relations; the CONTRAST
relation is conversive, whereas the JOINT and SEQUENCE relations are not. The
JOINT relation does not posit a contentful relationship between its daughter nodes,
and so lacks the arcs connecting those nodes. Finally, schema (4) is of a type not
found in Encarta 96. Fox (1987) describes another schema which she calls an
Issue, consisting of a nucleus and several satellites. It must be emphasized that
Fox is consistent with Mann and Thompson in viewing these schemas as a
structural classification of the classes of relations, rather than as a notation for
representing recurrent combinations of relations in discourse. Neither Fox nor
Mann and Thompson suggest (to give a hypothetical example) that a
CIRCUMSTANCE relation is more likely to hold between a simple text span and a
text span with the internal structure of a SEQUENCE than between a simple text
span and a text span which is the rightmost node in a SEQUENCE relation, i.e. that
example (1) in Figure 8 is more likely than example (2):
Figure 8 Alternative discourse structures
Sumita et al. (1992) propose restrictions on thinking flow. These are linear
sequences of relations which are held to be indicative of well-formed RST trees,
and which are used to constrain the number of RST trees which are constructed
for a text. The sequences of relations which they propose appear to be hand-
crafted, based on the intuitions of linguists. Although Sumita et al. claim
improvements in their system resulting from the application of these thinking
flow restrictions, it is not clear that the restrictions are empirically well motivated
(see section 4.2).
It ought to be possible to identify recurrent configurations of RST relations
for a given genre. For example, it may turn out to be the case that for
encyclopedia articles about historical battles, there is frequently a SEQUENCE
relation between several clauses, with a RESULT relation modifying the last of the
nuclei in the SEQUENCE. A set of configurations could be used to constrain the
discourse structures created by RASTA. Since the identification of such recurrent
configurations requires analyses of a great many texts, I suggest this as a
promising avenue for future research given a reliable automated means of
computing discourse representations (chapter 9). Since RASTA is able to reliably
compute representations of discourse structure without reference to such macro-
structures, the main benefit of schemas would be to improve the efficiency of
RASTA by constraining the range of structures that it might consider in arriving at
the preferred analysis.
3.8 Conclusion
For a limited domain, namely articles in Encarta, a set of thirteen
rhetorical relations suffices. For the task of constructing plausible representations
of discourse structure, it is not necessary to devise an elaborate taxonomy for
those relations. Rather, a simple distinction between symmetric and asymmetric
relations suffices.
The theoretically problematic issue of the occasional uncertainty
concerning the appropriate label to apply to a relation motivates an underspecified
representation. This underspecified representation is computationally attractive
because it allows a condensed representation of multiple trees having the same
structure but differing in labeling. The identification of recurrent configurations
for specific genres might lead to novel ways to constrain the search for a
preferred RST analysis for a text, but is not essential for RASTA to reliably
construct RST representations.
4. Previous Work on Computing Discourse Representations
4.1 Introduction
In the literature, there are few descriptions of algorithms for computing
discourse representations, and still fewer descriptions of implementations of such
algorithms. In the following sections I briefly review this work.
4.2 Rhetorical Structure Theory
4.2.1 Mann and Thompson (1986, 1988)
Mann and Thompson (1986, 1988) recognize that rhetorical relations are
often signaled by cue words and phrases, but emphasize that rhetorical relations
can still be discerned even in the absence of such cues. From this it follows that
for written text in general any attempt to construct a representation of discourse
solely on the basis of cue words and phrases is doomed to failure. Despite this
pessimistic prognosis, various researchers have attempted to model rhetorical
structure, sometimes solely on the basis of such superficial cues.
4.2.2 Sumita et al. (1992) and Ono et al. (1994)
Researchers at Toshiba Corporation (Sumita et al. 1992; Ono et al. 1994)
analyzed Japanese and English texts and attempted to construct representations of
discourse structure based on referential continuity (determined by simple lexical
repetition) and cue words and phrases. Their analysis appears to be based on
extremely simple pattern matching of strings, rather than a full syntactic analysis.
A flat structure is produced, representing the relations between adjacent
sentences. Hierarchical structure is then constructed over this flat representation
according to constraints on thinking flow, defined as plausible sequences of
relations. Sumita et al. give an example of one such thinking flow restriction:
“Consider the sequence [P <EG> Q <SR> R], where P, Q, R are
arbitrary (blocks of) sentences. The premise of R is obviously not
only Q but both P and Q. Since the argument in P is considered to
close locally, the two should be grouped into a block.” (Sumita et
al. 1992:1134, EG = exemplification, SR = serial connection)
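The restriction quoted above can be read as a rewrite over the flat representation: wherever an EG relation is immediately followed by an SR relation, the two blocks joined by EG are grouped before SR applies. The sketch below is one possible reading and not Sumita et al.'s implementation. The function name `group_eg_before_sr` and the encoding of the flat structure as a list that alternates blocks and relation labels are assumptions.

```python
from typing import List, Union

Block = Union[str, list]


def group_eg_before_sr(flat: List[Block]) -> List[Block]:
    """Rewrite [P, "EG", Q, "SR", R] as [[P, "EG", Q], "SR", R].

    `flat` alternates blocks (even positions) and relation labels (odd
    positions), as in the flat structure built over adjacent sentences.
    """
    out = list(flat)
    i = 1  # index of the relation label currently under inspection
    while i + 2 < len(out):
        if out[i] == "EG" and out[i + 2] == "SR":
            # Group the two blocks joined by EG into a single block.
            out[i - 1:i + 2] = [[out[i - 1], "EG", out[i + 1]]]
        else:
            i += 2
    return out


print(group_eg_before_sr(["P", "EG", "Q", "SR", "R"]))
# [['P', 'EG', 'Q'], 'SR', 'R']
```

Sequences that do not contain the EG then SR pattern are returned unchanged, so other restrictions would need their own rewrites.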
As noted in section 3.7, these thinking flow restrictions apparently result from the
intuitions of linguists, rather than being deduced empirically. In addition to these
thinking flow restrictions, a set of template strings is used to evaluate discourse
structures. An example of such a template string is “…? …? The reason is, …”
(Sumita et al. 1992:1135), a shorthand notation which is interpreted as a string of
characters ending in a question mark, followed by another string of characters
ending in a question mark, followed by the string “The reason is,”. Again, these
templates are apparently based on the intuitions of linguists, rather than being
deduced from empirical analysis of texts. Although Sumita et al. (1992) compare
the output of their system to human analyses, they do not describe the specific
contribution of the thinking flow restrictions and template strings in guiding their
system to plausible analyses of text structure.
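The template-string notation described above can be approximated with a regular expression. The sketch below is only an illustration and not Sumita et al.'s implementation: reading each “…” as a run of characters containing no question mark is an assumption, and the names TEMPLATE and matches_template are hypothetical.

```python
import re

# Hypothetical rendering of the template "…? …? The reason is, …":
# two stretches of text, each ending in a question mark, followed by
# the literal string "The reason is,". Treating "…" as "one or more
# characters other than '?'" is an assumption of this sketch.
TEMPLATE = re.compile(r"[^?]+\?\s*[^?]+\?\s*The reason is,")

def matches_template(text: str) -> bool:
    """Return True if the text contains the question, question, reason pattern."""
    return TEMPLATE.search(text) is not None
```

A text with only one question before “The reason is,” does not match, which shows how brittle such surface templates are.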
The data for the Toshiba research are newspaper articles and short
academic articles. Ono et al. (1994) observe that the academic articles contain
many more cue phrases than the newspaper articles, which enables them to more
accurately construct representations of discourse structure, and therefore to
construct qualitatively better summaries based on those discourse representations.
4.2.3 Kurohashi and Nagao (1994)
In a similar vein to the research at Toshiba, Kurohashi and Nagao (1994)
create discourse structures which (to judge by their illustrations) are similar to the
structures posited by RST. They create these structures by examining cue words,
topic-chains identified by lexical repetition, and metrics that measure the
similarity of two sentences (apparently determined by word repetition, thesaural
relations and patterns of sequences of parts of speech). Kurohashi and Nagao base
their research on an odd data-set: translations into Japanese of articles in an
English language popular science magazine. They give few details of their
system, how they measure similarity, how they construct hierarchical
representations once they have identified cues to discourse structure, or even
exactly what those cues are.
4.2.4 Fukumoto and Tsujii (1994)
Fukumoto and Tsujii (1994) sketch a formalism for constraining the
selection of one of four interpersonal relations (BACKGROUND, ENABLEMENT,
EVIDENCE, MOTIVATION) according to the tense, aspect and modality of the
clauses between which a relation is being posited. Unfortunately, the formalism
involves subjective evaluation of such things as whether the outcome of a
situation is “good” or “bad” (Fukumoto and Tsujii 1994:1182). Fukumoto and
Tsujii’s examples of the application of this formalism to a text appear to be
based on a hand-analysis. Although some aspects of the identification of RST
relations are made explicit, the formalism does not appear to be sufficiently
explicit for a computational implementation. Finally, Fukumoto and Tsujii do
not present a general method for constructing RST representations for a text once
relations have been identified by employing their formalism.
4.2.5 Wu and Lytinen (1990)
Wu and Lytinen (1990) briefly describe the BUYER system, which deduces
coherence relations from a propositional representation of an advertisement.
Although details of their system are sketchy (their description of the control flow
of their system contains such steps as “Decide implicational or semantical
relations and coherence relations.”, Wu and Lytinen 1990:508), it does not appear
to contain explicit procedures for dealing with multi-nucleic relations, any way to
decide among alternative possible coherence relations, nor any way to evaluate
alternative trees that might be constructed.
4.2.6 Marcu (1996, 1997a)
Marcu (1996) provides a first-order formalization of RST trees, along with
an algorithm for constructing all the RST trees compatible with a set of
hypothesized rhetorical relations for a text. Marcu employs the notion of
nuclearity in developing his algorithm for constructing RST trees. As Marcu
observes, two adjacent text spans can be related by an RST relation if and only if
that relation holds between the nuclei of the two text spans; satellites of the text
spans do not enter into the determination of this relationship. RST trees can thus
be assembled from the bottom up by joining text spans whose nuclei have been
posited to be potentially in some rhetorical relationship. Given a set of rhetorical
relations that might hold between pairs of RST terminal nodes, Marcu’s algorithm
will produce all of the valid RST trees which are compatible with the relations
posited.
Marcu’s algorithm suffers from combinatorial explosion: as the number of
relations increases, the number of possible RST trees increases exponentially.
Marcu first produces all possible combinations of nodes according to the relations
posited and then filters ill-formed trees.
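The bottom-up assembly that nuclearity makes possible can be sketched as follows. This is a minimal illustration rather than Marcu's first-order formalization. It assumes that every relation is mononuclear (multinuclear relations such as SEQUENCE are ignored), that a hypothesized relation is a (name, nucleus, satellite) triple over terminal-node indices, and that two adjacent spans may be joined only when a relation has been hypothesized between their nuclei. The function name enumerate_trees is hypothetical. Unlike a generate-and-filter procedure, the sketch only ever builds spans whose nuclei are related.

```python
from functools import lru_cache

def enumerate_trees(n, relations):
    """Enumerate trees over terminal nodes 0..n-1 (mononuclear relations only).

    A leaf is its index; an internal node is (name, nucleus_subtree,
    satellite_subtree). `relations` holds (name, nucleus, satellite) triples.
    """
    relations = frozenset(relations)

    @lru_cache(maxsize=None)
    def build(i, j):
        # Returns (tree, nucleus_terminal) pairs for the span i..j.
        if i == j:
            return ((i, i),)
        found = []
        for k in range(i, j):  # split point between two adjacent spans
            for left, left_nuc in build(i, k):
                for right, right_nuc in build(k + 1, j):
                    for name, nuc, sat in sorted(relations):
                        if (nuc, sat) == (left_nuc, right_nuc):
                            found.append(((name, left, right), left_nuc))
                        elif (nuc, sat) == (right_nuc, left_nuc):
                            found.append(((name, right, left), right_nuc))
        return tuple(found)

    return [tree for tree, _ in build(0, n - 1)]
```

With three terminal nodes and the hypothesized relations ELABORATION(0, 1) and EVIDENCE(0, 2), the only tree that covers the text joins nodes 0 and 1 first and then attaches node 2 to the nucleus of that span.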
Marcu (1996) leaves two questions unanswered. First, on what basis might
a computational system posit the relations that the algorithm then uses as the basis
for constructing RST trees? Second, what criteria should a computational system
use for evaluating alternative well-formed RST trees in an effort to determine
which trees might be more plausible? Marcu (1997a) attempts to answer both of
these questions.
Marcu (1997a) identifies cue phrases that are compatible with various
rhetorical relations, distinguishing rhetorical uses of those phrases from sentential
uses. Marcu identifies these cue phrases in a text by means of a shallow analysis,
essentially pattern matching based on regular-expressions. In the process of
identifying the cue phrases, Marcu also identifies clause boundaries. On the basis
of the cue phrases, Marcu’s algorithm posits rhetorical relations between the
clauses identified. These rhetorical relations are then used to assemble RST
representations as per Marcu (1996). Finally, Marcu’s algorithm evaluates the
RST trees constructed according to a metric that favors trees that skew to the right
(see below, this section).
Although Marcu’s algorithm for constructing RST representations
represents a considerable advance, it is not without its problems. Some of these
problems, discussed below, result from an over-reliance on cue phrases and the
use of pattern matching to identify cue phrases and terminal nodes. These
problems are perhaps true of other similar methods described in the literature (in
particular, Ono et al. 1994 and Sumita et al. 1992), but Marcu gives the clearest
description of these techniques and provides an examination of their efficacy.
Marcu’s method for evaluating trees is perhaps not sufficiently genre-
independent, while the algorithm for constructing trees suffers from
combinatorial explosion—as the number of hypothesized relations increases, the
number of well-formed trees produced increases exponentially.
An over-reliance on cue phrases as evidence for discourse structure makes
it difficult to ensure that a computational discourse analyzer will be able to
construct a representation that completely covers the text (Mann and Thompson’s
criterion 1, section 3.3). As Mann and Thompson (1986, 1988) note (see section
4.2.1), rhetorical relations can be discerned even in the absence of cue phrases. If
most clauses contained cue phrases that could be used as evidence for discourse
structure, then a computational discourse analyzer might still be able to achieve
analyses of large fragments of a text by relying exclusively on cue phrase
identification. Redeker (1990) examines transcripts of oral retellings of films, and
finds that approximately 50% of all tensed clauses contain cue phrases that
function as discourse markers. Marcu (1997a:97) interprets this percentage as
“sufficiently large to enable the derivation of rich rhetorical structures for texts.”
Leaving aside the differences that might exist between oral and written texts, a
more pessimistic evaluation might lead to concern about the clauses that do not
contain cue phrases. Even in academic discourse, a genre where we might expect
a high density of cue phrases, it is clearly not the case that every clause contains a
cue phrase that explicitly indicates its discourse relation to other clauses. Marcu’s
algorithm contains no criteria for positing rhetorical relations in the absence of
cue phrases. (Of course, cue phrases, when present, are a compelling form of
evidence for identifying discourse structure. RASTA also identifies cue phrases,
but uses additional cues in the identification of discourse relations, as described in
chapter 6).
Identifying cue phrases by means of regular expressions yields a fairly
high degree of accuracy. Using the terminology of information retrieval, Marcu
measures recall (the proportion of things judged by humans to be cue phrases
that were also identified by his algorithm) at 80.8% for the 275 cue phrases
identified manually in a test corpus, and precision (the proportion of cue phrases
identified by his algorithm that a human actually judged to be cue phrases) at
89.5%. The value
of 89.5% for precision can be attributed to the incorrect identification of cue
phrases as having a discourse function when in fact they had a sentential function,
such as coordinating two noun phrases.
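For readers unfamiliar with the two measures, the following sketch shows how they are computed from raw counts. The function name and the counts used in the usage note are hypothetical; they are not Marcu's data.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) as fractions between 0 and 1.

    true_positives:  items found by the algorithm and confirmed by humans
    false_positives: items found by the algorithm but rejected by humans
    false_negatives: items identified by humans but missed by the algorithm
    """
    found_by_algorithm = true_positives + false_positives
    judged_by_humans = true_positives + false_negatives
    precision = true_positives / found_by_algorithm
    recall = true_positives / judged_by_humans
    return precision, recall
```

For example, with the invented counts of 9 correct identifications, 1 spurious identification and 3 misses, precision is 9/10 and recall is 9/12.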
Although pattern-matching is generally computationally inexpensive, it
has two problems. The first problem concerns the compositionality of cue
phrases. In Encarta, there are some sequences of words that function in certain
contexts as cue phrases equivalent to single lexical items, but in other contexts as
phrases whose internal structure is important. In the following example, the
phrase as long as ought to be treated as having internal syntactic structure.
…their observed light would have been traveling practically
as long as the age of the universe. (Quasar)
In contrast, Figure 9 illustrates the same sequence of words as long as acting as a
subordinating cue phrase. The MEG system, in the course of performing a
syntactic analysis, correctly distinguishes the compositional analysis above from
the analysis in Figure 9, in which the cue phrase acts as a single lexical item (see
section 6.7.5). A pattern matching approach like the one described by Marcu
would have difficulty dealing with such cases.
1. The premier and cabinet remain in power
2. as long as they have the support of a majority in the provincial
legislature.
Figure 9 Prince Edward Island
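The difficulty can be made concrete with a small sketch. The pattern below is a hypothetical stand-in for a cue-phrase pattern and is not taken from Marcu's system. It matches the word sequence as long as in both of the Encarta sentences quoted above, so string matching alone cannot tell the compositional use from the subordinating use.

```python
import re

# Hypothetical cue-phrase pattern: the literal word sequence "as long as".
CUE = re.compile(r"\bas long as\b", re.IGNORECASE)

# Compositional use: "as long as" has internal syntactic structure.
COMPOSITIONAL = ("their observed light would have been traveling practically "
                 "as long as the age of the universe.")

# Cue-phrase use: "as long as" behaves like a single subordinating conjunction.
SUBORDINATING = ("The premier and cabinet remain in power as long as they have "
                 "the support of a majority in the provincial legislature.")

def has_cue(text: str) -> bool:
    """Return True if the surface pattern occurs anywhere in the text."""
    return CUE.search(text) is not None
```

Both calls, has_cue(COMPOSITIONAL) and has_cue(SUBORDINATING), return True. Distinguishing the two uses requires the kind of syntactic information that a parser supplies.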
A second problem with an approach based on pattern matching concerns
the identification of terminal nodes for a discourse analysis. Many of the terminal
nodes in Marcu’s (1997a) diagrams, for example, are not clauses, and would not
be treated as terminal nodes in a conventional RST analysis. For example, for the
first sentence of an excerpt from Scientific American, Marcu’s (1997a:101)
procedure selects the following three terminal nodes (the second node is offset in
the original sentence by em dashes):
1. With its distant orbit—(2)—and slim atmospheric blanket,
2. 50 percent farther from the sun than the Earth
3. Mars experiences frigid weather conditions.
(Marcu 1997a:101)
Marcu’s nodes (1) and (2) would certainly not be selected as clauses with
“independent functional integrity” (Mann and Thompson 1988:248). Marcu does
not discuss the fact that some of the nodes identified by his algorithm differ in
kind from those conventionally identified in RST. Clearly, however, Marcu’s
regular-expression based approach to identifying clause boundaries is not without
its problems.
Marcu (1997a:99) claims an average recall for the clause identification
procedure of 81.3% (i.e. 81.3% of the clauses identified by humans were also
correctly identified by his procedure), noting that it was particularly difficult to
distinguish sentential versus non-sentential uses of the conjunction and. In
Marcu’s data, the missed discourse uses of and tend to correspond to SEQUENCE
and JOINT relations. Missing these uses therefore tends not to materially affect the
analysis, but rather to lead to an RST analysis of a coarser granularity. Marcu
gives the precision of the clause identification procedure as 90.3%, although it
is not clear whether he counts the unusual terminal nodes mentioned above as
adding to or subtracting from the precision.
Concerning the evaluation of alternative RST analyses, Marcu (1997a)
claims that right-branching structures ought to be preferred because they reflect
basic organizational properties of text. In fact, the success of this metric reflects
the genre of Marcu’s three test excerpts. Two of the test excerpts are from
magazines, which are widely known to have a concatenative structure, as Marcu
(1997a:100) himself observes. The third text is a brief narrative, whose right-
branching structure is perhaps a reflection of iconic principles of organization
(Haiman 1980). In a narrative, the linear order of foreground clauses matches the
temporal sequence of events (Labov 1972, Polanyi 1982). Narratives can thus be
said to “unfold” in a right-branching manner.
Finally, combinatorial explosion is still a significant problem in the more
complete algorithm described in Marcu (1997a). As the number of hypothesized
relations increases, the number of well-formed trees compatible with those
relations increases exponentially. The final output of Marcu’s algorithm is a list of trees
ranked according to his metric. In order to obtain these rankings, however,
Marcu’s algorithm might have to produce great numbers of dispreferred trees.
The production of these dispreferred trees is essentially wasted computation.
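A rough sense of how quickly the space of candidate trees grows can be had by counting binary tree shapes alone. The sketch below is not Marcu's count: it ignores relation labels and nuclearity and simply counts the binary bracketings over a sequence of terminal nodes, which are given by the Catalan numbers. The function name is hypothetical.

```python
from math import comb

def binary_tree_shapes(terminals: int) -> int:
    """Number of distinct binary bracketings over `terminals` adjacent nodes."""
    if terminals < 1:
        raise ValueError("need at least one terminal node")
    n = terminals - 1
    # Catalan number C(n) = (2n choose n) / (n + 1)
    return comb(2 * n, n) // (n + 1)
```

Four terminal nodes admit 5 shapes, ten admit 4,862, and twenty admit 1,767,263,190, before any choice of relation label is considered.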
4.3 PISA
Sanders and van Wijk (1996), whose research focuses on the mental
representation and processing of texts, side with those who believe that coherence
relations in discourse ought to be cognitively motivated (Sanders 1992; Sanders et
al. 1992, 1993; Knott and Dale 1995). In order to construct representations of
texts, Sanders and van Wijk want a theory of discourse relations that is
sufficiently explicit to allow a representation to be constructed directly from a
text, without extensive reference to real world knowledge or to the intuitions of
the analyst. They note that:
“Rhetorical Structure Theory approaches text structure in a rather
static way. An analysis always starts from an inspection of the
entire text. The analysis does not proceed in a fixed order; it can be
applied bottom-up (from relations between clauses to the level of
text), top-down (from text to clause level) or following both routes
(Mann et al., 1992)… Rhetorical Structure Theory lacks a
procedure.” (Sanders and van Wijk 1996:94)
It is clearly not the case that readers or hearers suspend their analysis of a
text until they have inspected it in its entirety. Therefore, Sanders and van Wijk
want a procedure which is able to incrementally analyze a text, integrating
successive utterances into an emerging representation of discourse structure. They
develop an algorithm which they call PISA, Procedure for Incremental Structure
Analysis.
Within PISA, the text is first parsed and tagged. Analysis then proceeds
one text segment at a time, asking the following four questions:
“1. What segment features underlie its connection to the text?
2. To which other segment does the segment connect?
3. What is the hierarchical position of this connection?
4. What is the relational meaning of this connection?”
(Sanders and van Wijk 1996:99)
Sanders and van Wijk claim that their algorithm is able to proceed on the
basis of superficial linguistic evidence, and that it does not require explicit
reference to real-world knowledge. We would therefore expect a computational
implementation of their procedure to be a straightforward affair. Although
Sanders and van Wijk have not implemented this procedure themselves, they
mention (Sanders and van Wijk 1996:endnote 3) the work of Els van der Pool,
who has implemented PISA in Common Lisp. Unfortunately, no details of this
implementation are given.
Sanders et al. (1992) propose a set of cognitive primitives that can be
combined to yield a taxonomy of discourse relations. Sanders et al. distinguish
four primitives for use as parameters in defining discourse relations: whether the
relationship between propositions is causal or additive (conjunctive); whether the
source of coherence is semantic or pragmatic; whether the order of segments is
“basic” or “non-basic”; whether the relation is positive (e.g. in English typically
signaled by such conjunctions as and or because) or negative (e.g. in English
typically signaled by such conjunctions as but or although). By varying each
parameter, Sanders et al. derive a set of twelve discourse relations, including
CAUSE-CONSEQUENCE, CLAIM-ARGUMENT and CONCESSION. I join with Knott
and Dale (1995) in doubting that discourse relations can or even ought to be
parameterized in such a neat manner. However, the system described in this
dissertation is in philosophical agreement with two of the guiding principles of
PISA: that a discourse processing module can be based on superficial linguistic
evidence and that it does not need to make explicit reference to real-world
knowledge.
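The parameterization can be sketched by enumerating the settings of the four primitives. Four binary parameters yield sixteen combinations, whereas the taxonomy has twelve relations. The sketch reaches twelve by assuming that the order of segments is distinctive only for causal relations. That assumption is mine and should be checked against Sanders et al. (1992), and the dictionary keys are hypothetical labels for the four primitives.

```python
from itertools import product

# Hypothetical labels for the four primitives described above.
PRIMITIVES = {
    "basic operation": ("causal", "additive"),
    "source of coherence": ("semantic", "pragmatic"),
    "order": ("basic", "non-basic"),
    "polarity": ("positive", "negative"),
}

# Every combination of the four binary parameters: 2 * 2 * 2 * 2 settings.
all_settings = list(product(*PRIMITIVES.values()))

def collapse_order(setting):
    """Assumption of this sketch: order is ignored for additive relations."""
    operation, source, order, polarity = setting
    return (operation, source, order if operation == "causal" else None, polarity)

distinct_settings = {collapse_order(s) for s in all_settings}
```

Under that assumption, len(all_settings) is 16 and len(distinct_settings) is 12: eight causal settings plus four additive ones.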
4.4 Hobbs (1979)
Hobbs (1979) outlines a model for inferring coherence relations on the
basis of predicate calculus-like representations of the propositional content of
utterances. The length of chains of inference required to process a text correlates
inversely with the coherence of a text, i.e. the more work needed to understand a
text, the less coherent it is.
Hobbs represents superset relations, common world knowledge, and
lexical decomposition by means of axioms, representing “those things a speaker
of English generally knows and can expect his listener to know” (Hobbs
1979:71). Unfortunately, Hobbs does not implement his model, and does not give
principled ways in which such axioms could be acquired and maintained, nor
ways in which linguistic form might constrain the reasoning process. The efficacy
of his proposals is therefore difficult to evaluate.
Predicate calculus-like representations of the propositional content of a
text are insufficient for an analysis within the framework of RST. As noted in
section 3.4, an RST analyst proposes a judgment of the plausibility of suggesting
that the writer intended a certain effect (Mann and Thompson 1988:245). Of
course, the writer may choose to represent the same propositional content in
different forms in order to achieve a desired effect, as the constructed examples in
Figure 10 illustrate. Figure 10(a) reflects the decision to report an event, he ate
dinner, with a secondary event, After John went home, merely serving to provide
a temporal setting for the main proposition. This decision is reflected in an
asymmetric relation, the CIRCUMSTANCE relation. Figure 10(b) reflects a decision
to report these two events in a narrative sequence, reflected in the use of the
symmetric relation SEQUENCE.
(a)
1. After John went home
2. he ate dinner.
(b)
1. John went home
2. and then he ate dinner.
Figure 10 Alternative formulations of the same propositional content
Although the text of the two examples presented in Figure 10 would
receive the same predicate-calculus representation, we do not want to give these
two mini-texts the same RST analysis. Clearly the predicate-calculus
representation is insufficient, and must be augmented by a consideration of the
form in which the author chose to express this content.
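The point can be restated as a small data-structure sketch. The predicate names and the MiniText record are hypothetical and are not Hobbs's notation; they merely show two analyses that share one set of propositions yet differ in the relation posited and in which clauses are nuclear.

```python
from typing import NamedTuple

class MiniText(NamedTuple):
    propositions: frozenset  # content shared by both formulations
    relation: str            # RST relation between the two clauses
    nuclei: tuple            # clauses treated as nuclear

# Hypothetical predicate-calculus-like content for both mini-texts.
CONTENT = frozenset({
    ("go_home", "John"),
    ("eat_dinner", "John"),
    ("precedes", "go_home", "eat_dinner"),
})

# (a) asymmetric: only the main clause is nuclear.
text_a = MiniText(CONTENT, "CIRCUMSTANCE", ("he ate dinner",))
# (b) symmetric: both clauses are nuclear.
text_b = MiniText(CONTENT, "SEQUENCE",
                  ("John went home", "and then he ate dinner"))

def same_content_different_structure(x: MiniText, y: MiniText) -> bool:
    """True when two texts share content but receive different analyses."""
    return (x.propositions == y.propositions
            and (x.relation, x.nuclei) != (y.relation, y.nuclei))
```

Here same_content_different_structure(text_a, text_b) is True, which is exactly the distinction a purely propositional representation cannot draw.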
4.5 The Linguistic Discourse Model (LDM)
Polanyi (1988) proposes a procedure for constructing representations of
discourse structure in a bottom-up, left-to-right fashion. Within the emerging
discourse representation, some nodes are open (i.e. possible attachment points,
available for expansion), while others are closed. A new discourse segment is
compared to each of the available attachment points, and a “semantic
congruence” is computed, to decide to which node the new discourse segment
ought to be attached.
Within the Linguistic Discourse Model there are macro-discourse
structures such as jokes, plans, lists and casual conversations. Each of these has
a formal description of its constituent structure and interpretations. These
macro structures organize the information into a speech event. Speech events in
turn are organized into Interactions. The Linguistic Discourse Model, with its
different types of organization at different levels, is thus unlike RST, which has
a single kind of organization for all relations from the clause to the entire
discourse.
Integrating new discourse segments into the emerging discourse
representation involves decisions about whether to subordinate or coordinate
the current discourse segment to an attachment point. These decisions are based
on real world knowledge and inferential processes, the nature and extent of
which are not specified. This unconstrained appeal to real-world knowledge
and inferential processes is a serious impediment to a computational
implementation of the Linguistic Discourse Model. The LDM does not appear
to have been implemented within a computational system.
4.6 Conclusion
From the research described in the preceding sections, three strands
emerge. The first strand (Knott and Dale 1995; Kurohashi and Nagao 1994;
Marcu 1997a; Ono et al. 1994; Sanders 1992; Sanders et al. 1992, 1993;
Sanders and van Wijk 1996; Sumita et al. 1992) concerns the identification of
discourse relations by fairly superficial means—typically simple pattern
matching to identify cue phrases.
The second strand (Fukumoto and Tsujii 1994; Hobbs 1979),
diametrically opposed to the first strand, eschews any examination of the form
of a text in favor of more abstract representations, even augmenting linguistic
representations with axiomatic representations of world knowledge.
The third strand concerns programmatic descriptions of how
computational discourse analysis might proceed (Polanyi 1988; Wu and
Lytinen 1990). The broad strokes of the design of a computational discourse
analyzer are described, but no specific details are given for such essential steps
as the actual identification of discourse relations.
RASTA is most closely aligned with the first of these strands, since it
hypothesizes discourse relations on the basis of the form of a text, without
reference to additional modeling of world knowledge. Unlike previous work
within this strand, RASTA goes beyond identifying cue phrases by means of
simple pattern matching and considers other evidence from a linguistic analysis
of a text, such as tense, aspect, polarity and referential continuity of noun
phrases.
The cues identified by RASTA are discussed at length in chapter 6. The
identification of those cues is dependent on an in-depth linguistic analysis of a
text. RASTA relies on various components of the Microsoft English Grammar
for this linguistic analysis. The Microsoft English Grammar, a mature rule-
based parser with a broad coverage of the English language, and with
considerable resources available to a discourse processing module, is briefly
described in chapter 5.
5. The Microsoft English Grammar
5.1 Introduction
A brief discussion of the Microsoft English Grammar (MEG) is
necessary to provide sufficient context for understanding the role of RASTA
within a computational linguistic system and to demonstrate that there is
sufficient scaffolding to support the task of identifying discourse structure. The
work of the author concerns the discourse component (the focus of this
dissertation), and two facets of the “logical form”: anaphora resolution and
some aspects of the handling of ellipsis. All other aspects of the system
described here are the work of the other members of the Natural Language
Processing Group at Microsoft Research. More complete descriptions of
aspects of the MEG system can be found in Dolan et al. (1993), Pentheroudakis
and Vanderwende (1993), Richardson et al. (1993), Dolan (1995),
Vanderwende (1995a, 1995b) and Richardson (1997). The philosophically
similar PEG system is described in various papers in Jensen et al. (1993).
MEG is a research environment for computational linguistics that runs
under the Microsoft Windows 95 and Windows NT operating systems on
conventional personal computers. The MEG system itself is written in a
combination of the C programming language and a proprietary programming
language called G.³ Systems-level functions are written in C, while the
grammar and portions of the run-time system are written in G. The G
programming language is conceptually an amalgam of C (from which it
particularly derives its syntax and control structures) and Lisp (from which it
particularly derives the notion of the list as a basic data-type), and provides a
formalism to enable linguists to express linguistic rules.
MEG contains a broad-coverage domain-independent grammar of
English capable of processing sentences in a fraction of a second on a
conventional Pentium-based personal computer. Work is currently in progress
at Microsoft Research to develop systems comparable to MEG for the analysis
of Chinese, French, German, Japanese, Korean and Spanish.
The MEG system has a serial architecture, with components
corresponding to the lexicon, syntax, “logical form” and discourse. The
following sections describe these various components.
3 Exactly what G stands for is the subject of speculation, puns, and general confusion.
5.2 Lexicon
The lexical component tokenizes the input string, identifying word
boundaries, separating out clitics, identifying multi-word expressions such as in
order to, and analyzing factoids (minor phrasal constituents such as proper
names or numbers written out in full). The lexical component also contains a
finite-state processor that analyzes or generates derivational and inflectional
morphology (Pentheroudakis and Vanderwende 1993), attaching syntactic and
semantic features to words by a combination of rule-based analysis and
dictionary lookup.
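One of these steps, the grouping of multi-word expressions into single tokens, can be sketched as follows. This is an illustration only and not the MEG implementation: it assumes whitespace tokenization, a hypothetical one-entry table of multi-word expressions, and greedy longest-match lookup. Clitics, factoids and morphology are not handled.

```python
# Hypothetical table of multi-word expressions, stored as lowercase tuples.
MULTIWORD = {("in", "order", "to")}

def tokenize(text, multiword=MULTIWORD):
    """Split on whitespace, merging known multi-word expressions into one token."""
    words = text.split()
    longest = max((len(m) for m in multiword), default=1)
    tokens, i = [], 0
    while i < len(words):
        # Try the longest candidate first, down to two-word expressions.
        for size in range(min(longest, len(words) - i), 1, -1):
            candidate = tuple(w.lower() for w in words[i:i + size])
            if candidate in multiword:
                tokens.append(" ".join(words[i:i + size]))
                i += size
                break
        else:
            # No multi-word expression starts here: emit the single word.
            tokens.append(words[i])
            i += 1
    return tokens
```

For the input "She left early in order to catch the bus", the sketch returns seven tokens, with in order to as a single item.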
5.3 Syntax
A syntactic analysis follows the lexicon component. The syntactic
analysis consists of two phases, referred to as sketch and portrait.
5.3.1 Sketch
During the sketch phase, constituents are assembled in a bottom-up
fashion to form a syntactic parse with a fairly conventional constituent
structure. The grammatical rules that perform the parsing use only the
information provided by the lexical component. During the sketch phase, rules
only have access to very local structure, and are frequently unable to resolve
syntactic dependencies, such as prepositional phrase attachment. The sketch
therefore defaults to a right attachment for ambiguous syntactic dependencies,
and notes other possible attachment points, rather than producing a multitude of
trees. The output of the sketch component is thus a “packed” parse that
indicates uncertain syntactic dependencies, leaving them for subsequent
components to resolve.
Figure 11 illustrates the sketch produced for the sentence I ate a fish
with a fork, a hoary chestnut of computational linguistics (Jensen and Binot
1987:252⁴). Heads of constituents are indicated with asterisks. In Figure 11, the
prepositional phrase with a fork has defaulted to a right attachment, subordinate
to the NP. The ?1 notation indicates another possible attachment point for the
prepositional phrase as a sister of the NP.
Figure 11 Syntactic sketch produced by MEG
4 The analysis produced by MEG differs from that of the philosophically similar PEG
system described by Jensen and Binot (1987) only in the label of the final period–in the PEG
system this character would have been labelled PUNC (i.e. punctuation) rather than CHAR (i.e.
character).
Figure 11 is only a visualization of a rich underlying data structure. This
data structure consists of a list of attributes and their values, where those
attributes may have complex structures as their values. Figure 12 illustrates
some of the attributes and their values in the data structure used to represent the
root node of the tree given in Figure 11.⁵ From this data structure, it is clear
that the example has been parsed as a declarative sentence (the Segtype
attribute has the value SENT and the Nodetype attribute has the value DECL),
with the subject I and the object a fish with a fork.⁶ Both the subject and object
are themselves complex data structures. The lexical component analyzed the
main verb ate as a morphological variant of the base form eat, and in the
process provided information like the morphological feature Past present in the
attribute called Bits. This sentence has as its head the verb ate, with material
that precedes the head (Prmods) and material that follows the head (Psmods).
As Jensen (1993:31) notes concerning the similar analyses produced by the PEG
system: “PEG’s trees, with their heads and modifiers, have the flavor of a
dependency grammar.”
5 Some attributes have been omitted simply for the sake of brevity.
6 This snapshot of the data structure was taken before MEG had resolved the
prepositional phrase attachment of with a fork. Subsequent processing determines that the
object is a fish.
Segtype    SENT
Nodetype   DECL
Nodename   DECL1
Ft-Lt      0-8
String     " I ate a fish with a fork ."
Rules      (Sent VPwNPl VPwNPr1 VERBtoVP)
Constits   (BEGIN1 VP1 CHAR1)
Lex        "ate"
Lemma      "eat"
Bits       Pers1 Sing Past Closed L9 X9 I0 T1 Loc_sr
Prob       0.25645
Prmods     NP1 "I"
Head       VERB1 "ate"
Psmods     NP2 "a fish with a fork"
           CHAR1 "."
Subject    NP1 "I"
FrstV      VERB1 "ate"
Object     NP2 "a fish with a fork"
Predicat   VP2 "ate a fish with a fork"
Topic      NP1 "I"
Figure 12 Underlying data structure for the sketch
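The attribute-value record can be pictured as a nested mapping. The sketch below is a Python approximation and not the representation used by G: it copies a few attribute names and values from Figure 12, models sub-records as small dictionaries, and defines a hypothetical helper, is_past_declarative, to show how a later component might query such a record.

```python
# A few attributes from Figure 12, modeled as a nested dictionary.
# Sub-records are reduced to a node name and the string they cover.
NP1 = {"Nodename": "NP1", "String": "I"}
NP2 = {"Nodename": "NP2", "String": "a fish with a fork"}
VERB1 = {"Nodename": "VERB1", "String": "ate"}
CHAR1 = {"Nodename": "CHAR1", "String": "."}

record = {
    "Segtype": "SENT",
    "Nodetype": "DECL",
    "Nodename": "DECL1",
    "Lex": "ate",
    "Lemma": "eat",
    "Bits": {"Pers1", "Sing", "Past"},  # subset of the features shown
    "Prmods": [NP1],          # material preceding the head
    "Head": VERB1,
    "Psmods": [NP2, CHAR1],   # material following the head
    "Subject": NP1,
    "Object": NP2,
}

def is_past_declarative(rec) -> bool:
    """Hypothetical query: a declarative node whose head carries Past."""
    return rec["Nodetype"] == "DECL" and "Past" in rec["Bits"]
```

Because attribute values may themselves be records, the subject is reachable both as record["Subject"] and as the first premodifier, mirroring the head-and-modifier flavor noted above.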
For strings of words for which MEG cannot construct a plausible
syntactic analysis, it defaults to a fitted parse, i.e. the grammar assembles the
possible constituents into a simple branching structure. For the text of Encarta,
fitted parses are extremely uncommon.
5.3.2 Portrait
The portrait phase of the syntactic component of MEG refines the
syntactic analysis produced during the sketch phase by resolving ambiguous
syntactic dependencies using two strategies: syntactic reattachment and
semantic reattachment.
During syntactic reattachment, MEG performs a top-down traversal of
the syntactic tree, inspecting structural configurations to resolve syntactic
dependencies. As noted in section 5.3.1, the sketch phase operates in a bottom-
up fashion, and therefore has access to only limited context. Syntactic
reattachment, proceeding in a top-down fashion, has access to a much wider
context.
During semantic reattachment, MEG consults a semantic network,
MINDNET (section 5.7), to determine which of several possible syntactic
dependencies is semantically the most likely (Jensen and Binot 1987, Dolan et
al. 1993, Vanderwende 1995b). In the example I ate a fish with a fork,
semantic reattachment considers the various senses of the preposition with.
MEG looks in MINDNET to ascertain whether any relationships exist between eat
and fork or between fish and fork that are compatible with any of the senses of
the preposition. For this example, MEG finds a sense of fork compatible with
the instrument reading of the preposition with: a fork is a utensil used for
eating. MEG therefore resolves the syntactic dependency as illustrated in Figure
13, making the prepositional phrase a sister of the noun phrase. In the process
of resolving this attachment, MEG has implicitly performed word sense
disambiguation for three words: with, eat and fork. During the construction of
the logical form (section 5.4), MEG will use the sense information for the
preposition with to label a relationship as INSTR.
Figure 13 Syntactic portrait produced by MEG
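The logic of semantic reattachment can be sketched with a toy lookup table. Nothing here reflects the actual structure or contents of MINDNET: the triples in TOY_NET, the sense label PART_OF, and the function name are hypothetical, and only the INSTR label comes from the text. The sketch tries each sense of with against the verb and then the noun, and keeps the default attachment to the noun phrase if nothing is found.

```python
# Toy stand-in for a semantic network: (word, relation, word) triples.
TOY_NET = {
    ("fork", "INSTR", "eat"),     # a fork is a utensil used for eating
    ("bone", "PART_OF", "fish"),  # hypothetical second entry
}

# Hypothetical inventory of senses for the preposition "with".
WITH_SENSES = ("INSTR", "PART_OF")

def attach_with_phrase(verb, noun, prep_object, net=TOY_NET):
    """Return (attachment_site, sense) for a 'with' phrase.

    'VP' means the phrase modifies the verb (a sister of the object NP);
    'NP' means it stays inside the object NP, the default attachment.
    """
    for sense in WITH_SENSES:
        if (prep_object, sense, verb) in net:
            return ("VP", sense)
        if (prep_object, sense, noun) in net:
            return ("NP", sense)
    return ("NP", None)  # nothing found: keep the default attachment
```

For eat, fish and fork the sketch returns ("VP", "INSTR"), corresponding to the reattachment described above. An unknown noun leaves the default attachment in place.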
5.4 Logical Form
The logical form component analyzes the syntactic portrait to produce a
graph structure. The logical form represents a normalized view of the predicate
structure of a text, with marked syntactic alternants being noted. For example,
active and passive structures receive the same structural representation in the
logical form, but the logical form derived from the portrait of a passive
sentence is annotated with a feature PASS. Figure 14 illustrates the logical form
derived from the syntactic portrait in Figure 13.
Figure 14 Logical form produced by MEG
Function words such as indefinite articles do not occur in the logical
form, but are instead represented by annotations on nodes. In Figure 14, the
indefinite article a in a fish is represented by the feature +Indef.
As Figure 14 shows, the logical form consists of nodes in labelled
relationships. The label INSTR results from the disambiguation of the
preposition with (section 5.3.2). A subset of the labels used in the logical form
is given in Figure 15. (The labels Dsub and Dobj represent a historical legacy
rather than a present-day commitment to earlier models of Transformational
Grammar. As the logical form matures, our intention is to replace labels like
Dsub with more semantic descriptions like Agent or Experiencer.)
In addition to the labels given in Figure 15, individual prepositions may
appear in the logical form as labels if those prepositions were not
disambiguated during the portrait phase of syntactic analysis. In Figure 16, for
example, the preposition in occurs as the label of a relation in the logical form.
Label Interpretation
Dsub “Deep subject”. (a) The subject of an active clause. (b) The agent of a passive or unaccusative construction.
Dobj “Deep object”. (a) The object of an active clause. (b) The subject of an unaccusative construction.
TmeAt A temporal relation. This same label is used for points in time as well as durations.
Instr Instrument.
Manr Manner.
LocAt Location.
Goal A spatial goal.
Figure 15 Labels used in the logical form
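The normalization of active and passive clauses can be illustrated with a small Python sketch. The function `to_logical_form` and its dictionary output are hypothetical simplifications. In particular, the treatment of the surface subject of a passive as the Dobj is an assumption made for this example rather than a description of MEG.

```python
def to_logical_form(verb, subject, obj=None, by_agent=None, passive=False):
    """Map the surface roles of a simple clause onto Dsub/Dobj relations."""
    if passive:
        # Assumption: the surface subject of a passive is the deep object,
        # and the by-phrase, if present, supplies the deep subject.
        return {"Pred": verb, "Dsub": by_agent, "Dobj": subject,
                "features": {"PASS"}}
    return {"Pred": verb, "Dsub": subject, "Dobj": obj, "features": set()}


active = to_logical_form("eat", subject="John", obj="fish")
passive = to_logical_form("eat", subject="fish", by_agent="John", passive=True)

# Same predicate-argument structure; only the PASS annotation differs.
same_structure = all(active[k] == passive[k] for k in ("Pred", "Dsub", "Dobj"))
print(same_structure, passive["features"])  # True {'PASS'}
```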
During the construction of the logical form, MEG resolves anaphoric
references and ellipsis, using heuristics that examine features assigned by the
lexicon component and by examining structural configurations in the syntactic
portrait. Figure 16 illustrates the resolution of the reflexive pronoun himself,
which MEG has identified as coreferential with the subject John.
Figure 16 Resolution of reflexive pronoun
In Figure 17, MEG has resolved the coreferential relationship between
John and the pronoun he. Of course, this example has an interpretation in which
he is not coreferential with John. MEG indicates this alternative possibility by
annotating the pronoun he with the feature +FindRef, an instruction to
subsequent stages of processing to consider possible coreferents outside of this
sentence.
Figure 17 Resolution of personal pronoun
As was the case with syntax trees (section 5.3.1), these illustrations of
logical forms are merely visualizations of a rich underlying data structure.
Figure 18 illustrates the data structure underlying the node drive1 in Figure 17.
The SynNode attribute contains a link back to the corresponding syntactic
constituent in the portrait tree. Since the logical form and the portrait tree are
linked in this fashion, RASTA is able to examine either the abstract logical form,
which has something of the flavor of a predicate calculus representation, or the
syntactic analysis.
Nodename    drive1
Rules       LF_PrpCnjs LF_TmeAt LF_Dsub1 SynToSem1
Constits    drive1 drive1 drive1 DECL1
Bits        L9 Mov
SynNode     " After John left work, he drove to the store."
Pred        drive
Dsub        John1
TmeAt       leave1
PrpCnjs     store1
PrpCnjLem   after
Figure 18 Data structure underlying the node drive1
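The link between a logical form node and its syntactic constituent might be modelled as in the following Python sketch. The classes `SynNode` and `LFNode` are hypothetical and greatly simplified. Only the attribute names Pred, Dsub, TmeAt, PrpCnjs and SynNode are taken from Figure 18, and the choice of DECL1 as the linked constituent is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SynNode:
    """A constituent in the syntactic portrait (greatly simplified)."""
    label: str
    text: str


@dataclass
class LFNode:
    """A logical form node with labelled relations and a link to syntax."""
    nodename: str
    pred: str
    relations: Dict[str, str] = field(default_factory=dict)
    syn_node: Optional[SynNode] = None


decl1 = SynNode("DECL1", "After John left work, he drove to the store.")
drive1 = LFNode(
    nodename="drive1",
    pred="drive",
    relations={"Dsub": "John1", "TmeAt": "leave1", "PrpCnjs": "store1"},
    syn_node=decl1,
)

# From the abstract node, follow the link back to the syntactic analysis.
print(drive1.relations["Dsub"], "|", drive1.syn_node.label)  # John1 | DECL1
```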
5.5 Word Sense Disambiguation
The component that follows the logical form performs word sense
disambiguation. This component examines the syntactic analysis and consults
MINDNET to identify the most likely senses of words in the logical form. The
end result of this analysis is a logical form in which nodes are annotated with
sense information.
The word sense disambiguation component applies optionally. For the
discourse research conducted to date, this component has not been applied.
5.6 Discourse
Finally, the discourse module, RASTA, attempts to identify rhetorical
relations based on an examination of the syntactic portrait and the logical form.
Having identified those rhetorical relations, RASTA then constructs
representations of discourse structure. Since the operation of RASTA is the topic
of this dissertation, it will not be described further here.
5.7 MINDNET
MINDNET is a large semantic network, which has been constructed
automatically (Dolan et al. 1993; Richardson et al. 1993; Dolan 1995;
Vanderwende 1995a, 1995b; Richardson 1997) by extracting semantic
information from two dictionaries: Longman Dictionary of Contemporary
English (Proctor 1978) and The American Heritage Dictionary (Houghton
Mifflin 1992). Currently, MINDNET consists of 120,000 head words, and
approximately seven million labeled arcs connecting lexical senses of those
head words.
MINDNET is not strictly speaking a component within the serial
architecture described in section 5.1. Rather, MINDNET is a resource consulted
by two components of MEG, the portrait phase of the syntactic component and
word sense disambiguation. RASTA does not explicitly consult MINDNET during
discourse processing. See section 8.3.
5.8 Conclusion
MEG provides a broad-coverage grammar of English that yields an in-
depth analysis of the syntactic structure of a text and a representation of aspects
of its propositional structure and semantics. MEG is thus an excellent
framework within which to conduct research on methods for automatically
constructing representations of the structure of written discourse. The following
chapters describe in detail exactly how RASTA operates within this framework.
6. Cues to Discourse Structure
6.1 Introduction
RASTA successfully identifies rhetorical relations by considering
evidence from a linguistic analysis of the text. Section 6.7 lists the cues used to
identify each relation, with illustrative examples. Lest the reader become mired
in the detail of individual relations, several larger issues raised by this list of
cues are discussed first. The clausal status of terminal nodes—whether they are
in a hypotactic or paratactic relationship—is a useful criterion for making the
coarse determination of whether an asymmetric or symmetric relation is most
likely (section 6.2). Similarly, anaphora and deixis (section 6.3) play a crucial
role in making this determination.
Finally, the enumeration of the cues used to identify discourse relations
is preceded by discussion of the architecture of the identification process—the
manner in which heuristic scores are associated with cues (section 6.4) and the
distinction between the necessary criteria and cues (section 6.5)—and a
consideration (section 6.6) of the dependence of the work presented here on the
particular set of thirteen relations that RASTA employs.
In identifying cues to discourse structure, it is important to emphasize
that I am not proposing an exhaustive list of all the linguistic correlates of each
of the rhetorical relations. Rather, the cues given below, comprising a relatively
small set of approximately fifty members, are ones that have proven to be
sufficient for distinguishing among the thirteen rhetorical relations employed in
this study.
6.2 Correlations between clausal status and rhetorical status
Following the Hallidayan tradition, Matthiessen and Thompson (1988)
distinguish between clause embedding (covering restrictive relative clauses, as
well as subject and object complements) and clause combining. Within clause
combining, they distinguish parataxis (including coordination, apposition and
quoting) and hypotaxis (including non-restrictive relative clauses, reported
speech, and other subordination of one clause to another). Observing the strong
analogue between the rhetorical organization of texts and the grammatical
organization of clauses, Matthiessen and Thompson propose that hypotactic
clause combining represents the grammaticization of asymmetric RST relations,
with the matrix clause corresponding to the nucleus of the RST relation and the
subordinate clause corresponding to the satellite. This proposal motivates the
most important discriminator of rhetorical relations employed by RASTA.
Hypotactic clause combining7, identified by the syntactic analysis performed by
MEG (section 5.3), always suggests an asymmetric RST relation in which the
matrix clause is posited to be the nucleus and the subordinate clause to be the
satellite. In cases that do not involve hypotactic combinations, for example in
considering the relationship between the main clauses of two sentences, either a
symmetric or an asymmetric rhetorical relationship may hold.
In rare cases, this correlation between clausal status and rhetorical status
is the only clue to discourse structure that RASTA is able to identify, e.g. having
correctly identified a hypotactic relationship, RASTA is unable to identify a
specific corresponding asymmetric rhetorical relation. In such cases, RASTA
proposes an asymmetric relationship which it then labels with a question mark,
as illustrated in Figure 19. Clause2 is clearly a satellite of Clause1. However, it
is not quite clear exactly what RST relation holds. The PURPOSE or RESULT
relations are weak candidates, but certainly not inviting enough to warrant a
commitment to either.
7 Non-restrictive relative clauses, one kind of hypotactic clause combining, are not
treated as base level textual units by RASTA (section 3.2).
1. The legs have powerful claws,
2. adapting the animal for rapid digging into hard ground.
Figure 19 Echidna
6.3 The role of anaphora, deixis and referential continuity
Anaphoric references and deixis, two strongly cohesive devices
(Halliday and Hasan 1976), are frequently examined by RASTA during the
identification of discourse relations. Often, it is sufficient to identify the form
of a referring expression. Pronouns and demonstratives, for example, are
frequently positively correlated with the satellite of an asymmetric relation (see
for example criterion 4, Figure 30; criterion 4, Figure 52; cue H6, Figure 53;
inter alia), especially when they occur as syntactic subjects or as modifiers of
subjects, and negatively correlated with the co-nucleus of a symmetric relation
(see for example criterion 6 for the JOINT relation, Figure 62). In other cases,
the form of a referring expression is insufficient, and RASTA must consider
referential continuity. The MEG system resolves pronominal anaphoric
references during the construction of the logical form. Although MEG is
sometimes able to identify a single antecedent for a pronoun, it often proposes a
list of plausible antecedents. In determining subject continuity, the most
important kind of referential continuity for identifying discourse relations,
RASTA considers whether the subject of one clause is one of the possible
antecedents of the subject of another clause. For a pronominal subject, RASTA
examines the list of proposed antecedents. For a subject modified by a
possessive pronoun, RASTA considers the proposed antecedents of the
possessive pronoun. For lexical subjects, RASTA considers simply whether the
head of the subject noun phrase of one clause is identical to the head of the
subject noun phrase of the other clause.8
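The three cases of subject continuity described above can be sketched as follows. The `Subject` record and the function `subject_continuity` are hypothetical; the sketch assumes that base forms and lists of candidate antecedents have already been supplied by earlier stages of analysis, and it tests continuity in one direction only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subject:
    head: str   # base form of the head of the subject noun phrase
    kind: str   # "pronoun", "possessive" (possessive-modified), or "lexical"
    # Candidate antecedents of the pronoun or of the possessive pronoun.
    antecedents: List[str] = field(default_factory=list)


def subject_continuity(subject: Subject, other: Subject) -> bool:
    """True if `subject` plausibly continues the subject `other`."""
    if subject.kind in ("pronoun", "possessive"):
        # Pronominal cases: consult the list of proposed antecedents.
        return other.head in subject.antecedents
    # Lexical case: identity of the heads of the two subject noun phrases.
    return subject.head == other.head


echidna = Subject(head="echidna", kind="lexical")
it = Subject(head="it", kind="pronoun", antecedents=["echidna", "animal"])
print(subject_continuity(it, echidna))  # True
```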
8 MEG does not currently perform anaphora resolution for lexical noun phrases,
although we intend to develop a module for performing resolution for such noun phrases. Any
anaphoric resolution of lexical noun phrases performed by MEG would of course then become
available to RASTA for consideration. In Encarta, identity of the heads of subjects is a
remarkably effective method for determining subject continuity for two lexical noun phrases.
6.4 Heuristic scores
Intuitively, some cues to discourse structure are more compelling than
others. To reflect this intuition, RASTA assigns numerical heuristic scores (with
values ranging from five to 35) to each cue. The heuristic score for a
hypothesized discourse relation is equal to the sum of the heuristic scores of
each of the pieces of evidence that lead to positing that relation. In practice,
positing a discourse relation relies on observing the convergence of a number
of cues (compare to Litman and Passonneau 1995, who identify segment
boundaries in spoken discourse by observing multiple simultaneous signals).
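The summation of cue scores can be made concrete with a minimal sketch. The scores in `CUE_SCORES` are those given for cues H17, H20, H29a and H29b in section 6.7; the function `relation_score` and the scenario in which two cues fire together are assumptions made for illustration.

```python
# Heuristic scores for four of the cues presented in section 6.7.
CUE_SCORES = {"H17": 25, "H20": 30, "H29a": 10, "H29b": 10}


def relation_score(observed_cues):
    """Sum the heuristic scores of the cues that support one hypothesis."""
    return sum(CUE_SCORES[name] for name in observed_cues)


# If both H29a and H29b were observed for the same pair of clauses,
# the hypothesized relation would receive the sum of their scores.
print(relation_score(["H29a", "H29b"]))  # 20
```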
Although I do not wish to claim that the cues and heuristics that prove
useful for computing representations of discourse structure are also
psychologically valid, to date the heuristic scores which have proven useful in
computing discourse representations have been in accord with my intuitions as
a linguist. For example, explicit cue words and phrases provide strong support
for hypothesizing a particular discourse relation, but syntactic structural cues
provide weaker evidence.
The fact that the heuristic scores accord well with linguistic intuitions is
theoretically satisfying. However, the scores are primarily motivated by
considerations of computational efficiency during the construction of RST trees
(chapter 7). Having posited discourse relations between terminal nodes, RASTA
proceeds to construct RST trees in a bottom-up manner (section 7.5). RASTA
applies the hypothesized relations with the highest heuristic scores first, thereby
converging on the best analysis of a text first. Less plausible analyses can be
produced by allowing RASTA to apply the hypothesized relations with lower
heuristic scores.
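The ordering of hypothesized relations by heuristic score amounts to a descending sort, as in the sketch below. The relation names are drawn from this study, but the scores attached to them here are invented for the example, and `by_descending_score` is a hypothetical helper rather than part of RASTA.

```python
def by_descending_score(hypotheses):
    """Order (relation, score) hypotheses so the best-supported come first."""
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)


# Invented scores, for illustration only.
hypotheses = [("ELABORATION", 15), ("CAUSE", 25), ("RESULT", 10)]

for relation, score in by_descending_score(hypotheses):
    print(relation, score)
# CAUSE 25
# ELABORATION 15
# RESULT 10
```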
In section 7.5.5, I measure the effectiveness of the set of heuristic scores
given here and discuss the potential role of machine learning algorithms in
determining an optimal set of values.
6.5 Necessary criteria and cues
The process of hypothesizing discourse relations involves tension
between two competing concerns. On the one hand, it is desirable to postulate
all possible discourse relations that might hold between two terminal nodes, in
order to ensure that the preferred RST analysis is always in the set of analyses
produced by RASTA. On the other hand, considerations of computational
efficiency lead us to desire a small set of relations, since as the number of
possible discourse relations increases, the number of possible discourse trees to
be considered increases exponentially; the smaller the set of hypothesized
relations, the more quickly the algorithm for constructing RST trees (section
7.5) can test all possibilities.
RASTA resolves this tension by distinguishing two kinds of evidence.
The first kind of evidence is the set of necessary criteria—the conditions that
simply must be met before RASTA is even willing to “consider” a given
discourse relation. The second kind of evidence is the set of cues that are only
applied if the necessary criteria are satisfied. Coordination by means of the
conjunction and, for example, correlates with the SEQUENCE relation
(Figure 87, section 6.7.13), but only weakly. If we were to posit a SEQUENCE
relation every time we observed the conjunction and, we would posit a great
many spurious relations. However, RASTA only tests this cue if an extensive set
of necessary criteria for the SEQUENCE relation have been satisfied (Figure 82,
section 6.7.13).
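The two kinds of evidence suggest a two-stage test, sketched below. Everything in this sketch is hypothetical: the function `hypothesize`, the representation of clauses as dictionaries, the single toy criterion, and the cue name `H_and` with its score of 5. Returning no hypothesis when the criteria hold but no cue fires is likewise an assumption.

```python
def hypothesize(relation, criteria, cues, clause1, clause2):
    """Test cues only if every necessary criterion is satisfied."""
    if not all(test(clause1, clause2) for test in criteria):
        return None  # the relation is not even considered
    fired = [(name, score) for name, score, test in cues
             if test(clause1, clause2)]
    if not fired:
        return None  # assumption: no supporting cue, no hypothesis
    return {
        "relation": relation,
        "cues": [name for name, _ in fired],
        "score": sum(score for _, score in fired),
    }


# Toy criterion and cue, invented for illustration.
criteria = [lambda c1, c2: c1["position"] < c2["position"]]
cues = [("H_and", 5, lambda c1, c2: c2.get("conjunction") == "and")]

a = {"position": 1}
b = {"position": 2, "conjunction": "and"}
print(hypothesize("SEQUENCE", criteria, cues, a, b))
print(hypothesize("SEQUENCE", criteria, cues, b, a))  # None
```

Because the weak cue is never evaluated when the criteria fail, it cannot by itself generate spurious hypotheses.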
6.6 Dependence on a set of relations
As noted in section 3.6, the thirteen relations employed in this study are
a relatively uncontroversial subset of the rhetorical relations that have been
proposed in the literature on discourse relations. They are also relations that can
be identified in text with a high degree of reliability. Should future research
motivate a different set of relations, an approach similar to the one described
here could no doubt be used to identify them. For example, syntactic analyses
could be examined to identify cues that correlate with those relations. Cues
would correlate with the relations to varying degrees, so heuristic scoring
(section 6.4) would still be useful. Finally, the algorithm for constructing trees
on the basis of a set of hypothesized relations (chapter 7) is not dependent on
specific rhetorical relations, and so would not require any modification.
6.7 Cues to the relations
In this section I present the cues used in RASTA to identify rhetorical
relations in Encarta. RASTA examines all pairs of clauses from the total set of
RST terminal nodes. For each pair of clauses, RASTA tests the conditions in two
orders, i.e. for two clauses a and b, RASTA tests the cues with clause a as the
first clause (labeled Clause1 below) and clause b as the second clause (labeled
Clause2 below) and then with clause b as the first clause and clause a as the
second clause.
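Testing every pair of clauses in both orders corresponds to enumerating ordered pairs, which can be sketched with the Python standard library. The function name `ordered_pairs` is hypothetical.

```python
from itertools import permutations


def ordered_pairs(terminal_nodes):
    """Every (Clause1, Clause2) assignment over distinct terminal nodes."""
    return list(permutations(terminal_nodes, 2))


print(ordered_pairs(["a", "b", "c"]))
# [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]
```

For n terminal nodes this yields n(n-1) ordered pairs, each of which is then tested against the criteria and cues for each relation.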
For the sake of brevity, in discussing some relations below I make
reference to the “Subordinate Clause Condition”. The Subordinate Clause
Condition is satisfied if the conditions given in Figure 20 are met.
1. Clause1 is a main clause.
2. If Clause2 is a subordinate clause then it must be subordinate to
Clause1.
Figure 20 The Subordinate Clause Condition
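The Subordinate Clause Condition can be expressed as a short predicate. The `Clause` record below is a hypothetical simplification in which each clause records only the identifier of the clause to which it is subordinate, if any.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    ident: int
    matrix: Optional[int] = None  # ident of the matrix clause; None if main


def subordinate_clause_condition(clause1: Clause, clause2: Clause) -> bool:
    # Condition 1: Clause1 is a main clause.
    if clause1.matrix is not None:
        return False
    # Condition 2: if Clause2 is subordinate, it is subordinate to Clause1.
    return clause2.matrix is None or clause2.matrix == clause1.ident


main = Clause(1)
sub = Clause(2, matrix=1)
print(subordinate_clause_condition(main, sub))  # True
print(subordinate_clause_condition(sub, main))  # False
```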
6.7.1 ASYMMETRICCONTRAST
The ASYMMETRICCONTRAST relation involves a contrast between two
constituents that are not of equivalent rhetorical status in the text. (Compare
this to the CONTRAST relation discussed in section 6.7.6, which consists of two
nuclei with equivalent rhetorical status in the text.) The following extended
excerpt from Encarta illustrates the different rhetorical statuses of the two
constituents of an ASYMMETRICCONTRAST relation. The Aardwolf article
discusses aardwolves at length, and in the final sentence (indicated in bold
type) contrasts their forefeet with those of hyenas. Mention of the anatomy of
hyenas in this context is however rhetorically subordinate to the main goal of
the passage, namely describing aardwolves.
“The aardwolf is classified as Proteles cristatus. It is usually
placed in the hyena family, Hyaenidae. Some experts, however,
place the aardwolf in a separate family, Protelidae, because of
certain anatomical differences between the aardwolf and the
hyena. For example, the aardwolf has five toes on its forefeet,
whereas the hyena has four.” (Aardwolf)
All instances of the ASYMMETRICCONTRAST relation observed in
Encarta involve the conjunction whereas. However, it is not the case that the
presence of the conjunction whereas in Encarta always correlates with this
relation, as the following excerpt illustrates:
“Only twisting is required to process filament fiber into yarn,
but staple fibers must be carded to combine the fibers into a
continuous ropelike form, combed to straighten the long fibers,
and drawn out into continuous strands, which are then twisted to
the desired degree. In general, the amount of twist given the
yarns determines various characteristics. Light twisting yields
soft-surfaced fabrics, whereas hard-twisted yarns produce
hard-surfaced fabrics, which provide resistance to
abrasion…” (Textiles)
Clearly, in this example, textiles, the topic under discussion, are not
being contrasted with something else. Rather, two different methods for
producing textiles are being contrasted. The final sentence of this excerpt could
be considered an ELABORATION of the sentence In general, the amount of twist
given the yarns determines various characteristics.
All instances of the ASYMMETRICCONTRAST relation observed in
Encarta to date share two characteristics. First, the subject of the nucleus of the
relation refers to the local discourse topic, i.e. the topic of the sub-section of the
Encarta article in which the nucleus occurs. Of course, an identification of a
local discourse topic presupposes an existing sophisticated discourse analysis.
One simple technique that has proven extremely effective in Encarta is to insist
that the head of the subject of the nucleus have the same base form as the head
of the title noun phrase of the section within which the nucleus occurs. In the
extended excerpt above, for example, the subject of the nucleus, aardwolf, has
the same base form as the title of the article within which the excerpt occurs.
Second, the satellite contains the conjunction whereas. Although the
conjunction whereas is present in all observed instances of the
ASYMMETRICCONTRAST relation, I do not include it in the necessary criteria.
Since it is likely that other cues will be discovered, as with the other relations
discussed below, I am reluctant to include the identification of any one
conjunction in the necessary criteria of a relation. Figure 21 gives the necessary
criteria for the ASYMMETRICCONTRAST relation.
1. Clause1 is syntactically subordinate to Clause2.
2. The head of the subject of Clause2 has the same base form as the
head of the title of the section within which Clause2 occurs.
Figure 21 Necessary criteria for the ASYMMETRICCONTRAST relation
If the necessary criteria given in Figure 21 are satisfied, RASTA tests cue
H20, given in Figure 22.
Cue                                                       Heuristic score   Cue name9
Clause1 contains the subordinating conjunction whereas.   30                H20
Figure 22 Cue to the ASYMMETRICCONTRAST relation
9 Note that the names of the cues are arbitrary. By convention, the names start with
the letter H (for heuristic) and are followed by a number and optionally a letter. The attentive
reader might notice that some possible names (e.g. H19) do not occur in this dissertation. This
represents historical accident (an old cue with that label has been deemed unnecessary in the
system) rather than oversight.
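The necessary criteria of Figure 21 and cue H20 of Figure 22 can be combined in a brief sketch. The dictionary representation of clauses and the function `asymmetric_contrast_score` are hypothetical; base forms are assumed to have been supplied by morphological analysis, and the base form textile for the title of the Textiles article is an assumption.

```python
def asymmetric_contrast_score(clause1, clause2, title_head):
    """Return the heuristic score for ASYMMETRICCONTRAST, or None."""
    # Criterion 1: Clause1 is syntactically subordinate to Clause2.
    if clause1.get("subordinate_to") != clause2["id"]:
        return None
    # Criterion 2: the head of the subject of Clause2 matches the head of
    # the title of the section in which Clause2 occurs (base forms).
    if clause2["subject_head"] != title_head:
        return None
    # Cue H20: Clause1 contains the subordinating conjunction whereas.
    return 30 if "whereas" in clause1["conjunctions"] else None


# Aardwolf excerpt: "the aardwolf has five toes ..., whereas the hyena has four."
nucleus = {"id": 2, "subject_head": "aardwolf", "conjunctions": []}
satellite = {"id": 3, "subordinate_to": 2, "subject_head": "hyena",
             "conjunctions": ["whereas"]}
print(asymmetric_contrast_score(satellite, nucleus, "aardwolf"))  # 30

# Textiles excerpt: the subject head does not match the title.
light = {"id": 2, "subject_head": "twisting", "conjunctions": []}
hard = {"id": 3, "subordinate_to": 2, "subject_head": "yarn",
        "conjunctions": ["whereas"]}
print(asymmetric_contrast_score(hard, light, "textile"))  # None
```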
The satellite of an ASYMMETRICCONTRAST relation may follow the
nucleus, as illustrated in Figure 23, or it may precede the nucleus, as illustrated
in Figure 24 and Figure 25.
1. Some experts, however, place the aardwolf in a separate
family, Protelidae, because of certain anatomical differences
between the aardwolf and the hyena.
2. For example, the aardwolf has five toes on its forefeet,
3. whereas the hyena has four.
Figure 23 Aardwolf
1. [W]hereas Fénelon supported quietism,
2. Bossuet considered it heresy.
Figure 24 Bossuet, Jacques Bénigne
1. Whereas pure neon gives a red light,
2. Argon tubes require a lower volt.
Figure 25 Argon
Clauses introduced by the conjunction whereas are invariably parsed as
syntactically subordinate by MEG. What then is to be done concerning the
Textiles excerpt above? The preferred analysis for the relevant section of this
excerpt is given in Figure 26.
1. In general, the amount of twist given the yarns determines
various characteristics.
2. Light twisting yields soft-surfaced fabrics,
3. whereas hard-twisted yarns produce hard-surfaced fabrics…
Figure 26 Textiles
Is a symmetric relation to be proposed between one clause and another clause,
where the latter is syntactically subordinate to the former, in violation of the
general principle outlined in section 6.2 that syntactically subordinate clauses
are always to be treated as rhetorically dependent on their matrix clauses? Since
the MEG system operates in a serial fashion, with morphology preceding
syntactic analysis and so on, it is occasionally necessary for a subsequent level
of processing that has access to additional resources to modify an earlier
analysis. As noted in chapter 5, for example, the first stage of syntactic analysis
defaults to a right-branching structure for prepositional phrase attachment, but
subsequent processing can revise the analysis based on reasoning with
MINDNET. Similarly, information available from discourse processing, namely
that the head of the subject is not the same as the head of the title of the section
in which the clause occurs, could be used to modify the syntactic analysis,
treating cases like the Textiles excerpt illustrated in Figure 26 as syntactically
coordinate constructions, analogous to the analysis of two clauses coordinated
by a CONTRAST conjunction like but.
6.7.2 CAUSE
RASTA distinguishes two related asymmetric relations: CAUSE and
RESULT. CAUSE relations are those in which a cause is expressed in the satellite,
and the result in the nucleus (see for example the definition of VOLITIONAL
CAUSE in section 3.4), whereas RESULT relations are those in which the result is
expressed in the satellite, and the cause in the nucleus. As noted above (section
3.6), I have collapsed the two relations VOLITIONAL CAUSE and NON-
VOLITIONAL CAUSE defined by Mann and Thompson (1988) into a single
relation CAUSE, and the two relations VOLITIONAL RESULT and NON-
VOLITIONAL RESULT defined by Mann and Thompson (1988) into a single
relation RESULT (section 6.7.12).
RASTA uses different cues for the CAUSE relation depending on whether
the Subordinate Clause Condition is satisfied. These cues are presented
separately below.
Criteria for the CAUSE relation when the Subordinate Clause
Condition is satisfied
The Subordinate Clause Condition defines the necessary criteria that
must be satisfied before the cues to the CAUSE relation given in Figure 27 are
tested. As noted in section 6.7.1, even if only a single cue has been identified to
date, I am reluctant to include it as one of the necessary criteria. It is likely that
additional research will uncover other cues to the CAUSE relation that ought to
be applied when the Subordinate Clause Condition is satisfied.
Cue                                                   Heuristic score   Cue name
Clause2 is dominated by or contains a cue phrase      25                H17
compatible with the CAUSE relation (because
due_to_the_fact_that since…)
Figure 27 Cues to the CAUSE relation
The satellite of a CAUSE relation may precede or follow the nucleus.
Figure 28 illustrates a simple case. A CAUSE relation is hypothesized between
clauses 1 and 2 on the basis of the cue phrase due to the fact that, identified by
cue H17.
1. [D]ue to the fact that inflows from the Amu Darya have also
drastically diminished in recent decades,
2. the volume of the Aral Sea dropped by about 76 percent
between 1960 and 1995.
Figure 28 Syrdarya
The definition of cue H17 says that “Clause2 is dominated by or
contains a cue phrase compatible with the CAUSE relation.” This circumlocution is
motivated by constraints on the well-formedness of RST trees. Figure 29
illustrates a case where two clauses are dominated by the conjunction because:
clause 2, …many women are aware, and clause 3, and concerned that…, are in
a JOINT relation. The women’s concerns are given as the cause of the increasing
popularity of natural childbirth. As described in section 7.5.1, RASTA will only
posit a CAUSE relation between the JOINT node and clause 1 if it can find
evidence of a CAUSE relation between Clause 1 and each of the co-nuclei of the
JOINT node. Thus RASTA must examine the syntactic and logical form analyses
produced by MEG to determine that the conjunction because has scope over
both clause 2 and clause 3. Clause 2 is syntactically dependent on clause 1; the
dominating conjunction because indicates the appropriate rhetorical label for
the relationship. Clause 3 is also syntactically dependent on clause 1, and is
dominated by the same conjunction, because.
1. Natural (unmedicated) childbirth, however, is becoming more
popular,
2. in part because many women are aware
3. and concerned that the anesthesia and medication given to
them is rapidly transported across the placenta to the unborn
baby.
Figure 29 Pregnancy and childbirth
Necessary criteria for the CAUSE relation when the Subordinate
Clause Condition is not satisfied
Figure 30 gives the necessary criteria for the CAUSE relation when the
Subordinate Clause Condition is not satisfied.
1. Clause1 precedes Clause2.
2. Clause1 is not syntactically subordinate to Clause2.
3. Clause2 is not syntactically subordinate to Clause1.
4. Either the syntactic subject or Dsub of Clause2 is a
demonstrative pronoun or is modified by a demonstrative; or
the Dsub of Clause1 and the Dsub of Clause2 are distinct
constituents (i.e. neither one is gapped) and the same lexical
item occurs as the head of the Dsub of Clause1 and the head
of the Dsub of Clause2 and that lexical item is not a pronoun;
or Clause1 and Clause2 are coordinated by a semi-colon.
Figure 30 Necessary criteria for the CAUSE relation when the Subordinate
Clause Condition is not satisfied
Criterion 1 is not to be interpreted as specifying that the relative
ordering of the nucleus and satellite is important. Rather, since RASTA
examines all possible pairs of terminal nodes (chapter 7), it is simpler to
formulate the conditions for the CAUSE relation to apply when Clause1 is the
nucleus and Clause2 is the satellite. For example, in the absence of this criterion
cue H29a in Figure 31 would have to say “Clause2 is passive and has the lexical
item cause as its head or Clause1 is passive and has the lexical item cause as its
head”. Clearly, this would be a rather verbose formulation. In cases where the
relative order of constituents really is important, this fact is stipulated (for
example, cue H22, a cue to the RESULT relation given in section 6.7.12 or
criterion 1 for the SEQUENCE relation, given in section 6.7.13). In all other
cases, the relative order of the nucleus and the satellite is assumed to be
unimportant.
Criteria 2 and 3 are merely intended to check that the clauses being
examined are not in a syntactic dependency relationship.
Criterion 4 is somewhat more complex. Any one of the disjuncts of this
criterion must be true before RASTA will consider additional cues to the CAUSE
relation when the Subordinate Clause Condition is not satisfied. The first part of
criterion 4, “Either the syntactic subject or Dsub of Clause2 is a demonstrative
pronoun or is modified by a demonstrative” is intended merely to identify the
strong correlation between deixis and rhetorical structure (section 6.3). As
noted in section 6.5, the distinction between necessary criteria and cues,
although it correlates well with linguistic judgments, is primarily motivated by
considerations of computational efficiency. For the CAUSE relation, the
correlation between deixis and rhetorical structure is so strong as to warrant
inclusion as one part of the necessary criteria for the identification of the
relation.
The second part of criterion 4, “or the Dsub of Clause1 and the Dsub of
Clause2 are distinct constituents (i.e. neither one is gapped) and the same lexical
item occurs as the head of the Dsub of Clause1 and the head of the Dsub of
Clause2 and that lexical item is not a pronoun”, is intended to identify patterns
of referential continuity that correlate highly with asymmetric relations. When
the Dsub nodes of two clauses in the logical form (the abstraction of the
subject of an active sentence or the agent of a passive construction, section 5.4)
contain the same pronoun, it is likely that there is referential continuity. RASTA
could verify this by examining the evidence of the anaphora resolution
component of MEG that resolves anaphoric references for pronouns (section
5.4). However, this proves unnecessary. An examination of Encarta texts
suggests that the pattern in which the two Dsubs contain the same pronoun is
negatively correlated with asymmetric relations.
The final part of criterion 4, “or Clause1 and Clause2 are coordinated by
a semi-colon” identifies the final possible necessary criterion for the CAUSE
relation. Of course, other relations are compatible with clauses coordinated by a
semi-colon (for example, the LIST relation). However, at least one of the three
possibilities given in criterion 4 must be satisfied before RASTA should consider
a CAUSE relation.
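The four necessary criteria of Figure 30 can be sketched as a single predicate. The clause dictionaries, the field names, the small `PRONOUNS` set and the function `cause_criteria_met` are all hypothetical; clause identifiers are assumed to follow the order of the clauses in the text.

```python
PRONOUNS = {"he", "she", "it", "they", "this", "that", "these", "those"}  # illustrative


def cause_criteria_met(clause1, clause2):
    """Necessary criteria of Figure 30, over simplified clause records."""
    # Criterion 1: Clause1 precedes Clause2 (ids assumed to follow text order).
    if not clause1["id"] < clause2["id"]:
        return False
    # Criteria 2 and 3: neither clause is subordinate to the other.
    if clause1.get("subordinate_to") == clause2["id"]:
        return False
    if clause2.get("subordinate_to") == clause1["id"]:
        return False
    # Criterion 4: at least one of three alternatives must hold.
    demonstrative_subject = clause2.get("subject_demonstrative", False)
    head1, head2 = clause1.get("dsub_head"), clause2.get("dsub_head")
    same_lexical_dsub = (
        head1 is not None
        and head1 == head2
        and head1 not in PRONOUNS
        and not clause1.get("dsub_gapped", False)
        and not clause2.get("dsub_gapped", False)
    )
    semicolon = clause2.get("semicolon_coordinated_with") == clause1["id"]
    return demonstrative_subject or same_lexical_dsub or semicolon


first = {"id": 1, "dsub_head": "isolation"}
second = {"id": 2, "dsub_head": "isolation", "subject_demonstrative": True}
print(cause_criteria_met(first, second))  # True
print(cause_criteria_met(second, first))  # False
```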
Cues for the CAUSE relation when the Subordinate Clause
Condition is not satisfied
If the necessary criteria for the CAUSE relation are satisfied, RASTA tests
the two cues given in Figure 31.
Cue                                                   Heuristic score   Cue name
Clause2 is passive and has the lexical item cause     10                H29a
as its head.
The head of Clause2 contains the phrase result        10                H29b
from, with the verb possibly being inflected.
Figure 31 Cues to the CAUSE relation
Both cue H29a and cue H29b make reference to specific lexical items,
lexical items whose inherent semantics pertain to causality. In Figure 32, the
necessary criteria for the CAUSE relation are satisfied: neither clause is
syntactically dependent on the other (criteria 2 and 3) and the Dsub of both
clauses, isolation, is the same lexical item and is not a pronoun (criterion 4).
Cue H29b applies to identify the phrase “result from”.
1. In the third step, intrinsic isolation, some form of isolation
evolves among the populations.
2. Such isolation may result from preferences during courtship
or from genetic incompatibility.
Figure 32 Species and speciation
As the definition of cue H29b notes, the verb result may be inflected.
This is illustrated in Figure 33. Although it would not be too onerous to
enumerate the possible variants of the verb (result/s/ed/ing), the morphological
analysis performed by MEG during the course of parsing the input text makes it
unnecessary to enumerate all possibilities. MEG correctly identifies the base
form of resulted as result.
1. At the end of the 20th century, de facto segregation remained
a problem in many places in the United States.
2. De facto segregation has resulted from residential housing
patterns.
Figure 33 Segregation in the United States
Figure 34 illustrates another instance of the CAUSE relation. The
necessary criteria are satisfied (neither clause is syntactically dependent on the
other and the syntactic subject of clause 2 is modified by the demonstrative
such), and cue H29a correctly identifies clause 2 as a passive clause whose head
is the verb cause.
1. The mechanical loss of fertile topsoil is one of the gravest
problems of agriculture.
2. Such loss is almost always caused by erosion resulting from
the action of water or wind.
Figure 34 Soil management
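The way cues H29a and H29b rely on base forms rather than on enumerated surface variants can be sketched in a few lines of Python. This is a minimal illustration only, not RASTA's implementation: the Clause record and its fields (head_lemma, is_passive, head_preposition) are hypothetical stand-ins for the information that MEG's analysis is described as supplying.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    # Hypothetical record; field names are assumptions, not MEG's data model.
    head_lemma: str             # base form of the head verb
    is_passive: bool = False
    head_preposition: str = ""  # preposition selected by the head verb, if any


def cue_h29a(clause2: Clause) -> bool:
    """Clause2 is passive and has the lexical item 'cause' as its head."""
    return clause2.is_passive and clause2.head_lemma == "cause"


def cue_h29b(clause2: Clause) -> bool:
    """The head of Clause2 contains 'result from'. Inflection does not
    matter because the comparison is made on the base form."""
    return clause2.head_lemma == "result" and clause2.head_preposition == "from"


# "De facto segregation has resulted from ..." (Figure 33)
figure33 = Clause(head_lemma="result", head_preposition="from")
# "Such loss is almost always caused by erosion ..." (Figure 34)
figure34 = Clause(head_lemma="cause", is_passive=True, head_preposition="by")
```

Because the test is made on the base form, resulted, results and resulting all satisfy H29b without any list of variants being maintained.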
6.7.3 CIRCUMSTANCE
The CIRCUMSTANCE relation is an asymmetric relation. Mann and
Thompson define a CIRCUMSTANCE relation as one in which the satellite “sets a
framework in the subject matter within which [the reader] is intended to
interpret the situation presented in [the nucleus]” (Mann and Thompson
1988:272). To date, the CIRCUMSTANCE relation has only been encountered in
Encarta within a single sentence. If the Subordinate Clause Condition is
satisfied, then the cues given in Figure 35 are tested.
Cue | Heuristic score | Cue name
Clause2 is dominated by or contains a circumstance conjunction (after before while…). | 20 | H12
Clause2 is a detached –ing participial clause and the head of Clause2 precedes the head of Clause1. | 5 | H13
Figure 35 Cues to the CIRCUMSTANCE relation
Figure 36 and Figure 37 illustrate the application of cue H12. In Figure
36, clause 1 is in a CIRCUMSTANCE relation to both clause 2 and clause 3.
Clause 1 can therefore be said to be in a CIRCUMSTANCE relation to the text
span covering clause 2 and clause 3.
1. After the revolt was crushed,
2. Solomon stripped Abiathar of priestly office
3. and banished him from Jerusalem.
Figure 36 Abiathar
In Figure 37, clause 2 follows the main clause. The Subordinate Clause
Condition is still satisfied, however, and cue H12 still detects the conjunction
after and identifies the CIRCUMSTANCE relation.
1. In April 1994 fighting erupted between Rwanda's two main
ethnic groups, the Hutu and Tutsi,
2. after the presidents of both Rwanda and Burundi were killed
in a suspicious plane crash.
Figure 37 Africa
Figure 38 illustrates the application of heuristic H13: clause 1 is a
detached preposed –ing participial clause.
1. Leaving port on October 19 and 20,
2. Villeneuve’s fleet was intercepted by Nelson’s fleet on the
morning of October 21.
Figure 38 Trafalgar, Battle of
When heuristic H13 applies in Encarta, it is almost always the case that
heuristic H12 also applies. Figure 39 illustrates a case in which cues H12 and
H13 both apply: clause 1 is both a preposed detached –ing participial clause
and contains the conjunction while.
1. While recuperating from sunstroke, which put an end to a
potential baseball career with the New York Yankees,
2. Acuff developed a serious interest in music.
Figure 39 Acuff, Roy
Cue H13 specifies that “Clause2 is a detached –ing participial clause and
the head of Clause2 precedes the head of Clause1.” Detached –ing participial
clauses in which the head follows the head of the matrix clause tend to be in a
RESULT relation in Encarta, rather than a CIRCUMSTANCE relation. Cue H22 in
section 6.7.12 identifies a RESULT relation in such cases.
6.7.4 CONCESSION
The CONCESSION relation is an asymmetric relation. The writer
acknowledges an apparent incompatibility between the situations presented in
the nucleus and the satellite, but regards the situations as compatible. Mann and
Thompson (1988:254-5) note that “recognizing the compatibility between the
situations presented in [the nucleus] and [the satellite] increases [the reader’s]
positive regard for the situation presented in [the nucleus].”
To date, the CONCESSION relation has only been identified in Encarta
within single sentences. If the Subordinate Clause Condition is satisfied, then
the cue given in Figure 40 is tested.
Cue | Heuristic score | Cue name
Clause2 contains a CONCESSION conjunction (although even_though without…) | 10 | H11
Figure 40 Cue to the CONCESSION relation
Figure 41 illustrates the application of cue H11 to identify the
subordinate clause without ever having been exposed to the Italian High
Renaissance as the satellite in a CONCESSION relation.
1. Grünewald seems to have achieved a form of Mannerism
2. without ever having been exposed to the Italian High
Renaissance.
Figure 41 Renaissance Art and Literature
In Encarta, the satellite in a CONCESSION relation is frequently an
elliptical clause, as illustrated in Figure 42 and Figure 43. The meaning of the
CONCESSION relation is clearly illustrated in Figure 42. Ordinarily, being timid
(clause 1) and being prepared to fight (clause 2) might be considered by a
reader to be incompatible situations. The writer, aware perhaps of this apparent
incompatibility, nonetheless wishes to present as fact the aardvark’s
preparedness to fight (clause 2) in some circumstances, namely when it cannot
flee (clause 3).
1. Although timid,
2. the aardvark will fight
3. when it cannot flee.
Figure 42 Aardvark
1. Although organized into regional and central groups
2. each church governs itself independently.
Figure 43 Adventists
The satellite of a CONCESSION relation is frequently an elliptical clause,
but can also be a full clause, as illustrated in Figure 44.
1. Although the Quakers had long opposed slavery,
2. abolitionism as an organized force began in England in the
1780s…
Figure 44 Abolitionists
Encarta also contains a great many prepositional phrases introduced by
the preposition despite. These phrases are not clauses with “independent
functional integrity” (Mann and Thompson 1988:248), the essential criterion
for identifying terminal nodes for an RST tree (section 3.2). In contrast to an
approach based on superficial pattern-matching (section 4.2.6), RASTA is able
to examine a complex syntactic analysis in order to correctly identify these
prepositional phrases and so does not treat them as terminal nodes in an RST
tree. The following two excerpts illustrate the use of despite to introduce
prepositional phrases.
Mobile for a while became an important shipbuilding city
despite the shallowness of Mobile Bay. (Alabama)
About 75 percent of eligible voters participated, despite
threats from the outlawed Islamic Salvation Front to
kill anyone who voted. (Algeria)
The bold sections of these two examples might be considered to entail
propositions of a kind that ought to be modeled in RST. The phrase despite the
shallowness of Mobile Bay entails the proposition “Mobile Bay is shallow” and
the phrase despite threats from the outlawed Islamic Salvation Front to kill
anyone who voted entails the proposition “the outlawed Islamic Salvation Front
threatened to kill anyone who voted.” Is it appropriate to insist that a terminal
node in an RST analysis ought to correspond to a clause with “independent
functional integrity” (Mann and Thompson 1988:248; see section 3.2)? As
noted in chapter 1, the development of RASTA has been guided by a
functionalist perspective on language: writers manipulate linguistic form to
achieve their communicative objectives. We ought therefore to attach
significance to the fact that the writers of these two excerpts chose to use a
phrasal formulation rather than to express the same propositional content by
means of clauses with independent functional integrity. Alternatively, we could
regard the fact that RASTA does not model such phrases as nothing more than a
restriction on the granularity of the analysis performed.
6.7.5 CONDITION
The CONDITION relation is an asymmetric relation. Mann and
Thompson (1988:276) note that in a CONDITION relation “realization of the
situation presented in [the nucleus] depends on realization of that presented in
[the satellite].” To date, the CONDITION relation has only been identified in
Encarta within a sentence. If the Subordinate Clause Condition is satisfied,
then the cue given in Figure 45 is tested.
Cue | Heuristic score | Cue name
Clause2 contains a condition conjunction (as_long_as if unless…) | 10 | H21
Figure 45 Cue to the CONDITION relation
Figure 46 illustrates the CONDITION relation identified by the presence
of the subordinating conjunctive phrase as long as.
1. The premier and cabinet remain in power
2. as long as they have the support of a majority in the provincial
legislature.
Figure 46 Prince Edward Island
As noted earlier (section 4.2.6), the expression as long as should
sometimes be treated as a phrase with internal syntactic structure and at other
times as a single unit, a phrase that functions as a subordinating conjunction. As
Figure 47 shows, the syntactic analysis performed by MEG correctly identifies
the expression as long as as a subordinating conjunction in this example.
Figure 47 Prince Edward Island: Syntactic analysis10
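The practical consequence of letting the parser decide whether as long as is a single unit can be sketched as a simple lookup. The sketch below is illustrative only and rests on an assumed representation: a multi-word subordinating conjunction is taken to arrive from the syntactic analysis as one token (as_long_as), whereas the same words analyzed as an ordinary phrase arrive as separate tokens. The inventories are copied from the cue tables in this section; since those lists end in an ellipsis, the sets here are deliberately incomplete.

```python
# Relation -> (conjunction inventory, cue name, heuristic score).
# Inventories are partial: the cue tables elide further members with "…".
SUBORDINATE_CLAUSE_CUES = {
    "CIRCUMSTANCE": ({"after", "before", "while"}, "H12", 20),
    "CONCESSION": ({"although", "even_though", "without"}, "H11", 10),
    "CONDITION": ({"as_long_as", "if", "unless"}, "H21", 10),
}


def conjunction_cues(conjunction_tokens):
    """Return (relation, cue, score) for every inventory that shares a
    token with the conjunctions attached to Clause2 (hypothetical input)."""
    tokens = set(conjunction_tokens)
    return [
        (relation, cue, score)
        for relation, (inventory, cue, score) in SUBORDINATE_CLAUSE_CUES.items()
        if tokens & inventory
    ]
```

With this representation, conjunction_cues(["as_long_as"]) fires cue H21, while the token sequence ["as", "long", "as"] fires nothing, so the discourse component never has to re-examine the internal structure of the expression.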
Figure 48 illustrates the CONDITION relation identified by the presence
of the subordinating conjunction if, by far the most common manner in which
the CONDITION relation is signaled in Encarta.
10 The attachment of PP5, in the provincial legislature, is ambiguous, but is not
relevant to the determination of the discourse relation between the two clauses.
1. If not promptly treated by surgical means,
2. ectopic pregnancy can result in massive internal bleeding and
possibly death.
Figure 48 Pregnancy and Childbirth
6.7.6 CONTRAST
The CONTRAST relation is a symmetric relation. Mann and Thompson
(1988:278) note that the situations presented in the nuclei are “(a)
comprehended as the same in many respects (b) comprehended as differing in a
few respects and (c) compared with respect to one or more of these
differences.” RASTA distinguishes two different relations that satisfy these
criteria. In the ASYMMETRICCONTRAST relation (section 6.7.1), two
propositions are contrasted, but the propositions are not of equal rhetorical
status in the text. Rather, the proposition contained in the nucleus is more
central in realizing the writer’s goals. In the CONTRAST relation, the
propositions that are related are of equal rhetorical status.
RASTA employs two distinct sets of criteria in identifying the CONTRAST
relation. The first set involves a work-around (mentioned in section 6.7.1) to
compensate for the fact that during a sentence-by-sentence syntactic analysis of
a text, MEG is not able to decide whether a clause introduced by the
conjunction whereas ought to be analyzed as syntactically subordinate or
coordinate. Figure 49 gives the necessary criteria for this work-around.
1. Clause1 is syntactically subordinate to Clause2.
2. The head of the syntactic subject of Clause2 does not have the same
base form as the head of the title of the section within which Clause2
occurs.
Figure 49 Necessary criteria for the CONTRAST relation work-around
If the necessary criteria given in Figure 49 are satisfied, RASTA tests cue
H20, given in Figure 50. (As noted in 6.7.1, it is not desirable to include
specific conjunctions in the set of necessary criteria for the identification of a
relation.) The necessary criteria given in Figure 49, combined with cue H20,
are intended to correctly identify a CONTRAST relation, as illustrated in Figure
51. Examples of the ASYMMETRICCONTRAST relation which these necessary
criteria together with cue H20 correctly do not select are given in section 6.7.1.
Cue | Heuristic score | Cue name
Clause1 contains the subordinating conjunction whereas. | 30 | H20
Figure 50 Cue to the CONTRAST relation work-around
1. In general, the amount of twist given the yarns determines
various characteristics.
2. Light twisting yields soft-surfaced fabrics,
3. whereas hard-twisted yarns produce hard-surfaced fabrics…
Figure 51 Textiles
Having dealt with the special case of a work-around for a misparse, let
us now consider the more general criteria and cues for the CONTRAST relation.
Figure 52 gives the necessary criteria for the CONTRAST relation.
1. Clause1 precedes Clause2.
2. Clause1 is not syntactically subordinate to Clause2.
3. Clause2 is not syntactically subordinate to Clause1.
4. The subject of Clause2 is not a demonstrative pronoun, nor is it
modified by a demonstrative.
Figure 52 Necessary criteria for the CONTRAST relation
The criteria given in Figure 52 do not hold any surprises. With the
exception of the cases dealt with by the work-around, we would not expect the
co-nuclei of the CONTRAST relation to be in a relationship involving syntactic
dependency (criteria 2 and 3), in line with the observations about the
correlation between clausal status and rhetorical status (section 6.2). Similarly,
criterion 4, “The subject of Clause2 is not a demonstrative pronoun, nor is it
modified by a demonstrative” is intended to exclude cases in which an
asymmetric relation is more plausible, given the interplay of deixis and
rhetorical structure discussed in section 6.3. (More general considerations of
anaphoric reference and referential continuity have not proven necessary to
distinguish the CONTRAST relation from other relations.) If the necessary
criteria given in Figure 52 for the CONTRAST relation are satisfied, then the
cues given in Figure 53 are tested.
Cue | Heuristic score | Cue name
Clause2 is dominated by or contains a CONTRAST conjunction (but however…). If Clause2 is in a coordinate structure, then it must be coordinated with Clause1. | 25 | H4
Cue H4 is satisfied and the head verbs of Clause1 and Clause2 have the same base form. | 10 | H39
Clause1 and Clause2 differ in polarity (i.e. one clause is positive and the other negative). | 5 | H5
The syntactic subject of Clause1 is the pronoun some or has the modifier some and the subject of Clause2 is the pronoun other or has the modifier other. | 30 | H6
Figure 53 Cues for the CONTRAST relation
Figure 54 illustrates a simple example of the CONTRAST relation. RASTA
correctly posited a CONTRAST relation on the basis of cues H4 (identifying the
conjunction but) and H5 (identifying the different polarity of the two clauses).
1. An abbess has administrative jurisdiction equivalent to that of
the abbot of a monastery
2. but does not exercise the rights and duties of the priesthood.
Figure 54 Abbess
The circumlocution “is dominated by or contains a CONTRAST
conjunction” in the definition of cue H4 is motivated by constraints on the
well-formedness of RST trees. Figure 55 illustrates a case where two clauses are
dominated by the conjunction but: clause 2, she became involved with a dance
group, and clause 3, and, after rapid progress, won a scholarship with the New
Dance Group, are in a SEQUENCE relation. This turn of events is contrasted
with Pearl Primus’s intention in clause 1 to become a doctor. As described in
section 7.5.1, RASTA will only posit a CONTRAST relation between the
SEQUENCE node and clause 1 if it can find evidence of a CONTRAST relation
between Clause 1 and each of the co-nuclei of the SEQUENCE node.
1. Primus planned to become a doctor,
2. but she became involved with a dance group
3. and, after rapid progress, won a scholarship with the New
Dance Group.
Figure 55 Primus, Pearl
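The requirement that evidence be found for each co-nucleus explains why cue H4 is worded as “is dominated by or contains”. A minimal sketch follows; the Clause record and its two conjunction sets are hypothetical, introduced only to show the shape of the check, and the conjunction inventory is the partial one given in Figure 53.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    # Hypothetical record: conjunctions inside the clause, and conjunctions
    # on coordinate structures that dominate it.
    text: str
    contains: frozenset = frozenset()
    dominated_by: frozenset = frozenset()


CONTRAST_CONJUNCTIONS = frozenset({"but", "however"})  # partial inventory


def has_contrast_conjunction(clause: Clause) -> bool:
    """Cue H4's test: the clause contains, or is dominated by, a CONTRAST conjunction."""
    return bool(CONTRAST_CONJUNCTIONS & (clause.contains | clause.dominated_by))


def contrast_with_span(co_nuclei) -> bool:
    """A CONTRAST relation to a multinuclear span is posited only if the
    evidence holds for every co-nucleus of that span."""
    return bool(co_nuclei) and all(has_contrast_conjunction(c) for c in co_nuclei)


clause2 = Clause("but she became involved with a dance group",
                 contains=frozenset({"but"}))
clause3 = Clause("and, after rapid progress, won a scholarship",
                 contains=frozenset({"and"}), dominated_by=frozenset({"but"}))
```

If the test were restricted to conjunctions that a clause contains, clause 3 would fail it and no relation could be posited between clause 1 and the SEQUENCE node.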
Figure 56 gives RASTA’s analysis of another text involving the
CONTRAST relation. For this text, RASTA posited a CONTRAST relation between
clauses 2 and 3 on the basis of cues H4 (the conjunction however is identified)
and H39 (the base form of the main verb in each clause is place). It is
interesting as a human analyst to note several pieces of evidence in this text
which further lend support to this analysis but which were not examined by
RASTA. For example, the adverb usually in line two primes a comparison
between a usual analysis and an unusual one. Similarly, the repetition of the
word family and the occurrence of the affix -idae in the Linnean taxonomic
labels Hyaenidae and Protelidae confirm that the CONTRAST relation signaled
by the conjunction however holds between clause 2 and clause 3, and not
between clause 1 and clause 3. Although these additional pieces of evidence are
compelling, they were not necessary for RASTA to correctly identify the relation
here. Most encouraging is the fact that we do not have to encode an
understanding of the Linnean taxonomic system of classification in order to
process such texts.
1. The aardwolf is classified as Proteles cristatus.
2. It is usually placed in the hyena family, Hyaenidae.
3. Some experts, however, place the aardwolf in a separate
family, Protelidae, because of certain anatomical differences
between the aardwolf and the hyena.
4. For example, the aardwolf has five toes on its forefeet,
5. whereas the hyena has four.
Figure 56 Aardwolf
The some…other construction identified by cue H6 is extremely
common in Encarta. Figure 57 illustrates the analysis of one example of this
construction.
1. Abrasives are usually very hard substances.
2. Some are used in the form of fine powders;
3. others break in such a way as to form sharp cutting edges
4. and are used in larger pieces.
Figure 57 Abrasives
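The cues in Figure 53 and the three examples above can be drawn together in a short sketch. It is illustrative rather than definitive: the Clause fields are hypothetical, the coordination condition on cue H4 is omitted, and the helper that adds the scores of the cues that fire reflects an assumption about how heuristic scores are combined, which this section does not itself state.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    # Hypothetical record; field names are assumptions for illustration.
    head_lemma: str                        # base form of the head verb
    is_negative: bool = False
    conjunctions: frozenset = frozenset()  # conjunctions contained in or dominating the clause
    subject_marker: str = ""               # "some", "other", or "" if neither


CONTRAST_CONJUNCTIONS = frozenset({"but", "however"})  # partial inventory


def contrast_cues(c1: Clause, c2: Clause) -> dict:
    """Return the cues from Figure 53 that fire, mapped to their scores."""
    fired = {}
    if CONTRAST_CONJUNCTIONS & c2.conjunctions:
        fired["H4"] = 25
        # H39 is tested only when H4 is satisfied.
        if c1.head_lemma == c2.head_lemma:
            fired["H39"] = 10
    if c1.is_negative != c2.is_negative:
        fired["H5"] = 5
    if c1.subject_marker == "some" and c2.subject_marker == "other":
        fired["H6"] = 30
    return fired


def total_score(fired: dict) -> int:
    # Assumption: scores of the cues that fire are summed.
    return sum(fired.values())


abbess = contrast_cues(
    Clause("have"),
    Clause("exercise", is_negative=True, conjunctions=frozenset({"but"})))
aardwolf = contrast_cues(
    Clause("place"),
    Clause("place", conjunctions=frozenset({"however"})))
abrasives = contrast_cues(
    Clause("use", subject_marker="some"),
    Clause("break", subject_marker="other"))
```

Under these assumptions the Abbess example fires H4 and H5, the Aardwolf example fires H4 and H39, and the Abrasives example fires H6 alone.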
6.7.7 ELABORATION
In an ELABORATION relation, an asymmetric relation, the satellite
provides additional information for the situation presented in the nucleus
(Mann and Thompson 1988:273). The ELABORATION relation is pervasive in
Encarta. A particularly common discourse structure in Encarta occurs in the
first paragraph of an article. In this common structure, the first sentence of the
first paragraph defines the title noun phrase. The remainder of the first
paragraph consists of one or more text spans in an ELABORATION relation to the
first sentence. This pattern is illustrated in Figure 57 (section 6.7.6).
Interestingly, the ELABORATION relation has only been encountered
between main clauses. No instances of an ELABORATION relation have been
encountered between a main clause and a clause that is syntactically
dependent on that main clause. Figure 58 gives the necessary criteria for the
ELABORATION relation.
1. Clause1 precedes Clause2.
2. Clause1 is not subordinate to Clause2.
3. Clause2 is not subordinate to Clause1.
Figure 58 Necessary criteria for the ELABORATION relation
If the necessary criteria given in Figure 58 are satisfied, then the cues
given in Figure 59 are tested.
Cue | Heuristic score | Cue name
Clause1 is the main clause of a sentence (sentence_i) and Clause2 is the main clause of a sentence (sentence_j) and sentence_i immediately precedes sentence_j and (a) Clause2 contains an elaboration conjunction (also for_example) or (b) Clause2 is in a coordinate structure whose parent contains an elaboration conjunction. | 35 | H24
Cue H24 applies and Clause1 is the main clause of the first sentence in the excerpt. | 15 | H26
Clause2 contains a predicate nominal whose head is in the set {portion component member type kind example instance} or Clause2 contains a predicate whose head verb is in the set {include consist}. | 35 | H41
Clause1 and Clause2 are not coordinated and (a) Clause1 and Clause2 exhibit subject continuity or (b) Clause1 is passive and the head of the Dobj of Clause1 and the head of the Dobj of Clause2 have the same base form or (c) Clause2 contains an elaboration conjunction. | 10 | H25
Cue H25 applies and Clause2 contains a habitual adverb (sometimes usually…) | 17 | H25a
Cue H25 applies and the syntactic subject of Clause2 is the pronoun some or contains the modifier some. | 10 | H38
Figure 59 Cues to the ELABORATION relation
Figure 60 illustrates the application of several cues to the ELABORATION
relation. An ELABORATION relation is posited between Clause1 and Clause2 on
the basis of subject continuity (cue H25) and the habitual adverb usually (cue
H25a). In considering the relationship between Clause1 and Clause3, RASTA
observed that (a) Clause1 is passive and that the base form of the Dobj of
Clause1 is the same as the base form of the Dobj of Clause3 (cue H25) and (b)
the subject of Clause3 has the modifier some (cue H38)11. Finally, RASTA
identifies an ELABORATION relation between Clause3 and Clause4 because
Clause4 contains the connective for example and immediately follows Clause3
(cue H24).
11 This cue is really intended to identify instances in which the noun phrase
containing the word some denotes the same class of entities as the antecedent of that noun
phrase. The application of cue H38 in this case should therefore be considered serendipitous.
1. The aardwolf is classified as Proteles cristatus.
2. It is usually placed in the hyena family, Hyaenidae.
3. Some experts, however, place the aardwolf in a separate
family, Protelidae, because of certain anatomical differences
between the aardwolf and the hyena.
4. For example, the aardwolf has five toes on its forefeet…
Figure 60 Aardwolf
Figure 61 illustrates the application of cue H41. The ELABORATION
relation between Clause1 and Clause2 is identified by means of the head verb
include in Clause2. Similarly, the predicate nominal a portion of an
underground stem in Clause3 acts as a cue to the ELABORATION relation
between Clause2 and Clause3.
1. A stem is a portion of a plant.
2. Subterranean stems include the rhizomes of the iris and the
runners of the strawberry;
3. the potato is a portion of an underground stem.
Figure 61 Stem
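Cue H41 amounts to a membership test on two small sets of base forms, as the following sketch shows. The function signature is hypothetical; the two sets are exactly those listed in Figure 59.

```python
ELABORATION_NOMINALS = {"portion", "component", "member", "type",
                        "kind", "example", "instance"}
ELABORATION_VERBS = {"include", "consist"}


def cue_h41(head_verb_lemma, predicate_nominal_lemma=None):
    """True if Clause2 has a predicate nominal headed by one of the listed
    nouns, or a head verb in the listed set. Arguments are base forms;
    predicate_nominal_lemma is None when there is no predicate nominal."""
    return (predicate_nominal_lemma in ELABORATION_NOMINALS
            or head_verb_lemma in ELABORATION_VERBS)
```

For the clauses in Figure 61, cue_h41("include") holds for clause 2 and cue_h41("be", "portion") holds for clause 3.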
6.7.8 JOINT
The JOINT relation is a symmetric relation. It is posited when a
symmetric relation seems to hold between two clauses and no other symmetric
relations seem plausible. If all the conditions given in Figure 62 are satisfied,
then RASTA posits the JOINT relation, giving it a heuristic score of 5. Since the
JOINT relation is a default relation, there are no additional cues that are tested if
the necessary criteria apply.
1. RASTA cannot identify any other symmetric relation between
Clause1 and Clause2.
2. Clause1 precedes Clause2.
3. Clause1 is not subordinate to Clause2.
4. Clause2 is not subordinate to Clause1.
5. Clause1 and Clause2 are the same kind of constituent
(declarative, interrogative, etc).
6. The subject of Clause2 is not a demonstrative pronoun, nor is
it modified by a demonstrative.
7. If Clause1 has a pronominal subject then Clause2 must also
have a pronominal subject.
Figure 62 Necessary criteria for the JOINT relation
To satisfy criterion 1, “RASTA cannot identify any other symmetric
relation between Clause1 and Clause2”, RASTA checks for the JOINT relation only
after it has tested the other symmetric relations (CONTRAST, LIST and
SEQUENCE).
Criteria 3 and 4 use the correlations between syntactic structure and
rhetorical structure (section 6.2) to determine whether a symmetric relation is
possible. Similarly, criteria 6 and 7 use patterns of anaphora, deixis and
referential continuity (section 6.3) to further ensure that a symmetric relation is
possible.
If the two clauses being considered are not of the same clause type
(Criterion 5), then it is likely that a more contentful relation exists than the
JOINT relation. Figure 63 illustrates a case in which a JOINT relation ought
not to be hypothesized. The alternation of a declarative clause (clause 1) and an
interrogative clause (clause 2) here suggests an asymmetric relation. Clause 2
specifies one of the many important questions that arose concerning religion.
Since it specifies additional detail, clause 2 is most likely in an ELABORATION
relation to clause 1.
1. Many other important questions about the nature of religion
were addressed during this period:
2. Can religion be divided into so-called primitive and higher
types?…
Figure 63 Religion
Figure 29 in section 6.7.2, reproduced below as Figure 64, illustrates the
JOINT relation between clauses 2 and 3. No other relation seems plausible
between clauses 2 and 3, so criterion 1 for the JOINT relation is satisfied.
Neither clause is subordinate to the other (criteria 3 and 4); both clauses are of
the same type, namely declarative (criterion 5); the subject of clause 3 is not a
demonstrative, nor is it modified by a demonstrative (criterion 6); clause 2 does
not have a pronominal subject, so criterion 7 is satisfied.
1. Natural (unmedicated) childbirth, however, is becoming more
popular,
2. in part because many women are aware
3. and concerned that the anesthesia and medication given to
them is rapidly transported across the placenta to the unborn
baby.
Figure 64 Pregnancy and childbirth
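The seven necessary criteria for the JOINT relation in Figure 62 can be written down as a single conjunction of tests. The sketch below is a simplified illustration: the Clause record is hypothetical, and the outcome of testing the other symmetric relations is passed in as a flag rather than computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    # Hypothetical record; field names are assumptions for illustration.
    position: int                            # order of the clause in the text
    clause_type: str = "declarative"
    subordinate_to: Optional[int] = None     # position of the clause it depends on
    subject_is_pronoun: bool = False
    subject_has_demonstrative: bool = False  # demonstrative pronoun or demonstrative modifier


JOINT_SCORE = 5


def joint_relation(c1: Clause, c2: Clause, other_symmetric_found: bool):
    """Return the heuristic score for JOINT if all criteria hold, else None."""
    criteria = [
        not other_symmetric_found,                           # criterion 1
        c1.position < c2.position,                           # criterion 2
        c1.subordinate_to != c2.position,                    # criterion 3
        c2.subordinate_to != c1.position,                    # criterion 4
        c1.clause_type == c2.clause_type,                    # criterion 5
        not c2.subject_has_demonstrative,                    # criterion 6
        (not c1.subject_is_pronoun) or c2.subject_is_pronoun,  # criterion 7
    ]
    return JOINT_SCORE if all(criteria) else None
```

Clauses 2 and 3 of Figure 64 are both subordinate to clause 1 but not to each other, so the function returns 5 for that pair; for the declarative and interrogative clauses of Figure 63 criterion 5 fails and the function returns None.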
6.7.9 LIST
The LIST relation is a symmetric relation. Often, clauses in a LIST
relation are also amenable to a SEQUENCE interpretation. In general, a
SEQUENCE relation ought to be preferred over a LIST relation if there is any
evidence that the author might have preferred a SEQUENCE interpretation. For
example, explicit indication of temporal sequencing prefers a SEQUENCE
relation. Figure 65 gives the necessary criteria for the LIST relation.
1. Clause1 precedes Clause2.
2. Clause1 is not syntactically subordinate to Clause2.
3. Clause2 is not syntactically subordinate to Clause1.
4. The subject of Clause2 is not a demonstrative pronoun, nor is
it modified by a demonstrative.
5. Clause1 and Clause2 agree in polarity.
6. There is no alternation where the syntactic subject of Clause1
is the pronoun some or has the modifier some and the subject
of Clause2 is the pronoun other or has the modifier other.
7. If the syntactic subject of Clause2 is a pronoun, then the
syntactic subject of Clause1 must be the same pronoun.
8. Clause2 is not dominated by and does not contain
conjunctions compatible with the CONTRAST, ASYMMETRICCONTRAST
or ELABORATION relations.
Figure 65 Necessary criteria for the LIST relation
Criteria 2 and 3 use the correlations between syntactic structure and
rhetorical structure (section 6.2) to determine whether a symmetric relation is
likely. Similarly, criteria 6 and 7 use patterns of anaphora, deixis and
referential continuity (section 6.3) to further ensure that a symmetric relation is
possible. Criterion 5, “Clause1 and Clause2 agree in polarity”, is intended to
distinguish the LIST relation from the CONTRAST relation (section 6.7.6).
Similarly, criterion 8, “Clause2 is not dominated by and does not contain
conjunctions compatible with the CONTRAST, ASYMMETRICCONTRAST or
ELABORATION relations” is intended to distinguish the LIST relation from other
relations.
If the necessary criteria given in Figure 65 are satisfied, the additional
cues given in Figure 66 are tested.
Cue | Heuristic score | Cue name
Clause1 and Clause2 both contain enumeration conjunctions (first second third…) | 15 | H7
Clause1 is passive or contains an attributive predicate and Clause2 is passive or contains an attributive predicate. | 10 | H8
Clause2 is in a coordinate construction and the coordinating conjunction is a LIST conjunction (also and…) | 10 | H9
Clause1 and Clause2 both contain a Dobj and the heads of those Dobjs have the same base form. | 5 | H10
Figure 66 Cues to the LIST relation
Figure 67 illustrates the application of cues H7 and H8 in the correct
identification of a LIST relationship. Cue H7 identifies the enumeration
conjunctions first and second. Cue H8 identifies the passive voice of both
clauses.
1. Psychotherapy differs in two ways from the informal help one
person gives another.
2. First, it is conducted by a psychotherapist who is specially
trained and licensed or otherwise culturally sanctioned.
3. Second, psychotherapy is guided by theories about the
sources of distress and the methods needed to alleviate it.
Figure 67 Psychotherapy
Cue H8 identifies sequences of clauses that are either passive or have
attributive predicates. Attributive predicates are identified using relatively
simple criteria: the main verb is be or the main verb is have with a direct object
that denotes an attribute, e.g. a body part. Figure 68 illustrates a sequence of
clauses typical of the description of animals in Encarta. The first three clauses
in Figure 68 contain attributive predicates: be plus a length and have plus body
parts. Clause 5 is parsed by MEG as a passive sentence, and identified by
RASTA as another nucleus in a LIST relation with clauses 1 through 3. Figure 68
thus demonstrates that RASTA is able to correctly construct discourse
representations, even given an occasional misparse by MEG.12 Were clause 5
parsed as containing a main verb be followed by a past participle, cue H8
would still succeed in identifying a LIST relation.
1. The short-nosed echidna found in Australia is about 35 to 53
cm long …,
2. and has a broad body mounted upon short, strong legs.
3. The legs have powerful claws,
4. adapting the animal for rapid digging into hard ground.
5. The back is covered with stiff spines…
Figure 68 Echidna
12 The goal of ongoing development of MEG is of course to eliminate bad parses like
these.
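The criteria for attributive predicates, and the reason cue H8 is robust to the misparse in Figure 68, can be sketched as follows. The Clause record is hypothetical, and the small attribute lexicon is invented for illustration: the text says only that the direct object of have must denote an attribute such as a body part.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical attribute lexicon (assumption): nouns denoting body parts.
ATTRIBUTE_NOUNS = {"body", "claw", "leg", "spine", "tail"}


@dataclass
class Clause:
    main_verb: str                   # base form of the main verb
    dobj_head: Optional[str] = None  # base form of the direct object head, if any
    is_passive: bool = False


def is_attributive(clause: Clause) -> bool:
    """Main verb is 'be', or main verb is 'have' with an attribute-denoting object."""
    if clause.main_verb == "be":
        return True
    return clause.main_verb == "have" and clause.dobj_head in ATTRIBUTE_NOUNS


def cue_h8(c1: Clause, c2: Clause) -> bool:
    """Both clauses are passive or contain an attributive predicate."""
    return all(c.is_passive or is_attributive(c) for c in (c1, c2))


legs = Clause("have", dobj_head="claw")              # "The legs have powerful claws"
back_as_passive = Clause("cover", is_passive=True)   # clause 5 parsed as a passive
back_as_copula = Clause("be")                        # clause 5 parsed as be + past participle
```

Either analysis of clause 5 satisfies the cue: cue_h8(legs, back_as_passive) and cue_h8(legs, back_as_copula) are both true.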
6.7.10 MEANS
The MEANS relation is an asymmetric relation, in which the satellite
presents the means by which the situation in the nucleus has come about. To
date, the MEANS relation has only been identified in Encarta within single
sentences. If the Subordinate Clause Condition is satisfied, then the cue given
in Figure 69 is tested.
Cue | Heuristic score | Cue name
Clause2 contains a MEANS conjunction (by…) | 20 | H44
Figure 69 Cue to the MEANS relation
Figure 70 illustrates the application of cue H44 to identify the MEANS
relation.
1. Various residential complexes of clay and stone were built
2. by piling rooms and terraces onto one another.
Figure 70 Pre-Columbian Art and Architecture
6.7.11 PURPOSE
The PURPOSE relation is an asymmetric relation. Mann and Thompson
(1988:276) note that “[the reader] recognizes that the activity in [the nucleus] is
initiated in order to realize [the satellite].” If the Subordinate Clause Condition
is satisfied, then the cues given in Figure 71 are tested.
Cue | Heuristic score | Cue name
Clause2 is an infinitival clause. | 5 | H15
Clause2 or one of the ancestors of Clause2 contains a purpose conjunction (in_order_to so_that). | 10 | H16
Figure 71 Cues to the PURPOSE relation
Figure 72 illustrates a nesting of PURPOSE relations. Cue H15 identifies
the PURPOSE relation between Clause1 and Clause2 on the basis of the infinitival
clause to learn the language. Cues H15 and H16 both identify the PURPOSE
relation between Clause2 and Clause3 on the basis of the infinitival clause in
order to translate fairytales and the connective in order to.
1. Ransome left alone for Russia in 1913
2. to learn the language
3. in order to translate fairytales.
Figure 72 Ransome, Arthur Michell
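Cue H16 differs from the other conjunction cues in that it inspects the ancestors of Clause2 as well as Clause2 itself. A minimal sketch of that upward search follows; the Node record with its parent pointer is a hypothetical simplification of the syntactic analysis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical parse node; field names are assumptions for illustration.
    conjunctions: frozenset = frozenset()
    is_infinitival: bool = False
    parent: Optional["Node"] = None


PURPOSE_CONJUNCTIONS = frozenset({"in_order_to", "so_that"})


def purpose_cues(clause2: Node) -> dict:
    """Return the PURPOSE cues that fire for Clause2, mapped to their scores."""
    fired = {}
    if clause2.is_infinitival:
        fired["H15"] = 5
    node = clause2
    while node is not None:  # Clause2 itself, then each ancestor in turn
        if PURPOSE_CONJUNCTIONS & node.conjunctions:
            fired["H16"] = 10
            break
        node = node.parent
    return fired


clause1 = Node()                                      # Ransome left alone for Russia in 1913
clause2 = Node(is_infinitival=True, parent=clause1)   # to learn the language
clause3 = Node(conjunctions=frozenset({"in_order_to"}),
               is_infinitival=True, parent=clause2)   # in order to translate fairytales
```

For Figure 72, clause 2 fires H15 alone, while clause 3 fires both H15 and H16.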
6.7.12 RESULT
As noted in section 6.7.2, RST distinguishes between CAUSE and
RESULT relations. In a RESULT relation, the result is expressed in the satellite,
and the cause in the nucleus. As noted above (section 3.6), I have collapsed the
two relations VOLITIONAL RESULT and NON-VOLITIONAL RESULT originally
proposed by Mann and Thompson (1988) into a single relation, RESULT.
RST does not impose ordering constraints on the constituents of an
asymmetric relation (Mann and Thompson 1988:248). In Encarta, however, the
satellite in a RESULT relation always follows the nucleus.
RASTA uses different cues for the RESULT relation depending on
whether the Subordinate Clause Condition is satisfied. These cues are presented
separately below.
Criteria for the RESULT relation when the Subordinate Clause
Condition is satisfied
The Subordinate Clause Condition defines the necessary criteria that
must be satisfied before the cues to the RESULT relation given in Figure 73 are
tested.
Cue | Heuristic score | Cue name
The head of Clause2 follows the head of Clause1; and Clause2 is a detached –ing participial clause; and if Clause2 is subordinate to a NP, then the parent of that NP must be Clause1. | 15 | H22
Clause2 follows Clause1 and Clause2 contains a result conjunction (as_a_result consequently so…) | 35 | H23
Figure 73 Cues to the RESULT relation
The last part of the conditions of cue H22, “if Clause2 is subordinate to
a NP then the parent of that NP must be Clause1” is intended to resolve a
common misparse in which a detached participial clause is incorrectly
subordinated to a NP, as illustrated in Figure 74.
Figure 74 Misparse of a detached participial clause
As Figure 75 shows, the use of cue H22 enables RASTA to hypothesize a
plausible RESULT relation for the misparsed excerpt in Figure 74.
1. This bold strategy gave them an advantage,
2. creating confusion.
Figure 75 Waterloo, Battle of
Figure 76 illustrates the application of cue H23. The phrase as a result
is correctly identified as a cue to the RESULT relation.
1. Ramsey used two separate magnetic fields;
2. as a result, he achieved vastly increased accuracy in the
measurements.
Figure 76 Ramsey, Norman Foster
In Figure 77, the phrase as a consequence is correctly identified by cue
H23 as a cue to the RESULT relation.
1. Islam arose as a powerful reaction against the ancient pagan
cults of Arabia,
2. and as a consequence it is the most starkly monotheistic of
the three biblically rooted religions.
Figure 77 God
Necessary criteria for the RESULT relation when the Subordinate
Clause Condition is not satisfied
Figure 78 gives the necessary criteria for the RESULT relation when the
Subordinate Clause Condition is not satisfied.
1. Clause1 precedes Clause2.
2. Clause1 is not syntactically subordinate to Clause2.
3. Clause2 is not syntactically subordinate to Clause1.
4. Either the subject or Dsub of Clause2 is a demonstrative
pronoun or is modified by a demonstrative; or the Dsub of
Clause1 and the Dsub of Clause2 are distinct constituents (i.e.
neither one is gapped) and the same lexical item occurs as
the head of the Dsub of Clause1 and the Dsub of Clause2 and
that lexical item is not a pronoun; or Clause1 and Clause2 are
coordinated by a semi-colon.
Figure 78 Necessary criteria for the RESULT relation when the Subordinate
Clause Condition is not satisfied
Criteria 2 and 3 are intended to isolate main clauses. Criterion 4
requires some indication that an asymmetric relation is motivated: either
coordination by means of the semi-colon, or patterns of anaphora, deixis and
referential continuity that correlate strongly with asymmetric relations (section
6.3).
Cues for the RESULT relation when the Subordinate Clause
Condition is not satisfied
If the criteria given in Figure 78 are satisfied and neither an
ELABORATION relation (section 6.7.7) nor a CAUSE relation (section 6.7.2) has
been identified, then it is reasonable to posit a RESULT relation. The RESULT
relation is given an initial score of 5, and the cues given in Figure 79 are tested.
Cue | Heuristic score | Cue name
Clause2 contains a result conjunction (consequently…) | 10 | H32
Clause2 contains the phrase result in, with the verb possibly being inflected. | 10 | H33
Clause2 is not passive, and the predicate of Clause2 has as its head a verb that entails a result (cause make…) | 5 | H34
Figure 79 Cues for the RESULT relation when the Subordinate Clause
Condition is not satisfied
The phrase result in, identified in its various inflected forms by cue
H33, is extremely common in Encarta. Figure 80 illustrates one example.
1. The most frequent cause, however, is chronic abuse of the
vocal apparatus, either by overuse or by improper production
of the voice;
2. this may result in such pathological changes as growths on or
thickening and swelling of the vocal cords.
Figure 80 Speech and Speech Disorders
For cue H34, the requirement that the clause not be passive ensures that
the RESULT relation is correctly distinguished from the CAUSE relation
identified by cue H29a (section 6.7.2). Figure 81 illustrates the application of
cue H34.
1. Propane forms a solid hydrate at low temperatures,
2. and this causes great inconvenience
3. when a blockage occurs in a natural-gas line.
Figure 81 Propane
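The scoring described in this subsection can be sketched in code. The following Python fragment is an illustrative sketch only, not RASTA's implementation. The function name, the word lists, and the use of plain string matching in place of a syntactic analysis are all assumptions, and the sketch presupposes that the necessary criteria of Figure 78 have already been met. It also assumes that the score of each cue that fires is simply added to the initial score of 5.

```python
import re

# Hypothetical word lists: the source abbreviates both lists with "…",
# so only the items it actually names are included here.
RESULT_CONJUNCTIONS = ("consequently",)
RESULT_VERBS = ("cause", "make")


def score_result(clause2_text, head_verb_lemma, is_passive):
    """Return (score, fired_cues) for a hypothesized RESULT relation.

    Assumes the necessary criteria (Figure 78) were checked by the caller.
    """
    score = 5  # initial score for the RESULT relation
    fired = []
    text = clause2_text.lower()

    # H32: Clause2 contains a result conjunction.
    if any(re.search(r"\b" + re.escape(c) + r"\b", text) for c in RESULT_CONJUNCTIONS):
        score += 10
        fired.append("H32")

    # H33: the phrase "result in", with the verb possibly inflected.
    if re.search(r"\bresult(s|ed|ing)?\s+in\b", text):
        score += 10
        fired.append("H33")

    # H34: Clause2 is not passive and its head verb entails a result.
    if not is_passive and head_verb_lemma in RESULT_VERBS:
        score += 5
        fired.append("H34")

    return score, fired
```

Under these assumptions, clause 2 of Figure 80 would fire H33 only, and clause 2 of Figure 81 would fire H34 only.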
6.7.13 SEQUENCE
As Mann and Thompson (1987:74) note, the SEQUENCE relation is
unique among RST relations in that the order of its constituents is important.
Mann and Thompson also note that
“Temporal succession is not the only type of succession for
which the Sequence relation might be appropriate. Others could
include descriptions of a group of cars according to size or cost,
colors of the rainbow, who lives in rows of apartments, etc.”
(1987:74)
In Encarta, however, all instances of the SEQUENCE relation encountered to
date have involved temporal succession. The SEQUENCE relation is used in
Encarta to express a narrative sequence of events. It is therefore not surprising
that many of the criteria proposed below for identifying the SEQUENCE relation
resemble those proposed in the linguistics literature for identifying narrative
clauses (for example, Labov 1972, Reinhart 1984).
Necessary criteria for the SEQUENCE relation
Figure 82 gives the necessary criteria for the SEQUENCE relation.
1. Clause1 precedes Clause2.
2. Clause1 is not syntactically subordinate to Clause2.
3. Clause2 is not syntactically subordinate to Clause1.
4. The subject of Clause2 is not a demonstrative pronoun, nor is
it modified by a demonstrative.
5. Neither Clause1 nor Clause2 has progressive aspect (marked
by the -ing verbal suffix).
6. If either Clause1 or Clause2 has negative polarity, then it
must also have an explicit indication of time.
7. Neither Clause1 nor Clause2 is a Wh question.
8. Neither Clause1 nor Clause2 has an attributive predicate.
9. The event expressed in Clause2 does not temporally precede
the event in Clause1; nor does the event expressed in Clause2
occur within the time span covered by the event expressed in
Clause1.
10. Clause1 and Clause2 match in tense and aspect.
11. Clause2 must not be immediately governed by a contrast
conjunction.
Figure 82 Necessary criteria for the SEQUENCE relation
If the necessary criteria given in Figure 82 are satisfied, it is reasonable
to posit a SEQUENCE relation between two clauses. The necessary criteria are
sufficiently stringent that an initial heuristic score of 20 is associated with this
hypothesized relation. A few of the necessary criteria for the sequence relation
merit special discussion.
Criteria 2 and 3 are intended to bar situations in which one clause is
syntactically dependent on another.
Criterion 4, “The subject of Clause2 is not a demonstrative pronoun,
nor is it modified by a demonstrative”, is intended to block cases in which the
correlations between deixis and discourse structure (section 6.3) would make an
asymmetric relation more likely than the symmetric SEQUENCE relation. For
example, in the following excerpt, a SEQUENCE relation is dispreferred in the
face of a more plausible RESULT relation.
He made a study of the famous Adams family of
Massachusetts, to which he was not related; this study
resulted in “The Adams Family”… (Adams, James
Truslow).
As noted above (this section), the SEQUENCE relation is used in Encarta
to express a narrative sequence of events. Criterion 5, “Neither Clause1 nor
Clause2 has progressive aspect (marked by the –ing verbal suffix)”, is intended
to preclude clauses which are not eventive, as in the following example:
Abbott was willing to admit a number of manufactured
goods from the United States duty-free. (Abbott, Sir John
Joseph Caldwell)
For the most part, clauses with negative polarity do not express events
and therefore cannot enter into the SEQUENCE relation. One notable exception
to this generalization is clauses with negative polarity which also contain an
explicit indication of time (Criterion 6), as illustrated in Figure 84 and Figure
85. (The sentences with negative polarity and an explicit indication of time are
in bold type.)
Figure 83 gives the logical form for clause 2 in Figure 84. RASTA takes
note of the +NEG feature on the main verb make, but does not rule out the
possibility of a SEQUENCE relation between clauses 2 and 3 since there is an
explicit indication of time in the form of the TMEAT attribute in the logical
form, annotated with the features +DATE +YEAR.
Figure 83 Logical form illustrating negative polarity
1. Although AIDS has been tracked since 1981,
2. the identification of HIV as the causative agent was not
made until 1983.
3. In 1985 the first blood test for HIV, developed by the research
group led by Robert Gallo, was approved for use in blood
banks.
Figure 84 Acquired Immune Deficiency Syndrome
1. Born in Paris, Moissan did not begin his formal academic
training
2. until he was in his early 20s…
3. In 1886 Moissan was appointed professor in toxicology at the
École Supérieure de Pharmacie…
4. In 1889 he became professor of inorganic chemistry at this
same institution
5. and the following year succeeded to the chair of inorganic
chemistry at the Faculté des Sciences.
Figure 85 Moissan, Ferdinand-Frederic-Henri
The negative clauses in Figure 84 and Figure 85 entail events which are in a
SEQUENCE relation with other events. The presence of an explicit indication of
time within a negative clause appears to be sufficient to identify this
entailment. Prepositional phrases and subordinate clauses introduced by until or
before are the most common means of explicitly indicating time for clauses
with negative polarity in Encarta.
Neither Wh questions (Criterion 7) nor attributive predicates (Criterion
8; see section 1 concerning attributive predicates) report events. They therefore
cannot participate in SEQUENCE relations. Changes in state, unlike attributive
predicates, can however participate in SEQUENCE relations. Clause 2 in Figure
86, and [Abacha] became a captain in the army in 1967, illustrates a change of
state.
1. Born in Kano, in northern Nigeria, Abacha graduated from the
Nigerian Military Training College in Zaria in 1963,
2. and became a captain in the army in 1967.
Figure 86 Abacha, Sani
Criteria 1 and 9 together constitute the traditional minimal definition of
a narrative (Labov 1972; Reinhart 1984): a narrative sequence is one in which a
series of tensed clauses report a sequence of events, with the linear order of the
clauses expressing the events matching the real-world temporal order of those
events. Criterion 9 is illustrated in Figure 89 below.
The last necessary condition for the SEQUENCE relation is Criterion 11,
“Clause2 must not be immediately governed by a contrast conjunction”. This
criterion is needed to ensure that in a handful of cases a more plausible
CONTRAST relation is selected over a possible SEQUENCE relation, as in the
following example:
At first Buthelezi opposed this system but then decided to
work within it. (Buthelezi, Mangosuthu Gatsha).
In this example, there are several cues suggesting temporal sequence: the
phrase At first, the conjunction then and the eventive clauses. The use of a
strongly cohesive device, the conjunction but, compatible with the CONTRAST
relation, favors an interpretation in which Buthelezi’s position at different times
is being contrasted rather than an interpretation in which events are merely
being cast as temporally ordered.
Additional cues for the SEQUENCE relation
Provided that the necessary criteria are satisfied, the heuristic score
associated with the hypothesized relation may be incremented if any of the
additional cues given in Figure 87 obtain.
Cue | Heuristic score | Cue name
Clause2 contains a SEQUENCE conjunction (and later then…) | 10 | H2
Clause1 and Clause2 are coordinated | 5 | H2b
There is an explicit indication that the event expressed by Clause1 temporally precedes the event expressed by Clause2. | 5 | H3
Figure 87 Cues for the SEQUENCE relation
The presence of a SEQUENCE conjunction (for example, and, later, then)
is not a necessary criterion, although it is a cue which receives a higher
heuristic score than any other single non-necessary cue for the SEQUENCE
relation. Figure 88 illustrates the application of cue H2 to identify the
SEQUENCE conjunction then in clause 2 and the instances of and in clauses 3
and 4. Cue H2b, “Clause1 and Clause2 are coordinated”, also identifies the
instances of and in clauses 3 and 4. This double identification is not redundant,
however. Since RASTA constructs RST trees from the bottom up in a binary-
branching manner (chapter 7), this double identification causes the cohesive
bond between clauses 2, 3 and 4 to be very strong indeed. By assigning a
greater heuristic score to reflect this strongly cohesive bond, RASTA ensures
that during the construction of RST trees, better analyses will be produced
earlier.
1. Napoleon met defeat in 1814 by a coalition of major powers,
notably Prussia, Russia, Great Britain, and Austria.
2. Napoleon was then deposed
3. and exiled to the island of Elba
4. and Louis XVIII was made ruler of France.
Figure 88 Waterloo, Battle of
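The interplay between the necessary criteria of Figure 82 and the additional cues of Figure 87 can be sketched as follows. This Python fragment is a hypothetical illustration, not RASTA's code. The Clause class and all of its fields are invented stand-ins for features that would be read off the syntactic analysis, and the temporal judgments (criterion 9 and cue H3) are passed in as precomputed flags. The sketch assumes that cue scores are added to the initial score of 20.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    # Every field is a hypothetical stand-in for a feature of the parse.
    position: int
    subordinate_to: Optional[int] = None  # position of the governing clause, if any
    demonstrative_subject: bool = False
    progressive: bool = False
    negative: bool = False
    has_time_expression: bool = False
    wh_question: bool = False
    attributive: bool = False
    tense_aspect: str = "past"
    contrast_conjunction: bool = False
    sequence_conjunction: bool = False


def sequence_score(c1, c2, temporally_compatible=True,
                   coordinated=False, explicit_order=False):
    """Return a heuristic score for SEQUENCE, or None if a criterion fails."""
    necessary = [
        c1.position < c2.position,                                    # criterion 1
        c1.subordinate_to != c2.position,                             # criterion 2
        c2.subordinate_to != c1.position,                             # criterion 3
        not c2.demonstrative_subject,                                 # criterion 4
        not (c1.progressive or c2.progressive),                       # criterion 5
        all(c.has_time_expression for c in (c1, c2) if c.negative),   # criterion 6
        not (c1.wh_question or c2.wh_question),                       # criterion 7
        not (c1.attributive or c2.attributive),                       # criterion 8
        temporally_compatible,                                        # criterion 9
        c1.tense_aspect == c2.tense_aspect,                           # criterion 10
        not c2.contrast_conjunction,                                  # criterion 11
    ]
    if not all(necessary):
        return None

    score = 20                      # initial score once the criteria are met
    if c2.sequence_conjunction:
        score += 10                 # cue H2
    if coordinated:
        score += 5                  # cue H2b
    if explicit_order:
        score += 5                  # cue H3
    return score
```

On these assumptions, clauses 2 and 3 of Figure 88 (where both H2 and H2b fire on the conjunction and) would receive 20 + 10 + 5 = 35, whereas the Buthelezi example would be rejected outright by criterion 11.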
Explicit indications of time are of great value in determining whether a
SEQUENCE relation is plausible. Criterion 9, “The event expressed in Clause2
does not temporally precede the event in Clause1; nor does the event expressed
in Clause2 occur within the time span covered by the event expressed in
Clause1” (Figure 82), is intended to exclude cases in which there is clear
counter-evidence, making a SEQUENCE relation unlikely. In Figure 89, for
example, the events described in clauses 2 through 7—conferences being held,
agreements being made, and so on—occur during the 1920s, the timeframe
described in clause 1. RASTA identifies the timeframe of the expression the
1920s by the presence of a definite article with a numeric year, together with
the presence of the plural suffix –s. The timeframe thus identified spans the
first day of 1920 to the last day of 1929. It is a matter of simple math to
determine that the dates 1920 (clause 2), 1921-1922 (clause 4), 1925 (clause 5)
and 1928 (clause 6) fall within this interval.
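The interval check just described can be illustrated with a short Python sketch. It is not RASTA's implementation: the function names and the regular expression are assumptions, and a bare year is treated here as the interval from its first to its last day.

```python
import re
from datetime import date


def decade_span(expression):
    """Map an expression such as 'the 1920s' to (first day, last day).

    Requires the definite article, a numeric year ending in 0, and the
    plural suffix -s; returns None for anything else.
    """
    match = re.fullmatch(r"the (\d{3}0)s", expression.strip().lower())
    if match is None:
        return None
    start_year = int(match.group(1))
    return date(start_year, 1, 1), date(start_year + 9, 12, 31)


def year_within(year, span):
    """True if the whole of the given year lies inside the span."""
    start, end = span
    return start <= date(year, 1, 1) and date(year, 12, 31) <= end
```

For the 1920s this yields the span from 1 January 1920 to 31 December 1929, which contains each of the years 1920, 1921, 1922, 1925 and 1928.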
Clause 1 describes a temporal interval within which the events
described in clauses 2 through 7 of Figure 89 occur, rather than describing an
event that precedes the events of the remaining clauses. RASTA therefore does
not posit a SEQUENCE relation between clause 1 and any of the following
clauses. Rather, clause 1, the topic sentence of this paragraph, is in an
ELABORATION relation with the SEQUENCE node that spans clauses 2 through 7.
Clauses 2 through 7 satisfy criterion 9, since the temporal order of the
events described matches the temporal order of the events in the world and
none of the clauses describes a temporal interval within which the events of any
of the other clauses occurs. Cue H3 identifies the appropriate sequencing of the
temporal expressions in each of the relevant clauses, leading RASTA to posit the
SEQUENCE node depicted in Figure 89.
1. During the 1920s, attempts were made to achieve a stable
peace.
2. The first was the establishment (1920) of the League of Nations
as a forum in which nations could settle their disputes.
3. The league's powers were limited to persuasion and various
levels of moral and economic sanctions that the members were
free to carry out as they saw fit.
4. At the Washington Conference of 1921-22, the principal naval
powers agreed to limit their navies according to a fixed ratio.
5. The Locarno Conference (1925) produced a treaty guarantee of
the German-French boundary and an arbitration agreement
between Germany and Poland.
6. In the Paris Peace Pact (1928), 63 countries, including all the
great powers except the USSR, renounced war as an
instrument of national policy
7. and pledged to resolve all disputes among them “by pacific
means.”
Figure 89 World War II
The identification of temporal expressions is relatively simple in Figure
89, since all references are to years. Very often, however, temporal expressions
in Encarta differ in granularity. In Figure 91, for example, there are two
references of the form month-day-year (clause 1 and clause 2) and one
reference of the form month-day (clause 3). A simple function in RASTA
compares dates, allowing for differing granularities. The steps followed in this
function are illustrated in Figure 90. As soon as the function is able to
determine whether one date precedes another, it terminates. For example, in
comparing the dates February 26, 1815 and March 20, 1815 in Figure 91,
RASTA compares the years and finds that they are the same. It then compares
the months, and finds that March follows February. Having determined this,
RASTA does not need to compare the days (step 3) or the time (step 4).
1. Compare the years.
2. If the years are the same or at least one temporal expression
does not include a year, compare the months.
3. If the months are the same or at least one temporal
expression does not include a month, compare the days.
4. If the days are the same or at least one temporal expression
does not include a day, compare the time of day.
Figure 90 Compare dates
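The steps of Figure 90 can be sketched in Python as follows. This is an illustrative reconstruction rather than the function actually used in RASTA. The TemporalExpression type, its field names, and the representation of the time of day as minutes since midnight are assumptions.

```python
from typing import NamedTuple, Optional


class TemporalExpression(NamedTuple):
    # A missing field (None) means the expression lacks that granularity.
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    time: Optional[int] = None  # hypothetical: minutes since midnight


def compare_dates(a, b):
    """Return -1 if a precedes b, 1 if a follows b, 0 if undetermined."""
    for field in ("year", "month", "day", "time"):
        x, y = getattr(a, field), getattr(b, field)
        if x is None or y is None:
            continue  # one expression lacks this field: go to the next step
        if x != y:
            return -1 if x < y else 1  # decided: stop comparing
    return 0
```

Comparing February 26, 1815 with March 20, 1815 stops at the month step. Comparing March 20, 1815 with March 17 skips the year step, because the second expression has no year, and is decided at the day step.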
1. On February 26, 1815… Napoleon escaped from Elba…
2. and on March 20, 1815, he again ascended the throne.
3. On March 17 Austria, Great Britain, Prussia, and Russia each
agreed…
Figure 91 Waterloo, Battle of
MEG provides robust handling of dates and times expressed in a wide
range of formats. More elaborate processing of date and time information has
not yet proven necessary for the analysis of Encarta. Narrative clauses
containing relative expressions of time, for example two months later, tend also
to contain other cues that enable the correct identification of the SEQUENCE
relation.
7. Constructing Trees
7.1 Introduction
In this chapter I present the complete process by which RASTA
computes representations of the structure of a written text, from positing
discourse relations between clauses to producing and evaluating RST trees. As
noted above (sections 3.6 and 6.1), the general strategies described here for
constructing RST trees given a set of hypothesized relations would not be
affected if a different set of RST relations were used.
7.2 The need for an improved algorithm
The algorithm presented in Marcu (1996, 1997a) represents a
considerable advance in the formalization of a procedure for constructing RST
trees. Still, Marcu’s algorithm has a number of weaknesses:
1. No method is given for positing RST relations for clauses that do not
contain cue phrases.
2. The algorithm suffers from combinatorial explosion–as the number
of relations increases, the number of RST trees produced increases
exponentially (section 4.2). It is an unavoidable fact that the number
of well-formed RST trees increases exponentially as the number of
hypothesized relations increases. However, a great many of the trees
that are produced by Marcu’s algorithm are subsequently rejected as
ill-formed. The algorithm would be greatly improved if ill-formed
trees were not even constructed in the first place.
3. The metric for evaluating trees is specific to genre, working well
only for texts with a right-branching structure.
4. Only binary-branching trees are produced, whereas (with the
exception of Matthiessen and Thompson (1988)) n-ary branching
trees have generally been proposed in the RST literature.
RASTA improves on Marcu’s algorithm in the following ways:
1. RASTA contains an explicit means for positing RST relations based
on an examination of a text that is not wholly dependent on the
presence of cue phrases.
2. RASTA does not produce ill-formed trees. As soon as RASTA detects
that its bottom-up construction of a tree would lead to an ill-formed
tree, it aborts processing for all trees that would have contained the
tree fragment constructed so far. Furthermore, since RASTA
produces preferred RST analyses before dispreferred ones, it is not
usually necessary to compute all possible trees in order to find a tree
that an RST analyst would consider to be the most plausible analysis
of a text.
3. RASTA has a general domain-independent metric for evaluating
trees. Trees can be ranked by summing the heuristic scores of the
relations used to construct them.
4. RASTA produces n-ary branching trees.
7.3 Identify terminal nodes
For each sentence in the data being analyzed, the syntactic constituent
corresponding to each node in the logical form is examined in order to
determine whether it is a terminal node in an RST diagram. Figure 92 gives the
conditions that must be met for a node to be considered to be a terminal node in
an RST diagram.
1. The head of the constituent is a verb or the constituent is an
elliptical clause.
2. The head of the constituent is not an auxiliary.
3. An object complement is only allowed in the deontic have to
construction, for example: The pontiff allowed most of the English
customs, but Henry had to bow to canon law… (Beckett, Thomas a)
4. The constituent is not a subject complement.
5. Parse workarounds: (a) If the parent of the constituent is an NP then
the constituent can only be a terminal node in an RST diagram if it is
a present participial clause.13 (b) Detached participial clauses whose
head is a past participle cannot be terminal nodes.14
6. If the constituent is a complement clause, then it cannot have an NP
or PP as its parent.
7. The constituent cannot be a relative clause.
8. The constituent cannot have a relative clause as one of its ancestors
(in order to avoid undue granularity; see section 3.2).
Figure 92 Criteria for an RST terminal node
13 To resolve a misparse in which a detached participial clause is incorrectly
subordinated to an NP. For example: This bold strategy gave them an advantage, thus creating
confusion. (Waterloo, Battle of, discussed in section 6.7.12).
14 In the following example the clause led by Charles Martel is currently parsed by
MEG as a detached participial clause, rather than as a participial clause subordinate to the NP
the Franks: His army met the Franks, led by Charles Martel, near Tours, France, later that
year. (Abd-ar-Rahman)
Condition 1 allows elliptical clauses as terminal nodes, as in the
following example, discussed in section 6.7.4.
1. Although timid,
2. the aardvark will fight
3. when it cannot flee.
Figure 93 Aardvark
7.4 Posit hypotheses
The result of identifying the terminal RST nodes according to the
criteria given in section 7.3 is a set of terminal nodes. Given this set of terminal
nodes, RASTA examines each pair of clauses in order to determine which
rhetorical relations to posit according to the criteria in section 6.7.
For a set of n clauses, n(n-1) pairs of clauses are examined to see if any
of the thirteen rhetorical relations (section 3.5) ought to be posited. In practice
these inspections can be performed at very little computational expense. For
example, a violation of any one of the necessary criteria for a rhetorical relation
is sufficient grounds to preclude further consideration of that relation.
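The pairwise examination can be sketched in Python as follows. This is a hypothetical outline, not RASTA's code. The function name and the representation of each relation as a list of necessary-criterion predicates plus a scoring function are assumptions. The point of the sketch is that all() stops at the first criterion that fails, so a failing pair costs very little.

```python
from itertools import permutations


def posit_hypotheses(clauses, relations):
    """Examine every ordered pair of clauses against every relation.

    `relations` maps a relation name to (criteria, scorer), where
    `criteria` is a list of predicates over (clause1, clause2) and
    `scorer` returns the heuristic score for a pair that passes.
    """
    hypotheses = []
    for c1, c2 in permutations(clauses, 2):  # n(n-1) ordered pairs
        for name, (criteria, scorer) in relations.items():
            # all() short-circuits: later criteria are never evaluated
            # once one necessary criterion has been violated.
            if all(criterion(c1, c2) for criterion in criteria):
                hypotheses.append((name, c1, c2, scorer(c1, c2)))
    return hypotheses
```

For three clauses, six ordered pairs are examined. A relation whose first criterion is "Clause1 precedes Clause2" has its remaining criteria evaluated for only three of them.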
A set of hypothesized relations results from the pairwise examination of
clauses. Each of these hypotheses is a simple data structure consisting of
attributes and values. Figure 94 illustrates the data structure used to represent a
symmetric relation. The Nodename is an internal designation used within MEG
to refer to this hypothesis. The value of the Nodename attribute is simply the value
of the Pred attribute combined with a unique integer (i.e. the first record of this
type will be called RSTrec1, the second RSTrec2, and so on). The
RelationValue and TreeValue attributes contain the sum of the heuristic scores
of the cues which led to this relation being posited. (In subsequent processing,
the TreeValue attribute will be incremented as nodes are joined together; see
section 7.5.3.) The Relations attribute is a list of the names of relationships
which might hold between these two nodes. (As noted in section 3.6, it is
sometimes the case that several RST relations could equally well be said to hold
between two nodes.) Although the CONTRAST relation is symmetric, i.e. it
contains two nuclei, in the initial representation I distinguish a Nucleus and a
CoNucleus. This distinction is motivated solely by the desire to simplify
processing by creating a structure that is analogous to the Nucleus / Satellite
distinction in Figure 95. During the transformation of a binary to an n-ary tree,
the Nucleus and CoNucleus attributes become members of a list of nuclei. The
values of the Nucleus and CoNucleus attributes are pointers to nodes in the
logical form. Finally the attribute Heurs contains debugging information
concerning which cues were involved in hypothesizing this rhetorical relation
and the heuristic score associated with each cue, in this case cue number 39
with a value of ten and cue number four with a value of 25. The following
excerpt was analyzed by RASTA to produce the hypothesized relation given in
Figure 94.
It is usually placed in the hyena family, Hyaenidae. Some
experts, however, place the aardwolf in a separate family,
Protelidae… (Aardwolf)
Nodename RSTrec1
RelationValue 35
TreeValue 35
Pred RSTrec
Relations (Contrast)
Nucleus place1
CoNucleus place2
Heurs (H39:10 H4:25)
Figure 94 Data structure of a hypothesized symmetrical rhetorical relation
Figure 95 illustrates the data structure used to represent an asymmetric
relation. This data structure differs from the one in Figure 94 in only one
attribute: the attribute Satellite occurs in place of the attribute CoNucleus.
This bold strategy gave them an advantage, creating
confusion. (Waterloo, Battle of)
Nodename RSTrec1
RelationValue 15
TreeValue 15
Pred RSTrec
Relations (Result)
Nucleus give1
Satellite create1
Heurs (H22:15)
Figure 95 Data structure of a hypothesized asymmetric rhetorical relation
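The records in Figure 94 and Figure 95 can be approximated by a small Python class. This is a sketch under stated assumptions, not the representation used inside MEG: the class name, the snake-case field names, the module-level counter, and the choice to store node pointers as strings are all illustrative.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional

_record_ids = count(1)  # hypothetical source of the unique integer in Nodename


@dataclass
class RSTRec:
    relations: List[str]              # names of the relations that might hold
    nucleus: str                      # pointer to a node in the logical form
    heurs: Dict[str, int]             # cue name -> heuristic score
    satellite: Optional[str] = None   # set for an asymmetric relation
    co_nucleus: Optional[str] = None  # set for a symmetric relation
    pred: str = "RSTrec"
    nodename: str = field(init=False)
    relation_value: int = field(init=False)
    tree_value: int = field(init=False)

    def __post_init__(self):
        # Nodename is the Pred value combined with a unique integer.
        self.nodename = f"{self.pred}{next(_record_ids)}"
        # Both values start as the sum of the scores of the cues that fired.
        self.relation_value = sum(self.heurs.values())
        self.tree_value = self.relation_value

    @property
    def symmetric(self):
        return self.co_nucleus is not None
```

A record built from the cues H39:10 and H4:25 would thus start with a RelationValue and TreeValue of 35, as in Figure 94.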
7.5 Construct trees
Given a set of terminal nodes, and a set of relations that have been
hypothesized to hold between those terminal nodes, the task is to construct and
evaluate the possible RST trees. RASTA operates from the bottom up, permuting
the hypothesized relations and gathering terminal nodes into contiguous text
spans.
7.5.1 Promotion sets
Marcu (1996) employs the notion of a promotion set for an RST sub-
tree, similar to the syntactic notion of the head of a constituent. Promotion sets
are used to guide the production of RST trees, constraining the structures that
are produced: A relation can be said to hold between two text spans a and b if
and only if that same relation can be said to hold between the members of the
promotion set of a and the members of the promotion set of b. For a terminal
node, the promotion set consists only of the terminal node itself. For an
asymmetric RST sub-tree, the promotion set consists of a single element, the
nucleus. For a symmetric RST sub-tree, the promotion set consists of the union
of the promotion sets of the co-nuclei. The notion of a promotion set is central
to the algorithm used for constructing RST trees from the bottom up (section
7.5.3). During the production of an RST tree, RASTA observes the constraint
that the same relation must hold between all members of the promotion set of a
and all members of the promotion set of b, and is thus able to avoid the
production of ill-formed trees. The notion of a promotion set is perhaps best
explained by considering the bottom-up construction of the Rhetorical
Structure sub-trees given in Figure 96 and Figure 97, leaving aside for the
moment details about the basis for positing the relations.
Figure 96 depicts a binary-branching tree representing the structure of
an excerpt from the Abd-ar-Rahman article. (This structure would be converted
into an n-ary branching representation, as described in section 7.5.4.) Clauses 2
and 3 are in a CIRCUMSTANCE relation, with clause 2 as the nucleus and clause 3
as the satellite. Since the CIRCUMSTANCE relation is asymmetric, the promotion
set of the text span covering clauses 2 and 3 consists of a single element, the
nucleus, clause 2. Clause 4 is in a SEQUENCE relation with the single member of
the promotion set of the text span covering clauses 2 and 3. This SEQUENCE
relation yields the text span covering clauses 2 through 4. Since the SEQUENCE
relation is symmetric, the promotion set of this text span is equal to the union
of the promotion sets of the two co-nuclei, i.e. {2, 4}. Since clause 1 is in a
SEQUENCE relation with both members of the promotion set of the text span
covering clauses 2 through 4 (i.e. clause 1 must be in a SEQUENCE relation with
clause 2 and with clause 4), clause 1 can be said to be in a SEQUENCE relation
with the text span covering clauses 2 through 4. The promotion set of this new
text span covering clauses 1 through 4 is the union of the promotion set of
clause 1 (the terminal node itself) and the promotion set of the text span
covering clauses 2 through 4 ({2, 4}), i.e. {1, 2, 4}.
1. He became governor of southern France in 721.
2. In 732, … , he led an army across the Pyrenees Mountains
into the dominions of the Franks.
3. when the growth of Frankish power menaced the Muslim
position in Spain
4. His army met the Franks,…
Figure 96 Binary-branching tree for Abd-ar-Rahman excerpt
In Figure 97, the ELABORATION relation must hold between the
promotion set of clause 1 (the terminal node itself) and both members of the
promotion set {2, 3}, i.e. there must be an ELABORATION relation between
clauses 1 and 2 and between clauses 1 and 3. Since the ELABORATION relation
is asymmetric, the promotion set of the resulting text span consists of a single
element, clause 1.
1. The aardwolf is classified as Proteles cristatus
2. It is usually placed in the hyena family, Hyaenidae
3. Some experts, however, place the aardwolf in a separate
family, Protelidae…
Figure 97 Binary-branching tree for Aardwolf excerpt
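The computation of promotion sets, and the constraint they impose on joining text spans, can be sketched in Python. The Span class, the function names, and the representation of hypotheses as (relation, clause, clause) tuples are assumptions made for illustration. Where the nucleus of an asymmetric sub-tree is itself a sub-tree, this sketch promotes the promotion set of that nucleus, an assumption that goes beyond the terminal-nucleus cases discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    relation: str
    symmetric: bool
    nucleus: "Node"  # nucleus, or first co-nucleus if symmetric
    other: "Node"    # satellite, or second co-nucleus if symmetric


# A terminal node is represented simply by its clause number (an int).


def promotion_set(node):
    if isinstance(node, int):
        return frozenset({node})           # terminal node: itself
    if node.symmetric:
        # union of the promotion sets of the co-nuclei
        return promotion_set(node.nucleus) | promotion_set(node.other)
    return promotion_set(node.nucleus)     # asymmetric: nucleus only


def relation_holds(relation, a, b, hypotheses):
    """True if `relation` was hypothesized between every member of the
    promotion set of a and every member of the promotion set of b."""
    return all((relation, x, y) in hypotheses
               for x in promotion_set(a)
               for y in promotion_set(b))
```

For Figure 96, the span over clauses 2 through 4 has the promotion set {2, 4}, so clause 1 can be joined to it by SEQUENCE only if SEQUENCE was hypothesized between clauses 1 and 2 and between clauses 1 and 4.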
7.5.2 Group mutually exclusive hypotheses
RASTA often posits more than one relation between two terminal nodes.
Relations of the same type, i.e. symmetric or asymmetric, are merged into an
underspecified representation (section 3.6) if they have the same heuristic
score. In that case, what is represented is whether the relation is symmetric or
asymmetric, the heuristic score associated with that relationship and a list of
possible labels. Figure 98 illustrates the data structure used to represent a case
where two asymmetric relations, RESULT and ELABORATION were posited for
the same two terminal nodes. The Relations attribute of one hypothesized
relation contained the value RESULT, and the Relations attribute of the other
hypothesized relation contained the value ELABORATION. In merging the two
relations into a single underspecified relation, a new Relations attribute was
constructed, containing the union of the Relations attributes of the two
relations: {RESULT, ELABORATION}.
Nodename RSTrec1
RelationValue 15
TreeValue 15
Pred RSTrec
Relations (Result Elaboration)
Nucleus give1
Satellite create1
Heurs (H22:15)
Figure 98 Data structure of an underspecified asymmetric rhetorical
relation
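The merging step can be sketched in Python as follows. This is a hypothetical illustration, not RASTA's code. Each hypothesis is represented as a dictionary whose keys (relations, symmetric, nucleus, other, score) are invented for the example, with "other" standing for the satellite or co-nucleus.

```python
def merge_underspecified(hypotheses):
    """Merge hypotheses that link the same two nodes, are of the same
    type (symmetric or asymmetric) and carry the same heuristic score."""
    merged = {}
    for hyp in hypotheses:
        key = (hyp["symmetric"], hyp["nucleus"], hyp["other"], hyp["score"])
        if key not in merged:
            # Copy so that the caller's records are left untouched.
            merged[key] = {**hyp, "relations": list(hyp["relations"])}
            continue
        # Union of the Relations attributes, preserving first-seen order.
        for label in hyp["relations"]:
            if label not in merged[key]["relations"]:
                merged[key]["relations"].append(label)
    return list(merged.values())
```

Two asymmetric hypotheses between give1 and create1, both scored 15 and labelled Result and Elaboration respectively, would be merged into one record whose relations list contains both labels. A hypothesis with a different score would be left as a separate record.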
Even after appropriate relations have been merged into an
underspecified representation, there are likely to be mutually exclusive
relations in the set of relations used to construct RST trees. If relations are
successively applied in the construction of RST trees, then applying mutually
exclusive relations will clearly lead to wasted computational effort in the search
for possible RST trees. Mutually exclusive relations linking the same two
terminal nodes are therefore grouped together into sets called “bags”. For
example, if a SEQUENCE relation is applied to clauses a and b, then an
ELABORATION relation linking a and b cannot subsequently be applied. If there
is only one hypothesized relation linking two clauses, then a bag is created
consisting solely of that relation.
An important goal of RASTA is to produce preferred RST trees before
dispreferred ones by using the highest scoring hypotheses first. RASTA
therefore sorts the relations within each bag according to heuristic score.
Finally, a list of bags is produced, with each bag occurring in sorted order
according to the value of the highest ranking relation within it.
7.5.3 Produce and rank binary-branching trees
In the outline of the algorithm below, I refer to the following variables:
SUBTREES: a list of the RST subtrees constructed so far. Since RASTA only
produces well-formed trees, all members of this list are guaranteed to be
well-formed trees. Initially, SUBTREES contains a list of the RST terminal
nodes. As nodes corresponding to contiguous text spans are grouped
together to form larger text spans, SUBTREES contains fewer and fewer
members. When SUBTREES contains a single member, RASTA has succeeded
in constructing a complete RST tree to represent the text. If RASTA
unsuccessfully applies all posited hypotheses in an effort to construct a tree,
then SUBTREES will contain more than a single element.
HYPOTHESES: a list of bags of hypotheses. Hypotheses within bags are
sorted according to heuristic score. Bags are initially sorted according to the
heuristic score of the first element.
ALLHYPOTHESES: an unordered list of all the hypotheses posited.
At the most abstract level, the algorithm can be described as follows:
Construct all RST trees compatible with the set of the
hypotheses by gathering up the text into contiguous text
spans. Store each unique analysis that covers the entire text.
Figure 99 gives a pseudo-code description of a function
CONSTRUCTTREE that constructs binary-branching RST trees. To aid the reader,
comments occur in italics following two forward slashes.
If allowed to run to completion, CONSTRUCTTREE would create all
possible well-formed RST trees that are compatible with the hypothesized
discourse relations. As actually implemented, however, the researcher specifies
a desired number of trees—usually ten or twenty. CONSTRUCTTREE then
produces either the stipulated number of trees or all possible trees, whichever is
the smaller number. Since the algorithm produces better trees first, it is usually
not necessary to produce many trees before an analysis is produced that an RST
analyst would consider to be plausible.
The recursive, back-tracking nature of CONSTRUCTTREE prevents the
construction of a great number of ill-formed trees. For example, consider an
imaginary set of five RST hypotheses, R1… R5, where applying R2 after R1
results in an invalid tree. Rather than attempting to construct RST trees by
testing all permutations of these five hypotheses and then examining the trees
only to discover that trees formed by applying {R1 R2 R3 R4 R5} or {R1 R2 R3 R5
R4} and so on were invalid, CONSTRUCTTREE applies R1, then R2. It
immediately determines that an ill-formed subtree results, and so does not
bother to complete the construction of any trees that would follow from those
first two steps. A total of six trees are thus not even produced, resulting in
considerable gains in efficiency.
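The figure of six can be checked by brute force. The short Python sketch below simply enumerates the orderings of five hypothetical hypotheses and counts those that begin with R1 followed by R2; it illustrates the arithmetic only and says nothing about how CONSTRUCTTREE is implemented.

```python
from itertools import permutations

hypotheses = ["R1", "R2", "R3", "R4", "R5"]

# Every order in which the five hypotheses could be applied.
all_orders = list(permutations(hypotheses))

# Orders that start with R1 then R2; back-tracking abandons all of these
# as soon as the ill-formed subtree is detected.
pruned = [order for order in all_orders if order[:2] == ("R1", "R2")]

print(len(all_orders), len(pruned))
```

The remaining three hypotheses can follow R1 and R2 in 3! = 6 orders, which is the number of trees that are never completed.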
Function ConstructTree (HYPOTHESES, SUBTREES)
Begin Function ConstructTree
  Let COPYHYPOTHESES be equal to a copy of the list HYPOTHESES.
  If the desired number of trees has been constructed
    Return.
  Else If SUBTREES has only one element:
    If this RST tree is not identical to one that has already been stored, then store it.
    Return.
  Else If COPYHYPOTHESES contains at least one element and SUBTREES has more than one element Then
    For each bag in COPYHYPOTHESES
      Let ONEBAG denote the current bag.
      Let REMAININGBAGS be equal to COPYHYPOTHESES except the current bag.
      If projections of elements in SUBTREES match the nucleus and other element (satellite or co-nucleus) specified by the hypothesized relations in ONEBAG, then
        For each hypothesis in ONEBAG, going from the highest scored hypothesis to the lowest scored:
          Let ONEELEMENT denote the current hypothesis.
          1. Search in SUBTREES for elements with the promotions specified by ONEELEMENT.
          2. Let NUC be the subtree whose promotion set includes the nucleus specified by ONEELEMENT.
          3. Let OTHER be the subtree whose promotion set includes the other member (a satellite or a co-nucleus) specified by ONEELEMENT.
          4. In ALLHYPOTHESES, there must be a hypothesized relation between every member of the promotion set of NUC and every member of the promotion set of OTHER. The relation must be the same as the one specified by ONEELEMENT.
          If (4) is true // begin processing the subtrees whose promotions satisfy ONEELEMENT.
            If combining the subtrees would result in an RST tree with crossing lines, then return.
            Let REMAININGRSTSUBTREES equal SUBTREES.
            Remove NUC and OTHER from REMAININGRSTSUBTREES.
            Create a new subtree by joining NUC and OTHER as specified by ONEELEMENT.
            Set the RelationValue attribute of this new subtree equal to the heuristic score of the hypothesis used to join these two nodes.
            Set the TreeValue attribute of this new subtree equal to the heuristic score of the hypothesis used to join these two nodes plus the TreeValue of NUC plus the TreeValue of OTHER.
            Add this new subtree as the first element of REMAININGRSTSUBTREES.
            ConstructTree (REMAININGBAGS, REMAININGRSTSUBTREES).
            If the desired number of trees have been constructed, then return.
          End If // End processing the subtrees whose promotions satisfy ONEELEMENT.
        Do the next element in this bag until there are no elements left to do.
      Else // the projections are not found; this bag can therefore not apply in any subsequent permutation
        Remove ONEBAG from COPYHYPOTHESES
      End If
    Do the next bag in COPYHYPOTHESES until there are no bags left to do. // End the processing of the remaining bags.
  Else // HYPOTHESES is empty.
    Return.
  End If
End Function ConstructTree
Figure 99 Pseudo-code for constructing RST trees
The trees produced by CONSTRUCTTREE are stored in a list. The
TreeValue attribute of the root node of each tree can be used to evaluate a tree;
since the TreeValue attribute is determined by adding the heuristic scores of the
relations used to construct the tree, a tree constructed by using relations with
high heuristic scores will have a greater TreeValue than a tree constructed by
using relations with low heuristic scores. Ideally, CONSTRUCTTREE ought to
produce highly ranked trees before low ranked ones. Unfortunately,
CONSTRUCTTREE occasionally produces trees out of sequence. To correct this
anomalous situation, the list of trees produced by CONSTRUCTTREE is sorted
according to the TreeValue attribute of the root node of each tree, to ensure that
a tree judged by an RST analyst to be the preferred analysis for the text occurs
as the top ranked tree, with alternative plausible analyses also occurring near
the top of the sorted list.
Why then does CONSTRUCTTREE occasionally produce trees out of
sequence? Consider the following hypothetical example of seven relations
grouped into three bags: bag A, containing the three relations {a1 a2 a3}, bag B,
containing {b1 b2}, and bag C, containing {c1 c2}. These bags are illustrated in
Table 1. Heuristic scores associated with the relations are given in parentheses.
A          B          C
a1 (15)    b1 (10)    c1 (5)
a2 (7)     b2 (5)     c2 (3)
a3 (2)
Table 1 Three bags of hypotheses
Given these hypothesized relations, RASTA would first order the bags into the
list {A, B, C}, and would then apply relations from each bag as illustrated in
Table 2.
Iteration  Hypotheses
1          a1 (15)    b1 (10)    c1 (5)
2          a1 (15)    b1 (10)    c2 (3)
3          a1 (15)    b2 (5)     c1 (5)
4          a1 (15)    b2 (5)     c2 (3)
5          a2 (7)     b1 (10)    c1 (5)
6          a2 (7)     b1 (10)    c2 (3)
7          a2 (7)     b2 (5)     c1 (5)
8          a2 (7)     b2 (5)     c2 (3)
…          …          …          …
Table 2 Permutations of hypotheses
As Table 2 shows, during the fifth and sixth iterations, CONSTRUCTTREE
applies relation a2, whose heuristic score is 7, before relation b1, whose
heuristic score is 10. Applying the relations in this order could lead to an RST
tree with a lower overall score than one that would be produced by applying
relation b1 before relation a2. This might suggest that a more elaborate method
is needed for permuting the relations in order to ensure that trees are never
produced out of sequence. In practice, however, producing trees by applying
hypotheses out of sequence does not create problems, for the following reasons:
1. Applying the hypotheses out of sequence often does not lead to
well-formed RST trees in any case.
2. The final list of valid RST trees is sorted according to the TreeValue
attribute of the root node, thus ensuring that the trees are correctly
ordered.
3. The primary focus of this research is on the top-ranked RST tree.
The exact ordering of other trees is not of paramount importance.
The heuristic scores of the hypothesized relations are continually
being adjusted (section 7.5.5) to ensure that the preferred analysis is
produced within the desired number of trees. If the preferred
analysis occurs within the desired number of trees and the optimal
heuristic scores are used, the preferred analysis will percolate to the
top of the list during the sorting phase mentioned in (2).
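The enumeration in Table 2 and the final sorting step can be imitated with a short Python sketch. It rests on two simplifying assumptions that are not claims about RASTA: every combination of one hypothesis per bag is treated as if it yielded a well-formed tree (the text notes that this is often not so), and the sum of the heuristic scores stands in for the TreeValue of the root node. The variable names are hypothetical.

```python
from itertools import product

# The three bags of Table 1; each hypothesis is a (name, heuristic score) pair.
bag_a = [("a1", 15), ("a2", 7), ("a3", 2)]
bag_b = [("b1", 10), ("b2", 5)]
bag_c = [("c1", 5), ("c2", 3)]


def total(combo):
    """Summed heuristic scores, standing in for the root TreeValue."""
    return sum(score for _, score in combo)


# product() varies the last bag fastest, which reproduces the row order of Table 2.
combos = list(product(bag_a, bag_b, bag_c))
totals = [total(c) for c in combos]

# Sorting afterwards restores the ranking regardless of production order.
ranked = sorted(combos, key=total, reverse=True)

print(totals)
print([name for name, _ in ranked[0]])
```

In this toy enumeration the ninth combination (a3, b1, c1) has a higher total than the eighth (a2, b2, c2), so the production order is not monotonically decreasing in score, and the sort in reason (2) is what guarantees the final ranking.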
7.5.4 Produce n-ary branching trees
The trees produced by CONSTRUCTTREE (section 7.5.3) are binary-
branching, whereas (with the exception of Matthiessen and Thompson (1988))
n-ary branching trees have generally been proposed in the RST literature. The
fact that n-ary branching trees are preferred over binary-branching ones in the
RST literature is not the only motivation for producing n-ary branching trees –
preliminary experiments suggest that n-ary branching trees are more useful for
producing summaries of texts by a method that prunes satellites in an RST tree.
For research purposes, RASTA produces as many binary-branching trees as
desired, and then transforms the top ranked tree into an n-ary branching tree.
The n-ary branching tree is derived from a binary-branching tree by means of a
simple tree traversal.
The derivation of an n-ary branching tree from a binary-branching one
relies crucially on the notion of nuclearity (section 4.2). For example, if two
text spans a and b are in a symmetric relation R to form an RST node N, then
the promotion set of the resulting node consists of the union of the promotion
set of a and the promotion set of b (section 7.5.1). A node c can only be in
relation R to the node N created by a and b if the relation R can be plausibly
hypothesized to hold between all members of the promotion set of N and all
members of the promotion set of c. This then amounts to saying that relation R
holds between any two members of the set that results from taking the union of
the promotion sets of a, b, and c. This is illustrated in Figure 100. A structure
with the form illustrated by Figure 100 (a) will be converted into a structure of
the form illustrated by Figure 100 (b), a structure like Figure 100 (c) will be
converted into a structure like Figure 100 (d) and a structure like Figure 100 (e)
will be converted into a structure like Figure 100 (f).
Figure 100 Corresponding binary and n-ary branching symmetric RST
trees
The case of n-ary branching asymmetric relations is similar to that of
symmetric n-ary branching relations, except that binary-branching asymmetric
relations can be transformed into n-ary branching ones irrespective of the
relation that holds between the nucleus and the satellite. Figure 101 illustrates
transformations of binary-branching asymmetric RST trees into n-ary branching
RST trees. In Figure 101 (a), for example, node 1 is in a Circumstance relation
with the text span covering nodes 2 through 3 if and only if node 1 is in a
Circumstance relation with node 3, since node 3 is the single member of the
promotion set of the text span covering nodes 2 through 3. Since both node 1 and
node 2 are in a dependency relationship to node 3, the
binary-branching tree can be transformed into an n-ary branching structure like
that in Figure 101 (b). Similarly, the structure represented in Figure 101 (c) can
be transformed into the n-ary branching structure represented in Figure 101 (d).
Figure 101 Corresponding binary and n-ary branching asymmetric RST
trees
Figure 102 illustrates the transformation of a complex tree involving
symmetric and asymmetric relations.
Figure 102 Corresponding binary and n-ary branching complex RST trees
The function BinaryToNaryTree performs a depth-first traversal of an
RST tree, converting binary-branching structures to n-ary branching ones as it
returns to the root node. The pseudo-code for BinaryToNaryTree is given in
Figure 103.
Function BinaryToNaryTree (CURRENTNODE)
Begin Function BinaryToNaryTree
  If CURRENTNODE is not a terminal node
    BinaryToNaryTree (Nucleus(CURRENTNODE)) // Process the subtree inside the nucleus
    If CURRENTNODE has a co-nucleus
      BinaryToNaryTree (CoNucleus). // Process the subtree inside the co-nucleus
      If CURRENTNODE is the root of a structure like those illustrated in Figure 100 (a), (c) or (e) then
        Transform the structure into its n-ary branching counterpart.
      End If
    Else If CURRENTNODE has a satellite
      BinaryToNaryTree (Satellite); // Process the subtree inside the satellite.
      If CURRENTNODE is the root of a structure like the one illustrated in Figure 101 (a), i.e. a satellite modifying a subtree that contains a satellite modifying a subtree, then
        Transform the structure into its n-ary branching counterpart.
      End If
    End If // Does CURRENTNODE have a co-nucleus or a satellite?
  Else If CURRENTNODE is a terminal node
    Return.
  End If
End Function BinaryToNaryTree
Figure 103 Pseudo-code for the function BinaryToNaryTree
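A minimal Python sketch of such a traversal follows. It is an illustration under stated assumptions rather than a transcription of RASTA: the `Node` class and its fields are hypothetical, the tree shapes are reconstructed from the prose descriptions of Figure 100 and Figure 101, and nested symmetric nodes bearing the same relation name are assumed to be flattenable because the binary tree was built under the promotion-set condition of section 7.5.3. The ELABORATION label on the inner satellite in the example is invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                      # clause id, relation name, or "SPAN"
    nuclei: list = field(default_factory=list)      # one nucleus, or several co-nuclei
    satellites: list = field(default_factory=list)  # (relation name, Node) pairs


def binary_to_nary(node):
    """Depth-first traversal that flattens structures on the way back to the root."""
    if not node.nuclei:                             # terminal node
        return node
    node.nuclei = [binary_to_nary(n) for n in node.nuclei]
    node.satellites = [(rel, binary_to_nary(s)) for rel, s in node.satellites]
    if len(node.nuclei) > 1:
        # Symmetric case: splice in co-nuclei of a nested node with the same relation.
        flat = []
        for n in node.nuclei:
            same = len(n.nuclei) > 1 and n.label == node.label and not n.satellites
            flat.extend(n.nuclei if same else [n])
        node.nuclei = flat
    else:
        # Asymmetric case: a satellite modifying a subtree that itself has a satellite.
        inner = node.nuclei[0]
        if len(inner.nuclei) == 1 and inner.satellites:
            node.nuclei = inner.nuclei
            node.satellites = inner.satellites + node.satellites
    return node


# Symmetric example: SEQUENCE(SEQUENCE(a, b), c) becomes SEQUENCE(a, b, c).
sym = binary_to_nary(Node("SEQUENCE", [Node("SEQUENCE", [Node("a"), Node("b")]), Node("c")]))

# Asymmetric example in the spirit of Figure 101 (a): node 1 modifies the span 2-3.
asym = binary_to_nary(Node(
    "SPAN",
    [Node("SPAN", [Node("3")], [("ELABORATION", Node("2"))])],
    [("CIRCUMSTANCE", Node("1"))],
))

print([n.label for n in sym.nuclei])
print(asym.nuclei[0].label, [(rel, s.label) for rel, s in asym.satellites])
```

Keeping the relation name on each satellite, rather than on the dominating node, is one way to let asymmetric structures be flattened irrespective of the relations involved, as the text describes.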
7.5.5 Learning the heuristic scores
The heuristic scores presented in this study were derived by trial and
modification. The initial values used were based on the author’s intuitions as a
linguist. For example, conjunctions are extremely good discriminators of
particular discourse relations, whereas tense and aspect are weaker
discriminators. These initial heuristic scores were then modified to ensure that
preferred trees occurred at the top of the ranked list of RST trees.
In the course of my research I have been constructing a regression test
set. This is a file containing excerpts from Encarta together with their preferred
RST analyses. New data necessitate changes in RASTA. These new changes can
always be checked to determine whether they would prevent RASTA from
producing the preferred analyses in the regression test set. Very often, new data
suggest a new cue to a discourse relation, or a modification to an existing cue.
Associated with the new or modified cue is a heuristic score, which must be
adjusted until RASTA produces preferred analyses for both the new data and the
members of the regression test set.
Researchers developing grammars have access to annotated corpora
such as the Penn Treebank (Marcus et al. 1993). Such sources provide
externally verified analyses of part of speech and constituency, and are
invaluable for those desiring to evaluate grammars or to train grammars that
involve machine learning or a statistical component. Given a similar corpus of
texts annotated with RST analyses, it ought to be possible to automatically learn
the optimal values for the heuristic scores of the discourse cues. Unfortunately,
no widely available corpora of RST-analyzed texts exist. Hand-tuning in order
to determine the optimal heuristic scores is therefore still necessary.
Although the heuristic scores given in this dissertation suffice to
produce the desired RST analyses, it is possible that the actual scores are not
optimal. Since the heuristic scores guide RASTA to produce better trees first, it
might be possible to find a different set of heuristic scores that causes RASTA to
produce the same preferred analyses but to do so more quickly. An automated
learning algorithm could therefore test different heuristic scores for each cue
against the current regression files in order to determine whether a better set of
scores exists than the one currently in use. Since the space to be searched for a
better set of heuristic scores is so large (for some fifty heuristic scores with
possible values in the range 1-50, there are 50^50 possible vectors to be tested), I
first measured the performance of the current set of heuristic scores. For the
tree that eventually emerged as RASTA’s number one choice, a note was made
of the order in which that tree had originally been produced, for example, the
eventual number one ranked tree was actually the third tree produced. Figure
104 gives the results for one regression test set, consisting of 59 excerpts. The
NthTree row gives the order in which the tree eventually ranked as number one
was produced. The Total Trees row gives the corresponding total number of
well-formed RST trees constructed for the excerpt. RASTA was instructed to
produce up to one thousand trees for each excerpt. For example, the first
excerpt in the regression set yielded twelve RST trees. The fifth tree produced
was the one with the highest overall score, and so when the trees were sorted it
ended up as the number one ranked tree.
NthTree       5  1  1  1  1  1  3  1  1  1  1  3  3  1  1  1  1  1  1 34
Total Trees  12  1  1  1  1  1  3  1  1  1  1  4  3  1  1  1  1  1  1 73

NthTree       1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  2  1  9
Total Trees   1  1  1  1  1  1  1  2  1  2  1  1  1  1  1  1  1  3  2 14

NthTree       1  1  1  1  3  1  1  1  1  1  1  1  1  1  1  1  1  1  1
Total Trees   1  1  1  2 12  1  1  1  1  1  1  3  1  1  1  1  1  1  1
Figure 104 Rankings of RST trees
For this regression set, I calculated some simple measures of the
performance of RASTA. The average value of the NthTree attribute was 1.92,
with a standard deviation of 4.39. With the exception of a single pathological
case (NthTree = 34), the number one ranked tree was produced within the first
nine trees.
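These figures can be recomputed directly from the values in Figure 104. The Python sketch below transcribes the two rows of the figure and uses the population standard deviation, which is an assumption on my part: it is the variant that reproduces the reported value of 4.39. The last lines illustrate the size of the search space mentioned above.

```python
from statistics import mean, pstdev

# NthTree and Total Trees rows transcribed from Figure 104 (59 excerpts).
nth_tree = [
    5, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 3, 3, 1, 1, 1, 1, 1, 1, 34,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 9,
    1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
]
total_trees = [
    12, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 4, 3, 1, 1, 1, 1, 1, 1, 73,
    1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 3, 2, 14,
    1, 1, 1, 2, 12, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1,
]

average = round(mean(nth_tree), 2)
deviation = round(pstdev(nth_tree), 2)     # population standard deviation
worst_typical = max(n for n in nth_tree if n != 34)

# Fifty scores, each with fifty possible values.
search_space = 50 ** 50

print(average, deviation, worst_typical)
print(len(str(search_space)), "digits in the number of candidate score vectors")
```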
The most common result of this experiment was that there was only a
single tree possible. Producing the number one ranked tree first when there is
only one possible tree might not appear to be an interesting result. In fact, this
result represents a ringing endorsement of RASTA. The number of possible trees
grows in proportion to the number of relations that RASTA hypothesizes. One
research goal is therefore to impose stringent conditions so that relations are
hypothesized where appropriate but not hypothesized too liberally, i.e. the
search for plausible analyses is often best constrained at the very earliest stages
of processing, when relations between nodes are being hypothesized.
Although there might exist a set of heuristic scores that would cause
RASTA to produce the same number one ranked trees but to do so more quickly,
the search for those numbers, even by automated means, is likely to produce
only marginal improvements in the system. This is perhaps not surprising. As a
linguist identifying cues to discourse structure, the author is able to rely on
intuitions as both a linguist and a native speaker of English to determine likely
ranges of values for the heuristic score associated with a cue. For example,
conjunctions associated with an RST relation are intuitively strong indicators.
We can therefore test a high initial value, for example 25, and determine the
effects. In some cases, the heuristic score of the new hypothesis or the scores of
existing hypotheses might have to be modified in order to achieve the preferred
analysis for the excerpt that motivated the new cue. The new cue and its
associated heuristic score are then tested to ensure that the interaction of the
new cue with other cues in the system does not cause any texts that were
previously analyzed correctly to now be analyzed incorrectly, i.e. to ensure that
no texts in the regression set are affected. While the search space within which
an ideal set of heuristic scores might be located is incredibly large, in practice
the space to be searched during the development of RASTA is extremely small,
since the space is constrained by the linguist’s intuitions. Since the automated
search for an optimal set of heuristic scores promises only marginal
improvements, it is perhaps best left for future research, once a great number of
RST analyses have been constructed and verified as appropriate for training a
learning algorithm.
7.6 Worked example
Let us now turn to a close examination of the operation of RASTA by means of
a worked example. The text in Figure 105 forms the basis of the worked
example.
1. The aardwolf is classified as Proteles cristatus.
2. It is usually placed in the hyena family, Hyaenidae.
3. Some experts, however, place the aardwolf in a separate
family, Protelidae, because of certain anatomical differences
between the aardwolf and the hyena.
4. For example, the aardwolf has five toes on its forefeet,
5. whereas the hyena has four.
Figure 105 Aardwolf
The syntactic analyses and logical forms produced for the sentences in
this excerpt are given in Figure 106 to Figure 109.
Figure 106 Analysis of the first sentence
Figure 107 Analysis of the second sentence
Figure 108 Analysis of the third sentence
Figure 109 Analysis of the fourth sentence
The node labelled DUMMY in the logical form given in Figure 109
represents the unresolved elliptical head of an NP. Verb phrase anaphora is
handled well in MEG; NP anaphora however is still under development.
These syntactic parses contain a few minor errors. In Figure 108, for
example, the phrase because of certain anatomical differences between the
aardwolf and the hyena ought to be dependent on the VP whose head is the
verb place. In Figure 107, the logical form label MANR is not correct. Finally,
the classification Proteles cristatus ought to have been identified as a single
constituent, namely a noun-noun compound. Despite these minor errors in the
syntax and logical form, RASTA is still able to posit plausible representations of
the structure of this excerpt.
Given these parses and logical forms, RASTA is able to identify the five
clauses given in Figure 105 as being terminal nodes in an RST analysis. RASTA
then examines all pairs of clauses in this excerpt to produce the hypothesized
discourse relations given in Figure 111. These hypothesized relations are then
grouped into bags of mutually exclusive relations. Relations within bags are
ranked in descending order of their heuristic score. Finally, the bags are ranked
in descending order according to the heuristic score of their initial elements.
For this example, the hypothesized relations 2 and 3, which concern relations
joining clauses 1 and 3, are grouped together into a single bag. Other pairs of
clauses yielded only a single hypothesized relation or no hypothesized
relations. The bags are given in Figure 110. Note that the relative order of Bag1
and Bag2 is arbitrary, since both bags contain a single hypothesized relation
with a score of 35.
Bag # Relation number (from Figure 111) and score
1 4: Score = 35
2 5: Score = 35
3 6: Score = 30
4 1: Score = 27
5 2: Score = 25; 3: Score = 20
Figure 110 Bags for the excerpt
#  Name                Clauses  Total  Cues and bases for cues
1  ELABORATION         1, 2     27     H25a: Usually in Clause2.
                                       H25: The clauses are not coordinated and they
                                       exhibit subject continuity since it is
                                       coreferential with The aardwolf.
2  CONTRAST            1, 3     25     H4: However in Clause3.
3  ELABORATION         1, 3     20     H38: The syntactic subject of Clause3 is
                                       modified by some.
                                       H25: Clause1 is passive and the Dobj of
                                       Clause1 has the same head as the Dobj of
                                       Clause3 (aardwolf).
4  CONTRAST            2, 3     35     H39: The two clauses have the same main verb.
                                       H4: Clause3 contains however.
5  ELABORATION         3, 4     35     H24: Clause4 contains for example and is the
                                       sentence immediately following Clause3.
6  ASYMMETRICCONTRAST  4, 5     30     H20: Clause5 contains whereas.
Figure 111 Hypothesized relations for the excerpt
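The grouping and ranking that yield the bags in Figure 110 can be reproduced from the hypotheses in Figure 111 with a few lines of Python. The tuple layout and the function name `make_bags` are hypothetical. The sketch relies on Python's stable sort to break the tie between the two bags scoring 35; as noted above, that relative order is arbitrary.

```python
# (relation number, relation name, clause pair, total score), from Figure 111.
hypotheses = [
    (1, "ELABORATION", (1, 2), 27),
    (2, "CONTRAST", (1, 3), 25),
    (3, "ELABORATION", (1, 3), 20),
    (4, "CONTRAST", (2, 3), 35),
    (5, "ELABORATION", (3, 4), 35),
    (6, "ASYMMETRICCONTRAST", (4, 5), 30),
]


def make_bags(hyps):
    """Group hypotheses on the same clause pair, then rank within and across bags."""
    by_pair = {}
    for h in hyps:
        by_pair.setdefault(h[2], []).append(h)
    # Within each bag, the highest heuristic score comes first.
    bags = [sorted(bag, key=lambda h: h[3], reverse=True) for bag in by_pair.values()]
    # Bags are ranked by the score of their first (highest scoring) element.
    return sorted(bags, key=lambda bag: bag[0][3], reverse=True)


bags = make_bags(hypotheses)
print([[h[0] for h in bag] for bag in bags])
```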
RASTA occasionally makes reference to the original list of hypothesized
relations. This original list is called ORIGINALHYPOTHS in the discussion
below.
Each of the clauses identified as a terminal node initially has a single
projection, the clause itself. RASTA thus begins with the terminal nodes given in
Figure 112. (In the diagrams in this section, projections are written in curly
braces.) This list of nodes is referred to below as RSTNODES.
Figure 112 Terminal nodes and initial projections
RASTA begins with bag 1, and attempts to apply the first hypothesized
relation, relation 4. This relation specifies a CONTRAST relation between clause
2, It is usually placed in the hyena family, Hyaenidae, and clause 3, Some
experts however, place the aardwolf in a separate family, Protelidae, because
of certain anatomical differences between the aardwolf and the hyena. RASTA
searches RSTNODES for a node whose projections include clause 2 and a node
whose projections include clause 3. RASTA finds these two nodes. RASTA
removes the nodes from RSTNODES, and combines them to form a new node
covering clauses 2 and 3, and adds this new node back into RSTNODES.
RSTNODES now contains the elements given in Figure 113.
Figure 113 Contents of RSTNODES after applying hypothesis 4
RASTA now permutes the other bags, i.e. bags 2, 3, 4, 5. In the first
permutation, the first bag is bag 2. RASTA attempts to apply the first
hypothesized relation in bag 2, hypothesis 5, which specifies an ELABORATION
relation with clause 3, Some experts however, place the aardwolf in a separate
family, Protelidae, because of certain anatomical differences between the
aardwolf and the hyena, as the nucleus and clause 4, For example, the aardwolf
has five toes on its forefeet, as the satellite. RASTA searches in RSTNODES for a
node whose projections include clause 3 and a node whose projections include
clause 4. Nodes with these projections are found in RSTNODES. The node
whose projections include clause 3, the CONTRAST node resulting from the
application of the first hypothesis in bag 1, also includes clause 2 in its
projections. RASTA can only attach clause 4 as a satellite of this node if
ORIGINALHYPOTHS includes an ELABORATION relation with clause 2 as a
nucleus and clause 4 as a satellite. Since no such relation was hypothesized, it
does not occur in ORIGINALHYPOTHS. RASTA is therefore unable to attach
clause 4 as a satellite of this node.
If bag 2 contained more hypothesized relations, RASTA would at this
stage move on to consider them. Since bag 2 only contains a single relation,
RASTA has completed processing of the current bag and moves on to bag 3.
The first hypothesized relation in bag 3, relation 6, specifies an
ASYMMETRICCONTRAST relation, with clause 4, For example, the aardwolf has
five toes on its forefeet, as the nucleus and clause 5, whereas the hyena has
four, as the satellite. RASTA finds nodes whose projections include these two
clauses and creates a new node covering clauses 4 and 5, as illustrated in Figure
114.
Figure 114 Contents of RSTNODES after applying hypothesis 6
RASTA now permutes the other bags, i.e. bags 2, 4, 5. In the first
permutation, the first bag is bag 2. As noted above, bag 2 contains a single
hypothesized relation that cannot be applied, despite the presence of the
projections specified by the relation. RASTA therefore moves on to bag 4,
applying relation 1. Relation 1 specifies an ELABORATION relation with clause
1, The aardwolf is classified as Proteles cristatus, as the nucleus and clause 2,
It is usually placed in the hyena family, Hyaenidae, as the satellite. Nodes with
the requisite projections are found. Clause 2 occurs in a node with another
projection, clause 3. Since ORIGINALHYPOTHS contains an ELABORATION
relation, with clause 1 as the nucleus and clause 3 as the satellite, RASTA
constructs a new node covering clauses 1 through 3, as illustrated in Figure
115.
Figure 115 Contents of RSTNODES after applying hypothesis 1
RASTA now permutes the other bags, i.e. bags 2 and 5. In the first
permutation, the first bag is bag 2. In RSTNODES, RASTA is unable to find the
two projections that the hypothesized relations in bag 2 cover, namely clauses 3
and 4. RASTA therefore prunes all nodes in the search space that follows from
the current permutation by removing bag 2 from further consideration. In this
particular example, bag 2 contains a single hypothesis and the removal of bag 2
leaves only a single bag, bag 5. Frequently[15], however, a bag is removed and
several bags remain. One of these remaining bags is removed and so on, with
the result that the search space is considerably reduced. Measurements of
RASTA’s execution indicate that pruning the search space reduces the number of
passes through the loop that moves from one bag to the next by approximately
one third.
[15] In cases where a trace of the program execution would span too many pages for
any human reader to endure.
RASTA now moves on to consider bag 5. As with bag 2, RASTA is not
able to find both projections specified by the hypothesized relations in bag 5.
RASTA therefore removes bag 5 from further consideration. Since no bags now
remain, RASTA backs up to the point illustrated in Figure 114, and continues
processing. Eventually, after RASTA has pursued other dead ends, RSTNODES
contains the two nodes illustrated in Figure 116.
Figure 116 Contents of RSTNODES after further processing
RASTA then attempts to apply hypothesized relation 1 from bag 4. This
relation specifies an ELABORATION relation with clause 1, The aardwolf is
classified as Proteles cristatus, as the nucleus and clause 2, It is usually placed in
the hyena family, Hyaenidae, as a satellite. Both clause 1 and clause 2 are
available in the projections of nodes in RSTNODES. Clause 2 occurs as the
projection of a node whose projections also include clause 3, Some experts
however, place the aardwolf in a separate family, Protelidae, because of
certain anatomical differences between the aardwolf and the hyena. Because
ORIGINALHYPOTHS also includes an ELABORATION relation with clause 1 as the
nucleus and clause 3 as the satellite, RASTA joins clause 1 and the CONTRAST
node that covers clauses 2 through 5. RSTNODES now contains a single node.
This single node is an RST tree covering clauses 1 through 5, as illustrated in
Figure 117.
Figure 117 First complete RST tree for Aardwolf excerpt
As a final stage, RASTA converts the binary-branching tree to an n-ary
branching tree. For this particular tree, the result of this conversion is a tree
with exactly the same form as the tree in Figure 117.
The tree produced first for this excerpt happens to be the one that I
consider to be the preferred analysis for this text. If left to run, however, RASTA
produces other analyses. The tree given in Figure 117 has an overall score of
127. All the other trees produced by RASTA have scores less than 127. Since
RASTA sorts the trees according to their overall score, no subsequent tree ousts
the tree in Figure 117 from its number one position.
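The search traced in this section can be approximated by a compact Python sketch of the procedure in Figure 99, run on the hypotheses of Figure 111 and the bag order of Figure 110. It is a simplification under stated assumptions, not RASTA itself: the names `Hyp`, `Tree`, `hypothesized`, `find` and `construct` are hypothetical, the limit on the desired number of trees is omitted, the test for crossing lines is replaced by a simple adjacency check on text spans, and each stored analysis is recorded as a (score, shape) pair.

```python
from collections import namedtuple

Hyp = namedtuple("Hyp", "name nuc other score symmetric")
Tree = namedtuple("Tree", "lo hi promo value shape")

# Hypotheses keyed by their number in Figure 111; CONTRAST is the symmetric one.
HYPS = {
    1: Hyp("ELABORATION", 1, 2, 27, False),
    2: Hyp("CONTRAST", 1, 3, 25, True),
    3: Hyp("ELABORATION", 1, 3, 20, False),
    4: Hyp("CONTRAST", 2, 3, 35, True),
    5: Hyp("ELABORATION", 3, 4, 35, False),
    6: Hyp("ASYMMETRICCONTRAST", 4, 5, 30, False),
}
BAGS = [[4], [5], [6], [1], [2, 3]]   # bag order from Figure 110


def hypothesized(name, a, b):
    """Was relation `name` posited with a as nucleus and b as the other member?"""
    return any(h.name == name and ((h.nuc, h.other) == (a, b)
                                   or (h.symmetric and (h.other, h.nuc) == (a, b)))
               for h in HYPS.values())


def find(subtrees, clause):
    """Return the subtree whose promotion set contains the clause, if any."""
    return next((t for t in subtrees if clause in t.promo), None)


def construct(bags, subtrees, found):
    if len(subtrees) == 1:                       # a complete tree covers the text
        found.add((subtrees[0].value, subtrees[0].shape))
        return
    bags = list(bags)
    i = 0
    while i < len(bags):
        first = HYPS[bags[i][0]]
        nuc, other = find(subtrees, first.nuc), find(subtrees, first.other)
        if nuc is None or other is None or nuc is other:
            del bags[i]                          # prune: this bag can no longer apply
            continue
        rest = bags[:i] + bags[i + 1:]
        adjacent = nuc.hi + 1 == other.lo or other.hi + 1 == nuc.lo
        for num in bags[i]:                      # highest scoring hypothesis first
            h = HYPS[num]
            if adjacent and all(hypothesized(h.name, a, b)
                                for a in nuc.promo for b in other.promo):
                promo = nuc.promo | other.promo if h.symmetric else nuc.promo
                new = Tree(min(nuc.lo, other.lo), max(nuc.hi, other.hi), promo,
                           h.score + nuc.value + other.value,
                           (h.name, nuc.shape, other.shape))
                remaining = [t for t in subtrees if t is not nuc and t is not other]
                construct(rest, [new] + remaining, found)
        i += 1


leaves = [Tree(c, c, frozenset({c}), 0, c) for c in range(1, 6)]
found = set()
construct(BAGS, leaves, found)
best_value, best_shape = max(found, key=lambda pair: pair[0])
print(best_value)
print(best_shape)
```

Under these assumptions the highest scoring analysis combines relations 6, 5, 4 and 1, whose heuristic scores sum to 30 + 35 + 35 + 27 = 127, in agreement with the score reported for the tree in Figure 117.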
8. RASTA’s contributions to the field
8.1 Introduction
RASTA is a discourse analysis module that efficiently constructs RST
trees to represent the structure of written texts. Having presented in chapters 6
and 7 the processes by which RASTA constructs these representations, let us
turn to a consideration of the practical and theoretical implications of the
approach adopted in this dissertation.
8.2 Identifying rhetorical relations
There is a spectrum of opinion in the discourse literature concerning the
manner in which rhetorical relations might be identified. At one extreme are
those who attempt to identify rhetorical relations primarily on the basis of cue
words and phrases (Sumita et al. 1992; Ono et al. 1994; Kurohashi and Nagao
1994; Marcu 1997a). At the other end of the spectrum are those who make
unrestricted appeals to knowledge extrinsic to the text (Hobbs 1979; Polanyi
1988). An agnostic middle view is articulated by Mann and Thompson (1986),
who concede that the form of a text sometimes correlates with RST relations but
ask:
“Are there other, more subtle attributes of form in text which
might be signaling the relations in the absence of conjunctions
or subordinators? For example, could the sequence of
declarative mood followed by imperative mood be signaling
“solutionhood”…? We doubt that there are such signals
expressed in text form on several grounds… Whatever other
signals there are, they must be derivable from large units of texts
as well as from single sentences, but large units of text can have
very diverse forms. We recognize that such patterns can be
suggestive of a relation, or perhaps restrict the range of possible
relations, but we do not believe that there are undiscovered
signal forms, and we do not believe that text form can ever
provide a definitive basis for describing how relational
propositions can be discerned” (Mann and Thompson 1986:71-
72)
RASTA is motivated by a functional perspective on language: a writer
manipulates elements of form (morphology, syntax, lexical choice) to achieve a
desired effect, including a desired rhetorical effect. Therefore, strewn
throughout a text we can expect to find cues to the writer’s rhetorical
intentions. As Mann and Thompson observe, there might not be a one-to-one
relationship between these cues and rhetorical relations. RASTA’s use of cues
allows for a less direct mapping between elements of form and rhetorical
relations: a given cue can indicate several relations, and a given relation can be
identified by the convergence of a cluster of cues. Furthermore, the notion that
the correspondence between formal cues and rhetorical relations is probabilistic
rather than deterministic is encoded in RASTA in the numerical weights
associated with the cues.
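The many-to-many mapping between cues and relations can be sketched as follows. This is a minimal illustration in Python, not RASTA's implementation: the cue names, the weights, and the decision to sum the weights of converging cues are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical cue table: each cue votes for one or more relations with a weight.
# The cue names and the weights are invented for illustration only.
CUE_WEIGHTS = {
    "cue_phrase_because": {"CAUSE": 25},
    "preposed_participial_clause": {"CIRCUMSTANCE": 20},
    "postposed_participial_clause": {"RESULT": 15, "ELABORATION": 5},
    "satellite_has_pronoun_subject": {"ELABORATION": 8},
}


def score_relations(observed_cues):
    """Sum the weights of all observed cues per relation (summing is an assumption)."""
    scores = defaultdict(int)
    for cue in observed_cues:
        for relation, weight in CUE_WEIGHTS.get(cue, {}).items():
            scores[relation] += weight
    # Return the hypotheses with the highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = score_relations(
    ["postposed_participial_clause", "satellite_has_pronoun_subject"]
)
print(ranked)
```

In this toy table a single cue contributes to two relations, and ELABORATION receives its score from two converging cues, which mirrors the indirect mapping described above.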
RASTA draws on a rich syntactic analysis in considering formal cues,
giving it an advantage over more superficial analyses of texts (e.g. Sumita et al.
1992; Ono et al. 1994; Marcu 1997a). The identification of terminal nodes for
an RST analysis, for example, which proves so difficult for a simple pattern-
matching method (section 4.2.6), is a simple affair given a syntactic analysis
and criteria based on that analysis (section 7.3). Similarly, confusion about
whether to analyze some strings as cue phrases treated as single lexical items or
as segments with internal constituency is resolved by a full syntactic analysis
(section 4.2.6). Finally, this rich syntactic analysis allows RASTA to make
surprisingly powerful deductions about syntactic correlates of rhetorical
relations. For example, a detached participial clause is usually in a
CIRCUMSTANCE relation to the main clause if it is preposed (cue H13, section
6.7.3), as illustrated in example (1), but in a RESULT relation if postposed (cue
H22, section 6.7.12), as illustrated in example (2).
(1) Leaving port on October 19 and 20, Villeneuve’s fleet was
intercepted by Nelson’s fleet on the morning of October 21.
(Trafalgar, Battle of)
(2) This bold strategy created confusion, giving the British fleet an
advantage. (Trafalgar, Battle of)
During syntactic analysis, MEG makes use of a semantic network to
resolve ambiguous syntactic dependencies (section 5.3.2). RASTA, however,
does not make any additional use of a semantic network or other external form
of knowledge representation (see section 8.3). Rather, RASTA performs its
analyses based solely on its examination of the syntactic portrait and the logical
form. RASTA is thus located at a mid-point in the spectrum mentioned above,
making use of more than cue phrases to identify rhetorical relations, but not
making reference to extrinsic knowledge.
Finally, it must be emphasized that RASTA only needs to discriminate
between a small set of relations. The cues listed in chapter 6 do not constitute
an exhaustive list of the lexical, morphological, and syntactic correlates of the
RST relations, but rather a sufficient set of criteria for distinguishing the
relations.
8.3 Representations of knowledge
In some approaches to identifying rhetorical relations, external
representations of knowledge play a crucial role. Hobbs, for example, envisages
a system that would encode “those things a speaker of English generally knows
and can expect his listener to know” (Hobbs 1979:71). Similarly Polanyi’s
(1988) Linguistic Discourse Model (section 4.5) relies crucially on amorphous
real-world knowledge and unspecified inferential processes. There is no doubt
that people draw on vast resources about events and entities or knowledge of
genre conventions in addition to formal cues to discourse structure. Any
attempt to mimic current understanding of the nature of those additional
resources is, however, likely to involve great computational expense and
complexity. Although MEG contains an enormous semantic network, MINDNET
(section 5.7), RASTA does not make reference to it. A compelling reason for
RASTA to eschew MINDNET is that it is computationally expensive to reason
using such a vast resource. A more theoretical motivation, however, concerns
the nature of rhetorical relations. A writer structures a text to express intended
rhetorical relations. The relations that the writer actually chooses might not be
those that would be said to exist in the abstract. For example, reasoning in the
abstract might suggest that a causal relationship could be hypothesized between
an event of sneezing and an event of dying: sneezing is a means by which fatal
diseases can be communicated, so one person’s sneezing might cause another
person’s death. A writer describing two such events might choose to emphasize
a causal relationship. Alternatively, the writer might choose to represent the
events as merely occurring in a temporal sequence, without emphasizing
causality. Similarly, a writer might choose to indicate a causal relationship
between two events that reasoning with an external knowledge representation would not
suggest were causally related. By restricting its analyses to what is motivated
by the text, RASTA avoids spurious reasoning.
In some cases, a simple examination of MINDNET might appear to
suffice to identify a rhetorical relation. Mann and Thompson (1988:273)
observe that in an ELABORATION relation, there is often one of the following
relations between an element in the nucleus and an element in the satellite:
set/member, abstract/instance, whole/part, process/step, object/attribute,
generalization/specific. A whole/part relation holds between clauses 3 and 4 in
Figure 118, as can be deduced by the following chain of reasoning: aardwolves
are animals; animals have bodies; bodies have forefeet; forefeet have toes;
therefore aardwolves have toes. Although a chain of reasoning like this could
easily be performed using MINDNET, the presence of the cue phrase For
example obviates such elaborate reasoning.
1. The aardwolf is classified as Proteles cristatus.
2. It is usually placed in the hyena family, Hyaenidae.
3. Some experts, however, place the aardwolf in a separate
family, Protelidae, because of certain anatomical differences
between the aardwolf and the hyena
4. For example, the aardwolf has five toes on its forefeet…
Figure 118 Aardwolf
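The chain of reasoning described above amounts to a path search over a semantic network. The following minimal Python sketch uses a tiny hand-built graph; it does not reflect the actual structure or interface of MINDNET, and the edge labels (`is_a`, `has_part`) and the function name `find_chain` are assumptions made for illustration.

```python
from collections import deque

# A toy network: node -> list of (edge label, neighbor). Invented for illustration.
NETWORK = {
    "aardwolf": [("is_a", "animal")],
    "animal": [("has_part", "body")],
    "body": [("has_part", "forefoot")],
    "forefoot": [("has_part", "toe")],
}


def find_chain(start, goal):
    """Breadth-first search returning the list of edges from start to goal, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, neighbor in NETWORK.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, label, neighbor)]))
    return None


chain = find_chain("aardwolf", "toe")
for source, label, target in chain:
    print(f"{source} --{label}--> {target}")
```

Even in this four-edge toy graph the search must traverse every intermediate node, whereas the cue phrase requires no search at all. In a network of realistic size the number of candidate paths would be far larger.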
If future research should exhaust the possibilities for identifying
superficial cues to discourse structure, it may well prove necessary to examine
MINDNET to identify unclear rhetorical relations. Should that prove necessary,
it would be desirable to constrain the use of MINDNET, examining it in ways
suggested by the linguistic form.
8.4 Constructing and evaluating trees
As discussed above in section 4.2.6 and section 4.2.5, there are two
major problems for a discourse component attempting to construct RST
representations of the structure of a text:
1. How can combinatorial explosion be avoided? As more and more
relations are posited, the number of well-formed RST trees
compatible with those relations increases exponentially.
2. How can alternative analyses be evaluated?
Marcu (1996, 1997a), the most complete description in the literature of
an algorithm for constructing RST representations, does not address (1).
Concerning (2), Marcu suggests a metric that favors right-branching trees. This
metric, however, only appears to be valid for certain genres (section 4.2.6).
In RASTA, the solution to both (1) and (2) lies in the use of heuristic
scores associated with cues to discourse structure. The relations with the
highest heuristic scores are applied first in an effort to construct an RST tree
(chapter 7). Since better trees are produced first, the algorithm does not need to
produce all possible trees, thus avoiding exponential explosion. Finally, the
metric used in RASTA to evaluate trees is independent of genre—the heuristic
score associated with a tree is computed from the heuristic scores of the
relations used in constructing the tree. “Better” trees are those built from better
hypotheses. Although this metric for evaluating trees is independent of genre,
the methods for determining the discourse relations might vary with genre (see
section 8.5).
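The idea of applying the best hypotheses first can be sketched in Python as follows. This is a simplified greedy illustration rather than the algorithm of chapter 7: it builds a single tree, treats every relation as asymmetric, and assumes that the score of a tree is the sum of the scores of its relations. The class `Node`, the function `build_tree`, and the example scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    start: int                      # index of the first clause covered
    end: int                        # index of the last clause covered
    head: int                       # clause promoted as the nucleus of this span
    relation: Optional[str] = None  # relation joining the two children, if any
    score: int = 0                  # sum of the relation scores in this subtree
    children: tuple = ()


def build_tree(num_clauses, hypotheses):
    """Join adjacent spans bottom-up, always applying the best usable hypothesis.

    hypotheses: list of (score, relation, nucleus_clause, satellite_clause).
    Returns the root Node, or None if the spans cannot all be joined.
    """
    spans = [Node(i, i, i) for i in range(num_clauses)]
    pending = sorted(hypotheses, reverse=True)  # highest scores first
    progress = True
    while len(spans) > 1 and progress:
        progress = False
        for hyp in pending:
            score, relation, nucleus, satellite = hyp
            for i in range(len(spans) - 1):
                left, right = spans[i], spans[i + 1]
                # A hypothesis applies when it links the heads of adjacent spans.
                if {left.head, right.head} == {nucleus, satellite}:
                    spans[i:i + 2] = [Node(
                        left.start, right.end, nucleus, relation,
                        left.score + right.score + score, (left, right),
                    )]
                    pending.remove(hyp)
                    progress = True
                    break
            if progress:
                break  # restart from the best remaining hypothesis
    return spans[0] if len(spans) == 1 else None


tree = build_tree(4, [
    (20, "ELABORATION", 0, 1),
    (35, "CAUSE", 2, 3),
    (10, "ELABORATION", 0, 2),
])
print(tree.relation, tree.score)
```

Because the hypotheses are consumed in descending order of score, the first tree completed is built from the strongest available hypotheses, and no enumeration of all well-formed trees is required.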
8.5 Genre
Despite being limited to a single genre, namely encyclopedia articles,
the work described in this dissertation represents a more explicit and complete
implementation of a discourse processing model than any hitherto described in
the literature.
The particular set of relations used in this study (section 3.6) was
motivated by the encyclopedia genre. As noted in section 3.6, articles in
Encarta are primarily concerned with the organization of information
according to ideational and textual relations subordinated to an overarching
speech act like DESCRIBE or EXPLAIN. For other genres, a slightly different set
of relations might be needed. The effectiveness of the techniques employed in
RASTA for identifying rhetorical relations leads us to expect that linguistic cues
could be identified for similar relations in other genres. The methods for
constructing an RST tree, given a set of hypothesized relations, are not
constrained in any way by genre.
Perhaps the biggest stumbling block to applying RASTA to other genres
is the potential clash discussed in section 3.6 between intentional and
informational representations of a text (Ford 1986; Moore and Pollack 1992).
This obstacle could perhaps be overcome by constructing alternative analyses.
From the perspective of Systemic Functional Grammar (Halliday 1985), for
example, a text could be said to have both informational and interpersonal
aspects. These different aspects could be modeled independently. In many cases
the resulting analyses would converge in identifying discourse constituents.
Systematic divergences would constitute a rich area for future study on the
nature of discourse.
Finally, in this study only small excerpts have been considered,
typically a paragraph in size or smaller. This limitation has been for
performance reasons and because it is easier for a human analyst to verify the
analyses produced. It is claimed within RST that the same analytical framework
can be applied equally well to excerpts of one or two sentences as to much
larger excerpts.
9. Potential Applications for RASTA
9.1 Introduction
Given the ability to automatically construct plausible representations of
discourse structure, many exciting areas of research become possible. In this
chapter I make brief mention of a few areas: text summarization, the creation of
semantic networks, information retrieval, and the quantitative analysis of
discourse patterns.
9.2 Text summarization
Mann and Thompson (1988:266-268), in discussing the notion of
nuclearity, note that deletion of nuclear text spans will tend to make a text
incoherent, but deletion of satellite text spans does not result in such
incoherence. They claim
“If units that only function as satellites and never as nuclei are
deleted, we should still have a coherent text with a message
resembling that of the original; it should be something like a
synopsis of the original text. If, however, we delete all units that
function as nuclei anywhere in the text, the result should be
incoherent and the central message difficult or impossible to
comprehend.” (Mann and Thompson 1988:267-268)
Human-generated summaries frequently involve not only the deletion of
material but also a reformulation of the content, paraphrasing the output to
maintain coherence. Such reformulation and paraphrasing are difficult to
implement within computer natural language generation systems. A method
that would produce a reasonable, coherent synopsis of a text simply by omitting
less nucleic material therefore holds considerable appeal. Ono et al. (1994)
sketchily describe an RST-based summarization method whose base level units
are sentences. Marcu (1997b) provides a full description of a text summarizer
that consists of a simple tree-traversal algorithm that prunes nodes from an RST
tree. Nodes that are not pruned constitute the summary. Marcu claims that his
method has a granularity at the level of the clause, although in fact many of his
terminal nodes are not clauses (see section 4.2.6). The algorithm that Marcu
describes would work equally well on RST trees that did not have any non-
clausal terminal nodes. In particular, Marcu’s algorithm would work well with
the output of RASTA.
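Summarization by pruning satellites can be sketched as a short recursive traversal. The Python below is a minimal illustration in the spirit of the pruning approach described above; it is not Marcu's published algorithm, and the class `RSTNode`, the function `summarize`, and the parameter `max_satellite_depth` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RSTNode:
    text: str = ""          # clause text (terminal nodes only)
    role: str = "nucleus"   # "nucleus" or "satellite" relative to the parent
    children: list = field(default_factory=list)


def summarize(node, depth=0, max_satellite_depth=0):
    """Collect clause texts, skipping material nested under too many satellites."""
    if node.role == "satellite":
        depth += 1
    if depth > max_satellite_depth:
        return []  # prune this node and everything beneath it
    if not node.children:
        return [node.text]
    clauses = []
    for child in node.children:
        clauses.extend(summarize(child, depth, max_satellite_depth))
    return clauses


root = RSTNode(children=[
    RSTNode("The acute form of conjunctivitis is commonly called pinkeye."),
    RSTNode("It can be caused by either bacterial or viral infection",
            role="satellite"),
])
short_summary = summarize(root)
fuller_summary = summarize(root, max_satellite_depth=1)
print(short_summary)
```

Raising `max_satellite_depth` admits progressively less nuclear material, so a single parameter controls the length of the synopsis.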
I am currently experimenting with a prototype summarizer that
manipulates the output of RASTA. This prototype summarizer performs a tree
traversal in the same manner as Marcu’s algorithm, but instead of deleting
nodes, it presents nodes in a nested form. The output is presented in a hypertext
format, allowing a reader to selectively expand nodes to yield more detail.
Figure 119 illustrates a hypertext view within the Microsoft Word 97
wordprocessor of the Abd-ar-Rahman excerpt discussed in section 7.5.1. In
Figure 119, the reader has decided to expand the text subordinate to the third of
the narrative clauses, His army met the Franks…. The plus sign beside a node
indicates that a node can be expanded, i.e. that that node has one or more
satellites. A minus sign indicates that a node cannot be expanded, either
because its satellites are already displayed or because it does not have any
satellites. In this example, there are no instances of satellites with the same
relation to a nucleus being grouped together.
Figure 119 Hypertext view of Abd-ar-Rahman text
Satellite nodes that are in the same rhetorical relation to a nucleus could
be grouped together, thereby imposing additional structure on the output.
Figure 120 gives the RST analysis for a small excerpt.
1. The acute form of conjunctivitis is commonly called pinkeye.
2. It can be caused by either bacterial or viral infection
3. and is often epidemic.
4. In newborn babies it may result from several kinds of cocci,
especially the gonococcus (gonorrheal conjunctivitis), or from
a strain of the parasitic bacterium Chlamydia trachomatis
(inclusive conjunctivitis).
Figure 120 Conjunctivitis
In the excerpt illustrated in Figure 120, terminal nodes 2 and 4 are both
in a CAUSE relation to node 1. These discontiguous nodes could be grouped and
displayed under a single heading in a hypertext viewer, as illustrated in Figure
121. The information given in square brackets represents a hypertext node that
a reader could click on to expand it and read the text that it represents. The
hypertext node includes a count of the number of clauses beneath it, to give the
reader an indication of how much content lies beneath. (The description “More
information” has been used as a synonym for the technical term
ELABORATION.)
The acute form of conjunctivitis is commonly called pinkeye.
[Causes:2]
[More information:1]
Figure 121 Hypertext view of conjunctivitis text
Clicking on the hypertext node [Causes:2] would cause the text of that
node to be displayed, as illustrated in Figure 122.
The acute form of conjunctivitis is commonly called pinkeye.
[Causes:2]
1. It can be caused by either bacterial or viral
infection
2. In newborn babies it may result from several
kinds of cocci, especially the gonococcus
(gonorrheal conjunctivitis), or from a strain of
the parasitic bacterium Chlamydia trachomatis
(inclusive conjunctivitis).
[More information:1]
Figure 122 Hypertext view of conjunctivitis text
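The grouping of satellites under a single heading, as in Figures 121 and 122, can be sketched as follows. This Python fragment is an illustration only and is not the prototype summarizer itself; the function `render`, the table `LABELS`, and the indentation scheme are assumptions, and the clause texts are abbreviated.

```python
# Reader-friendly headings for relation names (hypothetical mapping).
LABELS = {"CAUSE": "Causes", "ELABORATION": "More information"}


def render(nucleus_text, satellites, expanded=()):
    """Return display lines for a nucleus and its satellites grouped by relation.

    satellites: list of (relation, clause_text) pairs in text order.
    expanded: relations whose clauses should be shown in full.
    """
    groups = {}
    for relation, text in satellites:
        groups.setdefault(relation, []).append(text)
    lines = [nucleus_text]
    for relation, texts in groups.items():
        label = LABELS.get(relation, relation.title())
        # The count tells the reader how many clauses lie beneath the heading.
        lines.append(f"  [{label}:{len(texts)}]")
        if relation in expanded:
            for number, text in enumerate(texts, start=1):
                lines.append(f"    {number}. {text}")
    return lines


satellites = [
    ("CAUSE", "It can be caused by either bacterial or viral infection"),
    ("ELABORATION", "and is often epidemic."),
    ("CAUSE", "In newborn babies it may result from several kinds of cocci"),
]
nucleus = "The acute form of conjunctivitis is commonly called pinkeye."
collapsed = render(nucleus, satellites)
opened = render(nucleus, satellites, expanded=("CAUSE",))
print("\n".join(opened))
```

The collapsed view corresponds to Figure 121 and the view with the CAUSE group expanded corresponds to Figure 122.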
Clearly, these methods of displaying a summary of a text to the user in
such a way as to enable the user to explore in more detail areas of interest
require further research and user interface testing.
9.3 The creation of semantic networks
A semantic network like MINDNET (section 5) can be constructed by
automatically extracting information from single sentences in a lexicon (see for
example Jensen and Binot 1987; Klavans et al. 1993; Montemagni and
Vanderwende 1993; Dolan 1995; Dolan et al. 1993; Richardson 1997;
Vanderwende 1995a, 1995b). By applying domain-specific rules to extract
information from a discourse representation, similar information could be
extracted from extended text. For example, in a description of an animal in an
article in Encarta 96 there is often a section which lists the body parts of the
animal, a section on reproduction and a section on the animal’s life cycle.
9.4 Information retrieval
The field of information retrieval is dominated by statistical techniques,
which rate documents according to how closely the words in them match words
in a search query. Frequently, however, documents selected by these statistical
techniques as containing relevant words with statistically interesting
frequencies are not about the topic described by the search query. These ratings
could be improved by biasing the statistical weighting in favor of material
which occurs in more nucleic sections of text, since nucleic material is most
centrally involved in realizing the writer’s communicative goals (section 3.2).
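One way to bias term matching toward nuclear material is to discount each match according to how deeply the containing clause is embedded under satellites. The Python sketch below is an assumption-laden illustration, not a tested retrieval method: the function `weighted_term_score`, the `decay` parameter, and the exponential discount are all hypothetical choices, and the second sample document is invented.

```python
def weighted_term_score(query_terms, clauses, decay=0.5):
    """Score a document by query-term matches, discounting satellite material.

    clauses: list of (text, satellite_depth); depth 0 is the most nuclear text.
    """
    score = 0.0
    for text, depth in clauses:
        words = text.lower().split()
        matches = sum(words.count(term.lower()) for term in query_terms)
        # Each level of satellite embedding multiplies the weight by `decay`.
        score += matches * (decay ** depth)
    return score


doc_a = [("conjunctivitis is commonly called pinkeye", 0),
         ("it is often epidemic", 1)]
doc_b = [("the clinic opened in 1990", 0),
         ("it treats conjunctivitis", 1)]
score_a = weighted_term_score(["conjunctivitis"], doc_a)
score_b = weighted_term_score(["conjunctivitis"], doc_b)
print(score_a, score_b)
```

Both documents mention the query term exactly once, but the document that mentions it in nuclear material receives the higher score.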
A reliable discourse processing component could also be used during
the display of the documents returned in response to a search query to highlight
the section of the document which is most relevant to the database query.
Rather than using crude techniques like displaying the text that occurs two or
three lines before and after key terms from the search query, it would be
possible to display the text of the coherent RST subtree that contains those key
terms.
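Selecting the coherent subtree that contains the key terms can be sketched as a search for the deepest node whose clauses jointly mention every term. The Python below is a minimal illustration under stated assumptions: the class `Span`, the functions `leaves` and `smallest_subtree`, the simple substring matching, and the shape of the sample tree are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    text: str = ""  # clause text (terminal nodes only)
    children: list = field(default_factory=list)


def leaves(node):
    """Return the clause texts beneath a node, in text order."""
    if not node.children:
        return [node.text]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def smallest_subtree(node, terms):
    """Return the deepest node whose clauses together mention every term, or None."""
    text = " ".join(leaves(node)).lower()
    if not all(term.lower() in text for term in terms):
        return None
    for child in node.children:
        found = smallest_subtree(child, terms)
        if found is not None:
            return found  # a child already covers all terms, so prefer it
    return node


tree = Span(children=[
    Span("The acute form of conjunctivitis is commonly called pinkeye."),
    Span(children=[
        Span("It can be caused by either bacterial or viral infection"),
        Span("and is often epidemic."),
    ]),
])
hit = smallest_subtree(tree, ["bacterial", "epidemic"])
print(leaves(hit))
```

Unlike a fixed window of lines around each key term, the passage returned here always corresponds to a complete discourse constituent.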
9.5 Quantitative analysis of discourse patterns
For the discourse linguist, perhaps the most exciting potential use of a
computational discourse analysis component is to enable further study of
discourse itself. The labor required to perform an RST analysis of a text is a
serious impediment to research that would take RST analyses as the basis for
higher-level generalizations. If this tedium could be relieved by an automated
computational analysis, the linguist would be freed to consider issues such as
the correlation of discourse structure with genre, the frequency of specific
rhetorical relations, depth of embedding, and anaphoric usage, to name but a
handful of potential areas of study.
Quantitative results from such study could be used to improve the
efficiency of the computational discourse analysis component itself. Richardson
(1994) describes a technique for improving the performance of a rule-based
syntactic sentence parser. During training, the parser is run over many
sentences, gathering statistics about the rules that ultimately resulted in good
parses. The parser then incorporates the results of the training session. The
syntactic rules that most often led to good parses are applied first, causing a
demonstrable improvement in the performance of the parser in converging on
good parses. In a similar fashion, RASTA could eventually be trained by
automatically processing many excerpts and gathering statistics about preferred
relations and observing the most common configurations. The information thus
obtained could be used after training to guide RASTA in much the same way as
the intuitively formulated restrictions on “thinking flow” of Sumita et al.
(1992).
10. Conclusion
This dissertation has described RASTA, a discourse processing
component that computes representations of the structure of written discourse.
RASTA builds on previous research in the field of computational discourse
analysis within the framework of RST, most notably Marcu (1996, 1997a).
RASTA directly addresses the problem of combinatorial explosion—as
more rhetorical relations are hypothesized as connecting two clauses, the
number of well-formed RST analyses for a text increases exponentially. RASTA
manages this combinatorial explosion by assigning heuristic scores to the
relations hypothesized and using those scores to guide it in constructing trees.
More likely hypotheses are applied first in a bottom-up algorithm that links
together contiguous text spans. Those same heuristic scores provide a genre-
independent method for evaluating trees—better trees are those that were
formed by the application of more likely rhetorical relations.
In this dissertation, the actual formal cues used by RASTA to identify
discourse structure have been described. Cue words and phrases are an
important source of information in RASTA, as they would no doubt be in any
discourse analysis component. RASTA is however unusual in the field of
discourse research in the extent to which it is able to recognize cues to
discourse structure by analyzing syntactic analyses and logical forms.
A relatively uncontroversial set of thirteen rhetorical relations has
proven sufficient for the analysis of articles in Encarta. The techniques for
identifying relations would still be applicable, however, if a slightly different
set of rhetorical relations were used. The efficient techniques for constructing
RST trees on the basis of a set of hypothesized rhetorical relations would not
require any modification should a different set of relations be used. The issue
of a suitable taxonomy of discourse relations, so central to work in natural
language generation (see section 3.5), was found to be unimportant for the task
of identifying discourse relations. RASTA is able to reliably discriminate among
the thirteen relations employed using a rudimentary classification of the RST
relations as either symmetric or asymmetric.
For the sake of focusing in depth on issues of efficiency, the research
described here has been limited to the text of Encarta. The search for cues in
the actual text of Encarta, without reference to a semantic network or reference
to models of real world knowledge, has been very successful. In extending
RASTA to other genres, it may prove necessary to use information beyond that
available from the syntactic analysis and logical form. A careful search for cues
in the form of the text in other genres ought however to prove amply
rewarding.
Finally, I have outlined some directions for what is surely developing
into an exciting research area.
Bibliography
ACL = Association for Computational Linguistics
ISI/RS = Information Sciences Institute Report Series
Ballard, D., R. Conrad and R. Longacre. 1971. “The deep and surface
grammar of interclausal relations.” Foundations of Language 4:70-
118.
Dolan, William B. 1995. “Metaphor as an emergent property of machine-
readable dictionaries.” In Proceedings of the AAAI 1995 Spring
Symposium Series, date??, Stanford, California, 1995. 27-32.
Dolan, William, Lucy Vanderwende and Stephen D. Richardson. 1993.
“Automatically deriving structured knowledge bases from on-line
dictionaries.” In Proceedings of the Pacific Association for
Computational Linguistics, April 21-24, 1993, Vancouver, British
Columbia. 5-14.
Ford, Cécilia E. 1986. “Overlapping relations in text structure.” In
DeLancey, Scott and Russell S. Tomlin (eds.), Proceedings of the
Second Annual Meeting of the Pacific Linguistics Conference. 107-
123.
Fox, Barbara A. 1987. Discourse Structure and Anaphora. Cambridge
Studies in Linguistics 48. Cambridge: Cambridge University Press.
Fukumoto, Jun’ichi and Jun’ichi Tsujii. 1994. “Breaking down rhetorical
relations for the purpose of analyzing discourse structures.” COLING
94: The 15th International Conference on Computational Linguistics,
August 5-9, 1994, Kyoto, Japan. Proceedings, vol. 2:1177-1183.
Haiman, John and Sandra A. Thompson, (eds.). 1988. Clause Combining in
Grammar and Discourse. John Benjamins: Amsterdam and
Philadelphia.
Haiman, John. 1980. “The Iconicity of Grammar: Isomorphism and
Motivation.” Language 56:515-540.
Halliday, M.A.K. 1985. An Introduction to Functional Grammar. Edward
Arnold Press: Baltimore.
Halliday, M.A.K. and Ruqaiya Hasan. 1976. Cohesion in English. Longman:
London.
Heidorn, George E. 1972. “Natural language inputs to a simulation
programming system.” Ph.D. dissertation, Yale University. (Also
published as Technical Report NPS-55HD72101A. Naval
Postgraduate School: Monterey.)
Hobbs, J. R. 1979. “Coherence and coreference.” Cognitive Science 3:67-90.
Houghton Mifflin. 1992. The American Heritage Dictionary of the English
Language, Third Edition. Houghton Mifflin: Boston.
Hovy, Eduard H. 1988. Planning Coherent Multisentential Text. ISI/RS-88-
208. Reprinted from Proceedings of the 26th Meeting of the ACL,
Buffalo, New York, 1988.
Hovy, Eduard H. 1990. “Parsimonious and profligate approaches to the
question of discourse structure relations.” In Proceedings of the 5th
International Workshop on Natural Language Generation,
Pittsburgh. 128-136.
Jensen, K. and J.L. Binot. 1987. “Disambiguating prepositional phrase
attachments by using on-line dictionary definitions.” Computational
Linguistics 13:251-260.
Jensen, Karen, George Heidorn and Stephen Richardson (eds.). 1993.
Natural Language Processing: The PLNLP Approach. Kluwer:
Boston/Dordrecht/London.
Klavans, Judith, Martin Chodorow and Nina Wacholder. 1993. “Building a
knowledge base from parsed definitions.” In Jensen, Heidorn and
Richardson (1993). 119-133.
Knott, Alistair and Robert Dale. 1995. “Using linguistic phenomena to
motivate a set of coherence relations.” Discourse Processes 18:35-
62.
Kuhn, H.P. 1958. “The automatic creation of literature abstracts.” IBM
Journal, April 1958:159-165.
Kurohashi, Sadao and Makoto Nagao. 1994. “Automatic detection of
discourse structure by checking surface information in sentences.”
COLING 94: The 15th International Conference on Computational
Linguistics, August 5-9, 1994, Kyoto, Japan. Proceedings, vol.
2:1123-1127.
Labov, William. 1972. Language in the Inner City: Studies in the Black
English Vernacular—Conduct and Communication. Philadelphia:
University of Pennsylvania Press.
Litman, Diane J. and Rebecca J. Passonneau. 1995. “Combining multiple
knowledge sources for discourse segmentation.” In Proceedings of
the 33rd Meeting, 26-30 June, Massachusetts Institute of
Technology, Cambridge, Massachusetts, USA. Association for
Computational Linguistics. 108-115.
Longacre, R. 1976. An Anatomy of Speech Notions. Ghent: The Peter de
Ridder Press.
Maier, Elisabeth and Eduard H. Hovy. 1991. “A metafunctionally motivated
taxonomy for discourse structure relations.” ms.
Mann, William C. and Sandra A. Thompson 1986. ‘Relational Propositions
in Discourse’. Discourse Processes 9:57-90. Also available as
Information Sciences Institute Research Report 83-115, 4676
Admiralty Way, Marina del Rey, CA 90292-6695.
Mann, William C. and Sandra A. Thompson. 1987. Rhetorical Structure
Theory: A theory of text organization. ISI/RS-87-190.
Mann, William C. and Sandra A. Thompson. 1988. ‘Rhetorical Structure
Theory: Toward a functional theory of text organization.’ Text
8:243-281. (Also published as Mann and Thompson 1987).
Marcu, Daniel. 1996. “Building Up Rhetorical Structure Trees.” In
Proceedings of the Thirteenth National Conference on Artificial
Intelligence, vol. 2. 1069-1074, Portland, Oregon, August 1996.
Marcu, Daniel. 1997a. “The rhetorical parsing of natural language texts.” In
Proceedings of the Thirty-fifth Annual Meeting of the Association for
Computational Linguistics. 96-103.
Marcu, Daniel. 1997b. “From discourse structures to text summaries.” In
Proceedings of the ACL ’97/EACL ’97 Workshop on Intelligent
Scalable Text Summarization, Madrid, Spain, July 11, 1997. 82-88.
Marcus, Mitchell P., Beatrice Santorini and Mary Ann Marcinkiewicz. 1993.
“Building a large annotated corpus of English: The Penn Treebank.”
Computational Linguistics 19:313-330.
Matthiessen, Christian and Thompson, Sandra A. 1988. “The structure of
discourse and ‘subordination’.” In Haiman and Thompson (eds.).
1988:275-329.
McKeown, K.R. 1985. Text Generation: Using Discourse Strategies and
Focus Constraints to Generate Natural Language Text. Cambridge
University Press: Cambridge.
Microsoft Corporation. 1995. Encarta® 96 Encyclopedia. Microsoft:
Redmond.
Montemagni, Simonetta and Lucy Vanderwende. 1993. “Structural patterns
versus string patterns for extracting semantic information from
dictionaries.” In Jensen, Heidorn and Richardson (eds.), 1993. 149-
159.
Moore, Johanna D. and Martha E. Pollack. 1992. “A problem for RST: The
need for multi-level discourse analysis.” Computational Linguistics
18:537-544.
Ono, Kenji, Kazuo Sumita and Seiji Miike. 1994. “Abstract generation
based on rhetorical structure extraction.” COLING 94: The 15th
International Conference on Computational Linguistics, August 5-9,
1994, Kyoto, Japan. Proceedings, vol. 1:344-348.
Pentheroudakis, Joseph and Lucy Vanderwende. 1993. “Automatically
identifying morphological relations in machine-readable
dictionaries.” Making Sense of Words: Ninth Annual Conference for
the UW Centre for the New OED and Text Research, September 27-
28, 1993. Oxford, England. 114-131.
Polanyi, Livia. 1982. “Linguistic and social constraints on storytelling.”
Journal of Pragmatics 6:509-524.
Polanyi, Livia. 1988. “A formal model of the structure of discourse.”
Journal of Pragmatics 12:601-638.
Proctor, P. (ed.). 1978. Longman Dictionary of Contemporary English.
London: Longman Group.
Redeker, Gisela. 1990. “Ideational and pragmatic markers of discourse
structure.” Journal of Pragmatics 14:367-381.
Richardson, Stephen D. 1994. “Bootstrapping statistical processing into a
rule-based natural language parser.” In The Balancing Act:
Combining Symbolic and Statistical Approaches to Language.
Proceedings of the Workshop, Las Cruces, New Mexico. pp 96-103.
Richardson, Stephen D. 1997. Determining Similarity and Inferring
Relations in a Lexical Knowledge Base. Ph.D. dissertation. The City
University of New York.
Richardson, Stephen D., Lucy Vanderwende and William Dolan. 1993.
“Combining dictionary-based methods for natural language
analysis.” In Proceedings of the TMI-93, Kyoto, Japan. 69-79.
Sanders, Ted J.M. 1992. Discourse Structure and Coherence: Aspects of a
Cognitive Theory of Discourse Representation. Lundegem:
Nevelland.
Sanders, Ted J.M. and Carel van Wijk. 1996. “PISA: A Procedure for
analyzing the structure of explanatory texts.” Text 16:91-132.
Sanders, Ted J.M., W.P.M. Spooren and L.G.M. Noordman. 1992. “Toward
a taxonomy of coherence relations.” Discourse Processes 15:1-35.
Sanders, Ted J.M., W.P.M. Spooren and L.G.M. Noordman. 1993.
“Coherence relations in a cognitive theory of discourse
representation.” Cognitive Linguistics 4:93-133.
Sidner, Candace L. 1983. “Focusing and discourse.” Discourse Processes
6:107-130.
Sumita, K., K. Ono, T. Chino, T. Ukita, and S. Amano. 1992. “A discourse
structure analyzer for Japanese text.” In Proceedings of the
International Conference of Fifth Generation Computer Systems,
1992. 1133-1140.
Thompson, Sandra A. 1983. ‘Grammar and Discourse: The English
Detached Participial Clause.’ In Klein-Andreu, Flora (ed.),
Discourse perspectives on syntax. Academic Press: New York. 43-
65.
Vander Linden, Keith. 1993. Speaking of Actions: Choosing Rhetorical
Status and Grammatical Form in Instructional Text Generation.
Ph.D. dissertation. University of Colorado, Boulder. Published as
Technical Report CU-CS-???-93, University of Colorado, Boulder.
Vanderwende, Lucy H. 1995a. The Analysis of Noun Sequences Using
Semantic Information Extracted from On-Line Dictionaries. Ph.D.
dissertation, Georgetown University.
Vanderwende, Lucy H. 1995b. “Ambiguity in the acquisition of lexical
information.” In Proceedings of the AAAI 1995 Spring Symposium
Series, working notes of the symposium on representation and
acquisition of lexical knowledge, 174-179.
Wu, Horng Jyh P. and Steven L. Lytinen. 1990. ‘Coherence relation
reasoning in persuasive discourse.’ In Proceedings of the Twelfth
Annual Conference of the Cognitive Science Society. 503-510.
Zechner, Klaus. 1996. “Fast generation of abstracts from general domain
text corpora by extracting relevant sentences.” In COLING 96: The
16th International Conference on Computational Linguistics, August
5-9, 1996, Copenhagen, Denmark. Proceedings, vol. 2:986-989.