Types of references: evocative references
- Evocative resolution processes:
  - may resolve an anaphor to a referent that is not the linearly closest one, only the hierarchically closest one
  - are based on associations (pattern matching on morpho-semantic features)
  - are fast
  - give fluency to the text
Types of references: post-evocative references
- Post-evocative resolution processes:
  - are inferential processes developed in memory
  - are computationally and cognitively slow (they impose a heavier inference load)
  - require more powerful referencing means (such as proper nouns)
  - are less frequent
Domain of evocative accessibility (DEA)
dea(u) = pref(u, vein(u)) (simplified), where pref(u, vein(u)) is the prefix of vein(u) that ends at u.
Reminder: the vein expression of a terminal node (discourse unit) is the sequence of units required to understand just that unit in the context of the whole discourse.
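The simplified definition can be sketched in a few lines of Python. This is only an illustration: the function name `dea`, the list representation of a vein, and the sample veins are assumptions made for the example, not part of the original slides.

```python
def dea(unit, vein):
    """Simplified domain of evocative accessibility of `unit`.

    `vein` is the vein expression of the unit, given as a list of unit
    numbers in text order (an assumed representation). The result is the
    prefix of the vein that ends at `unit` itself.
    """
    if unit not in vein:
        raise ValueError("a unit always belongs to its own vein")
    return vein[: vein.index(unit) + 1]


# Illustrative veins (hypothetical input data).
print(dea(2, [1, 2, 3, 5]))  # [1, 2]
print(dea(4, [1, 2, 4]))     # [1, 2, 4]
```

In the second call, unit 3 is absent from the result, so nothing introduced only in unit 3 would be evocatively accessible from unit 4.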
Heads and veins
[Figure: a discourse tree over units 1 to 5; each node is annotated with its head expression (H) and its vein expression (V).]
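Why "she" can refer to Mary but not to John's mother
1. John told Mary that he loves her.
2. He has never been married
3. and lived until his 40s with his mother.
4. She, on the contrary, was married twice.
[Figure: discourse tree over units 1 to 4 with the relations antithesis and elaboration (twice); vein expression of unit 4: V = 1 2 4.]
Unit 3, which mentions John's mother, is not on the vein of unit 4, so "She" can reach Mary in unit 1 but not the mother.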
Why the antecedent of "it" is hard to recover
1. With one year left before finishing his mandate as president of the company,
2. Mr. W. Ross has begun to bring about its bankruptcy.
3. There were rumors that he has obtained it by fraud.
[Figure: discourse tree over units 1 to 3 with the relations circumstance and background; vein expression of unit 3: V = 2 3.]
… while here the reference is immediate
1. Mr. W. Ross has begun to bring about the bankruptcy of his company,
2. with one year left before finishing his mandate as president.
3. There were rumors that he has obtained it by fraud.
[Figure: discourse tree over units 1 to 3 with the relations circumstance and background; vein expression of unit 3: V = 1 2 3.]
Experiment 1: evocative vs post-evocative references
Source     No. of units   Total no. of refs   On the veins   Outside the veins
English    62             97                  91.70%         8.30%
French     48             110                 99.10%         0.90%
Romanian   66             111                 95.50%         4.50%
Total      176            318                 95.60%         4.40%
The 4.4% exceptions
[Side annotation on the table: decreasing evoking power]
Type of RE      VT
pragmatic       56.30%
proper nouns    22.70%
common nouns    16.00%
pronouns        5.00%
Experiment 2: potential to establish correct co-reference links
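• Compare the Linear-k and Discourse-VT-k models:
  – For each k, each referring expression re, and each model M (Linear or VT):
\[
p(M\text{-}k, re, DEA_k) =
\begin{cases}
1, & \text{if } re \text{ can be resolved to antecedents in } DEA_k \\
0, & \text{otherwise}
\end{cases}
\]
\[
p(M\text{-}k, Corpus) = \sum_{re \in Corpus} p(M\text{-}k, re, DEA_k)
\]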
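The potential measure can be sketched as follows. Several details are assumptions made for illustration only: DEA_k is taken to be the last k units ending at the unit that holds the referring expression (linearly, or along the prefix of its vein), each reference is a dictionary with hypothetical keys `unit` and `antecedents`, and the toy veins and references are invented sample data.

```python
def linear_dea(unit, k):
    """Linear-k (assumed reading): the last k units ending at `unit`."""
    return list(range(max(1, unit - k + 1), unit + 1))


def vt_dea(unit, vein, k):
    """VT-k (assumed reading): the last k units of the vein prefix ending at `unit`."""
    prefix = vein[: vein.index(unit) + 1]
    return prefix[max(0, len(prefix) - k):]


def potential(antecedent_units, dea_units):
    """1 if some unit holding an antecedent lies in the DEA, else 0."""
    return 1 if any(u in dea_units for u in antecedent_units) else 0


def corpus_potential(refs, dea_for):
    """Sum of the per-reference potentials over the whole corpus."""
    return sum(potential(r["antecedents"], dea_for(r["unit"])) for r in refs)


# Hypothetical toy corpus: two references and the veins of their units.
veins = {3: [1, 3, 5], 5: [1, 3, 5]}
refs = [
    {"unit": 3, "antecedents": {1}},
    {"unit": 5, "antecedents": {1, 3}},
]
k = 2
print(corpus_potential(refs, lambda u: linear_dea(u, k)))        # 0
print(corpus_potential(refs, lambda u: vt_dea(u, veins[u], k)))  # 2
```

With the same window size, the vein-based domain reaches antecedents that the linear window misses, which is the effect the experiment measures.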
Potentials
[Figure: potential (vertical axis, 70% to 95%) plotted against E-DEA size (horizontal axis, 0 to 9) for the VT-k and Linear-k models.]
Experiment 3: the effort required to find antecedents
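• Compare the Linear-k and Discourse-VT-k models:
  – For each k, each referring expression re, and each model M (Linear or VT):
\[
e(M\text{-}k, re, DEA_k) =
\begin{cases}
d < k, & \text{the distance between } re \text{ and the closest antecedent in } DEA_k \\
k, & \text{if no such antecedent exists}
\end{cases}
\]
\[
e(M\text{-}k, Corpus) = \sum_{re \in Corpus} e(M\text{-}k, re, DEA_k)
\]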
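The effort measure can be sketched in the same spirit. The sketch assumes that DEA_k holds the last k units ending at the unit of the referring expression and that distance is counted in steps backwards through that domain, so that d is always smaller than k; the function names and the toy data are hypothetical.

```python
def linear_dea(unit, k):
    """Linear-k (assumed reading): the last k units ending at `unit`."""
    return list(range(max(1, unit - k + 1), unit + 1))


def vt_dea(unit, vein, k):
    """VT-k (assumed reading): the last k units of the vein prefix ending at `unit`."""
    prefix = vein[: vein.index(unit) + 1]
    return prefix[max(0, len(prefix) - k):]


def effort(antecedent_units, dea_units, k):
    """Steps back through the DEA to the closest antecedent, or k if none."""
    for distance, unit in enumerate(reversed(dea_units)):
        if unit in antecedent_units:
            return distance
    return k


def corpus_effort(refs, dea_for, k):
    """Sum of the per-reference efforts over the whole corpus."""
    return sum(effort(r["antecedents"], dea_for(r["unit"]), k) for r in refs)


# Hypothetical toy corpus: two references and the veins of their units.
veins = {3: [1, 3, 5], 5: [1, 3, 5]}
refs = [
    {"unit": 3, "antecedents": {1}},
    {"unit": 5, "antecedents": {1, 3}},
]
k = 2
print(corpus_effort(refs, lambda u: linear_dea(u, k), k))        # 4
print(corpus_effort(refs, lambda u: vt_dea(u, veins[u], k), k))  # 2
```

A lower total means antecedents are found after fewer steps; a reference whose antecedent lies outside the domain costs the full k.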
Effort: an example
[Figure: the referring expressions of the text below (Michael D. Casey, Genetic Therapy Inc., Mr. Casey, the smaller company, Johnson & Johnson, M. James Barrett, Mr. Barrett, J&J, its, its president, chairman, CEO) laid out over units 1 to 9.]
1. Michael D. Casey, a top Johnson & Johnson manager, moved to Genetic Therapy Inc., a small biotechnology concern here,
2. to become its president and chief operating officer.
3. Mr. Casey, 46 years old, was president of J&J's McNeil Pharmaceutical subsidiary,
4. which was merged with another J&J unit, Ortho Pharmaceutical Corp., this year in a cost-cutting move.
5. Mr. Casey succeeds M. James Barrett, 50, as president of Genetic Therapy.
6. Mr. Barrett remains chief executive officer
7. and becomes chairman.
8. Mr. Casey said
9. he made the move to the smaller company.
The account of VT on coherence
• Veins give a natural way to generalize Centering from local to global
Centering Rule 2: transitions
                  Cb(u) = Cb(u-1)   Cb(u) ≠ Cb(u-1)
Cb(u) = Cp(u)     CONTINUING        SMOOTH SHIFT
Cb(u) ≠ Cp(u)     RETAINING         ABRUPT SHIFT
CON > RET > SSH > ASH
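The transition table translates directly into a small classifier. In this sketch the function name and the string labels are choices made for the example; the `NO CB` case for a unit without a backward-looking center goes beyond the four-way table and follows the scoring scheme used later in the deck.

```python
def classify_transition(cb_prev, cb, cp):
    """Centering Rule 2 transition between unit u-1 and unit u.

    cb_prev: backward-looking center Cb(u-1)
    cb:      backward-looking center Cb(u), or None if there is none
    cp:      preferred center Cp(u)
    """
    if cb is None:
        return "NO CB"
    if cb == cb_prev:
        return "CONTINUING" if cb == cp else "RETAINING"
    return "SMOOTH SHIFT" if cb == cp else "ABRUPT SHIFT"


# Entities are plain labels here (hypothetical sample values).
print(classify_transition("J", "J", "J"))  # CONTINUING
print(classify_transition("b", "b", "J"))  # RETAINING
print(classify_transition("b", "p", "p"))  # SMOOTH SHIFT
print(classify_transition("J", "b", "B"))  # ABRUPT SHIFT
```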
[Figure: discourse tree over units 1 to 5 with the vein expressions V(1) = 1 3 5, V(2) = 1 2 3 5, V(3) = 1 3 5, V(4) = 1 3 4 5, V(5) = 1 3 5.]
Vein expressions give "lines of argumentation"
Full text:
1. John sold his bicycle
2. although Bill would have wanted it.
3. He obtained a good price for it,
4. which Bill could not have afforded.
5. Therefore he decided to use the money to go on a trip.
Unit 1, read along its vein (V = 1 3 5):
1. John sold his bicycle
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.
Lines of argumentation
Unit 2, read along its vein (V = 1 2 3 5):
1. John sold his bicycle
2. although Bill would have wanted it.
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.
Lines of argumentation
Unit 3, read along its vein (V = 1 3 5):
1. John sold his bicycle
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.
Lines of argumentation
Unit 4, read along its vein (V = 1 3 4 5):
1. John sold his bicycle
3. He obtained a good price for it,
4. which Bill could not have afforded.
5. Therefore he decided to use the money to go on a trip.
Lines of argumentation
Unit 5, read along its vein (V = 1 3 5):
1. John sold his bicycle
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.
Computation of longest argumentation lines (al)
u   V(u)      dea(u)   al
1   1 3 5     1        -
2   1 2 3 5   1 2      1 2
3   1 3 5     1 3      -
4   1 3 4 5   1 3 4    1 3 4
5   1 3 5     1 3 5    1 3 5
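The table can be reproduced with a short sketch. The selection rule used here is an assumption read off the table rather than a definition stated in the slides: a dea(u) counts as a longest argumentation line when it is not a proper prefix of the dea of another unit. The function names are hypothetical.

```python
def dea(unit, vein):
    """Simplified DEA: the prefix of the vein that ends at `unit`."""
    return tuple(vein[: vein.index(unit) + 1])


def is_proper_prefix(short, long):
    """True if `short` is a strictly shorter prefix of `long`."""
    return len(short) < len(long) and long[: len(short)] == short


def longest_argumentation_lines(veins):
    """Keep each dea(u) that no other dea extends (assumed selection rule)."""
    deas = {u: dea(u, v) for u, v in veins.items()}
    return {
        u: d
        for u, d in deas.items()
        if not any(is_proper_prefix(d, other) for other in deas.values())
    }


# Vein expressions V(u) as listed in the table.
veins = {
    1: [1, 3, 5],
    2: [1, 2, 3, 5],
    3: [1, 3, 5],
    4: [1, 3, 4, 5],
    5: [1, 3, 5],
}
print(longest_argumentation_lines(veins))
# {2: (1, 2), 4: (1, 3, 4), 5: (1, 3, 5)}
```

Units 1 and 3 drop out because their domains, 1 and 1 3, are extended by the domains of later units.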
Evaluating the coherence of a discourse
• A smoothness score:
  – CONTINUING = 4
  – RETAINING = 3
  – SMOOTH SHIFT = 2
  – ABRUPT SHIFT = 1
  – NO Cb = 0
• A global smoothness score: the sum of the scores of all units
The second conjecture (on coherence)
• The global smoothness score of a discourse when computed following VT is at least as high as the score computed following CT.
• But segments, as considered by Centering, typically develop along veins.
• When a linear reading crosses a segment boundary, the transition is usually abrupt.
• Therefore, the claim is that long-distance transitions, as computed along veins, are systematically smoother than the accidental transitions found at segment boundaries.
Transitions and scores on a linear adjacency metric
J = [John], b = [John's bicycle], B = [Bill], p = [price], m = [the money], t = [a trip]
         1      2      3         4      5
Cf       J, b   B, b   J, p, b   p, B   J, m, t
Cb       J      b      b         p      -
Trans           ASH    RET       SSH    No Cb
Score           1      3         2      0
Global   6/4 = 1.5
Transitions and scores on a hierarchical adjacency metric
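Path 1 2
         1      2
Cf       J, b   B, b
Cb       J      b
Trans           ASH
Score           1

Path 1 3 4
         1      3         4
Cf       J, b   J, p, b   p, B
Cb       J      J         p
Trans           CON       SSH
Score           4         2

Path 1 3 5 (the transition from 1 to 3 is already counted in the previous path)
         1      3         5
Cf       J, b   J, p, b   J, m, t
Cb       J      J         J
Trans                     CON
Score                     4

Global   11/4 = 2.75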
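Both global scores can be recomputed from the Cf lists with a short sketch. Some points are assumptions made for the example: Cb(u) is taken as the highest-ranked element of the predecessor's Cf that also appears in Cf(u), the hierarchical predecessor of u is the unit just before u on its vein, and Cb(1) = J is copied from the tables rather than derived. All function names are hypothetical.

```python
SCORE = {"CON": 4, "RET": 3, "SSH": 2, "ASH": 1, "NO CB": 0}

# Forward-looking centers, in rank order, and veins from the bicycle example.
CF = {1: ["J", "b"], 2: ["B", "b"], 3: ["J", "p", "b"], 4: ["p", "B"], 5: ["J", "m", "t"]}
VEINS = {1: [1, 3, 5], 2: [1, 2, 3, 5], 3: [1, 3, 5], 4: [1, 3, 4, 5], 5: [1, 3, 5]}
FIRST_CB = "J"  # Cb of unit 1, taken from the tables (assumption)


def linear_pred(u):
    """Predecessor under linear adjacency."""
    return u - 1 if u > 1 else None


def vein_pred(u):
    """Predecessor under hierarchical adjacency: previous unit on the vein of u."""
    before = VEINS[u][: VEINS[u].index(u)]
    return before[-1] if before else None


def backward_center(u, pred):
    """Highest-ranked center of the predecessor that is realized in u."""
    p = pred(u)
    if p is None:
        return FIRST_CB
    return next((e for e in CF[p] if e in CF[u]), None)


def transition(u, pred):
    cb = backward_center(u, pred)
    cb_prev = backward_center(pred(u), pred)
    cp = CF[u][0]
    if cb is None:
        return "NO CB"
    if cb == cb_prev:
        return "CON" if cb == cp else "RET"
    return "SSH" if cb == cp else "ASH"


def global_score(pred):
    """Transition labels for units 2..5 and the average score per transition."""
    labels = [transition(u, pred) for u in sorted(CF) if pred(u) is not None]
    return labels, sum(SCORE[t] for t in labels) / len(labels)


print(global_score(linear_pred))  # (['ASH', 'RET', 'SSH', 'NO CB'], 1.5)
print(global_score(vein_pred))    # (['ASH', 'CON', 'SSH', 'CON'], 2.75)
```

Only the predecessor function changes between the two calls, which isolates the effect of replacing linear adjacency with adjacency along veins.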
Verifying the second conjecture
Source     No. of transitions   CT score   Average CT score per transition   VT score   Average VT score per transition
English    59                   76         1.25                              84         1.38
French     47                   109        2.35                              116        2.47
Romanian   65                   142        2.18                              152        2.34
Total      173                  327        1.89                              352        2.03
VT references
Cristea, D.; Ide, N.; Romary, L. (1998): Veins Theory. An Approach to Global Cohesion and Coherence. In Proceedings of Coling/ACL '98, Montreal.
Cristea, D.; Ide, N.; Marcu, D.; Tablan, M.-V. (2000): Discourse Structure and Co-Reference: An Empirical Study. In Proceedings of the 18th International Conference on Computational Linguistics, COLING 2000, Luxembourg.
Ide, N.; Cristea, D. (2000): A Hierarchical Account of Referential Accessibility. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, ACL 2000, Hong Kong.
Sereţan, V.; Cristea, D. (2002): The Use of Referential Constraints in Structuring Discourse. In Proceedings of the Third International Conference on Language Resources and Evaluation, LREC 2002, Las Palmas.
Cristea, D. (2005): Motivations and Implications of Veins Theory. In B. Sharp (Ed.), Natural Language Understanding and Cognitive Science, Proceedings of the 2nd International Workshop on Natural Language Understanding and Cognitive Science, NLUCS 2005, in conjunction with ICEIS 2005, Miami, U.S.A., May 2005, INSTICC Press.