Automatic Text Summarization
Katja Filippova
EML Research gGmbH
TU Darmstadt
Text Summarization – 25.02.2009 – p. 1
Text summarization
• A summary is a text that is produced from one or more texts, that contains a significant portion of the information in the original text(s), and that is no longer than half of the original text(s) (Hovy, 2003)
• information retrieval
• stock market prediction
• generation of abstracts
• online news summarization
• ...
Overview
• Introduction
  ➠ classification of summarization systems
  ➠ abstraction vs. extraction
• Text cohesion and coherence for summarization
  ➠ graph based methods
  ➠ discourse structure based methods
• Document Understanding Conference
  ➠ tasks
  ➠ an example
• Research directions
  ➠ sentence fusion and compression
  ➠ integrating world knowledge
Text summarization: types
• A summary is a text that is produced from one or more texts, that contains a significant portion of the information in the original text(s), and that is no longer than half of the original text(s) (Hovy, 2003)
• Indicative
  ➠ indicates types of information
  ➠ “alerts”
• Informative
  ➠ includes quantitative/qualitative information
  ➠ “informs”
• Critical/evaluative
  ➠ evaluates the content of the document
Text summarization: types
INDICATIVE
• The work of Consumer Advice Centres is examined. The information sources used to support this work are reviewed. The recent closure of many CACs has seriously affected the availability of consumer information and advice. The contribution that public libraries can make in enhancing the availability of consumer information and advice both to the public and other agencies involved in consumer information and advice, is discussed.
INFORMATIVE
• An examination of the work of Consumer Advice Centres and of the information sources and support activities that public libraries can offer. CACs have dealt with pre-shopping advice, education on consumers’ rights and complaints about goods and services, advising the client and often obtaining expert assessment. They have drawn on a wide range of information sources including case records, trade literature, contact files and external links. The recent closure of many CACs has seriously affected the availability of consumer information and advice. Libraries can cooperate closely with advice agencies through local coordinating committees, shared premises, joint publicity, referral and the sharing of professional expertise.
Text summarization: types
• Source: single-document vs. multi-document
  ➠ research paper
  ➠ proceedings of a conference
• Content: generic vs. query-based vs. user-focused
  ➠ equal coverage of all major topics
  ➠ based on a question “what are the causes of the war?”
  ➠ users interested in chemistry
• Form: extract vs. abstract
  ➠ fragments from the document
  ➠ newly re-written text
Extraction vs. abstraction
How should a text summarization system proceed?
• read the documents
• understand them – build a semantic representation
• generate a summary from this representation
Extraction vs. abstraction
• unfortunately, a rich semantic representation is not possible yet
• to date, most summarization systems are extractive
• usually, extraction units are sentences
• low cost solution: could work without ontologies, complex representations, etc.
• extractive summaries are usually incoherent
• trade-off between non-redundancy and completeness
Extraction vs. abstraction
Three sentences from related documents (Oct. 27 2008):
• The Syrian foreign minister today condemned the killing of eight civilians in a US raid as an act of "criminal and terrorist aggression". (The Guardian)
• Syria accused the United States on Monday of carrying out a "terrorist aggression" after a deadly raid near its border with Iraq which it said killed eight civilians. (Reuters)
• Lebanese President Michel Suleiman on Monday contacted his Syrian counterpart Bashar Assad to denounce "Sunday’s American aggression" against the Syrian village of Abu Kamal near the border with Iraq, local Elnashra website reported. (Aljazeera)
Extraction vs. abstraction
• extractive summaries are not coherent – sentences pulled out from different documents each make sense but sound awkward when put together
• unresolved pronouns may distort the meaning
• beginning with a sentence which starts with However, ... is not a good idea
• there is a striking difference from human-generated texts – pronouns and connectives are in the right place, the flow of discourse makes sense
• How could one use this property of natural discourse for summarization?
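The two failure modes named above (a summary sentence that opens with a connective such as However, or with a pronoun whose antecedent was left behind) can be caught with a shallow check. The sketch below is only an illustration: the word lists and the function name `dangling_opening` are assumptions made for this example, and a real system would rely on coreference resolution rather than on the first word alone.

```python
import re
from typing import Optional

# Hypothetical, deliberately small word lists (assumptions for this sketch).
CONNECTIVES = {"however", "but", "therefore", "moreover", "nevertheless"}
PRONOUNS = {"he", "she", "it", "they", "this", "these", "his", "her", "their"}

def dangling_opening(sentence: str) -> Optional[str]:
    """Return a reason if the sentence opens with a word that needs prior context."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    if not words:
        return None
    if words[0] in CONNECTIVES:
        return "connective"
    if words[0] in PRONOUNS:
        return "pronoun"
    return None

for s in ["John enjoys playing the piano.", "However, he woke up early yesterday."]:
    print(s, "->", dangling_opening(s))
```

The check over-triggers on harmless openings such as an expletive "It is ...", so it is better suited to flagging candidates for repair than to rejecting them outright.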
Text coherence vs. text cohesion
• John enjoys playing the piano. John wants to become a famous piano player. John works hard and works hard every day. Working hard is necessary to become a famous piano player.
• John enjoys playing the piano. However, he woke up early yesterday. But the day before yesterday the weather was wonderful, because rain and snow started immediately and continued the whole day through. By the way, his teacher did the same.
• John enjoys playing the piano and wants to become famous. He works hard and does it every day because it is necessary for his goal.
Text coherence vs. text cohesion
• Text coherence represents the overall structure of a multi-sentence text in terms of macro-level relations between clauses or sentences (Halliday & Hasan, 1996).
  ➠ Rhetorical Structure Theory (Mann & Thompson, 1988)
  ➠ Discourse Representation Theory (Kamp, 1981)
  ➠ Discourse Lexicalized Tree Adjoining Grammar (Forbes et al., 2001)
• John enjoys playing the piano. [John wants to become a famous piano player.] (that’s why) [John works hard and works hard every day.] Working hard is necessary to become a famous piano player.
Text coherence vs. text cohesion
• Text cohesion involves relations between words, word senses, or referring expressions, which determine how tightly connected the text is (Halliday & Hasan, 1996).
  ➠ anaphora, ellipsis, connectives
  ➠ synonymy and other lexical relations
• John enjoys playing the piano. However, he woke up early yesterday. But the day before yesterday the weather was wonderful, because rain and snow started immediately and continued the whole day through. By the way, his teacher did the same.
Coherence based summarization
• earlier systems considered technical documents and aimed at identifying important information by assigning weights to sentences (Luhn, 1958; Edmundson, 1969)
• several weighted features were used:
  ➠ word (stem) frequency
  ➠ presence of cue words (e.g., as a result, significant) which signal important content
  ➠ sentence position
  ➠ document structure
• feature weights were tuned manually
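The weighted-feature idea can be sketched as a linear combination of per-sentence feature values. The code below is a minimal illustration in the spirit of these early systems, not a reimplementation of Luhn or Edmundson: it uses raw word frequency instead of stems, leaves out the document-structure feature, and the stopword list, feature definitions and default weights are all assumptions chosen for the example.

```python
import re
from collections import Counter

CUE_PHRASES = ("as a result", "significant")  # the two cue examples from the slide
STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "and", "to", "as"}  # toy list

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def score_sentences(sentences, w_freq=1.0, w_cue=1.0, w_pos=1.0):
    """Score each sentence as a weighted sum of three surface features."""
    freq = Counter(t for s in sentences for t in tokenize(s) if t not in STOPWORDS)
    top = max(freq.values(), default=1)
    scores = []
    for i, s in enumerate(sentences):
        content = [t for t in tokenize(s) if t not in STOPWORDS]
        # Average normalised document frequency of the sentence's content words
        f_freq = sum(freq[t] / top for t in content) / len(content) if content else 0.0
        # 1 if any cue phrase occurs in the sentence, else 0
        f_cue = 1.0 if any(c in s.lower() for c in CUE_PHRASES) else 0.0
        # Earlier sentences receive a higher position score
        f_pos = 1.0 / (i + 1)
        scores.append(w_freq * f_freq + w_cue * f_cue + w_pos * f_pos)
    return scores

sentences = [
    "The market fell on Monday.",
    "Analysts blamed the weather.",
    "As a result, the market closed early.",
]
print(score_sentences(sentences))
```

Changing the three weights by hand and inspecting the resulting ranking mirrors the manual tuning mentioned in the last bullet.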
Coherence based summarization
• Rhetorical Structure Theory (Mann & Thompson, 1988)
  ➠ elaboration
  ➠ example
  ➠ contrast
  ➠ background
  ➠ motivation
  ➠ etc.
[Figure: discourse tree over the segments "I am optimistic" / said Mr. Smith / as the market plunged., labelled with the relations Attribution and Circumstance (from Sporleder & Lapata, 2005)]
Coherence based summarization
• one could use discourse structure for summarization (Marcu, 2000)
• however, this is not done often:
  ➠ there are few discourse parsers and they are not very precise
  ➠ it is debated whether a tree representation is sufficient for discourse (Wolf & Gibson, 2005)
  ➠ it is not obvious how to classify rhetorical relations
  ➠ some relations are argued to be anaphoric rather than discourse relations (Webber et al., 2003)
Cohesion based summarization
• it is common to represent a text as a graph, where nodes are sentences and edges are some relations between them (e.g., discourse relations or just similarity)
• a common graph connectivity assumption is that the nodes which are connected to many other nodes are likely to carry salient information
• it is also assumed that nodes whose removal affects the structure of the document are important (Skorochodko, 1972, cited in Mani, 2001)
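A small sketch of the connectivity assumption: sentences become nodes, an edge is added when two sentences are similar enough, and nodes are ranked by their number of neighbours. Cosine similarity over raw word counts and the threshold value are assumptions made for this illustration; the removal-based criterion attributed to Skorochodko is not implemented here.

```python
import math
import re
from collections import Counter

def bow(sentence):
    """Bag-of-words vector as a Counter over lowercased word tokens."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def sentence_graph(sentences, threshold=0.25):
    """Adjacency sets: i and j are linked if their cosine similarity reaches the threshold."""
    vecs = [bow(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if cosine(vecs[i], vecs[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def degree_ranking(adj):
    """Node indices sorted by number of neighbours, most connected first."""
    return sorted(adj, key=lambda i: (-len(adj[i]), i))

sentences = [
    "John enjoys playing the piano.",
    "John wants to become a famous piano player.",
    "The weather was wonderful yesterday.",
]
graph = sentence_graph(sentences)
print(graph, degree_ranking(graph))
```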
Cohesion based summarization
• modern approaches extend this idea and use PageRank (Brin & Page, 1998) to find salient nodes in such a graph (Erkan & Radev, 2004; Mihalcea & Tarau, 2004)
• similar sentences are connected (bag-of-words similarity)
• a similarity threshold is used
• the top N PageRank-scored sentences are extracted
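The three steps above (connect similar sentences, apply a threshold, extract the top N by PageRank) can be sketched as follows. This is a simplified illustration and not the LexRank or TextRank implementation: the graph is unweighted, isolated sentences simply keep the baseline score, and the threshold, damping factor and iteration count are assumed values.

```python
import math
import re
from collections import Counter

def bow(sentence):
    return Counter(re.findall(r"[a-z]+", sentence.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def pagerank_sentences(sentences, threshold=0.1, damping=0.85, iterations=50):
    """PageRank scores over an undirected, thresholded sentence-similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vecs = [bow(s) for s in sentences]
    neighbours = [
        [j for j in range(n) if j != i and cosine(vecs[i], vecs[j]) >= threshold]
        for i in range(n)
    ]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new = []
        for i in range(n):
            # Each neighbour j passes on an equal share of its current rank
            incoming = sum(rank[j] / len(neighbours[j]) for j in neighbours[i])
            new.append((1 - damping) / n + damping * incoming)
        rank = new
    return rank

def top_n(sentences, n=2, **kwargs):
    """The n highest-ranked sentences, returned in document order."""
    rank = pagerank_sentences(sentences, **kwargs)
    best = sorted(range(len(sentences)), key=lambda i: -rank[i])[:n]
    return [sentences[i] for i in sorted(best)]
```

Returning the selected sentences in document order is a design choice of this sketch; for multi-document input a separate ordering step would be needed.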
Coherence vs. cohesion based TS
• Coherence:
  + transparent; coherence of the output can be improved
  – annotation of relations is still a challenge; preprocessing difficulties
• Cohesion:
  + intuitively appealing; low-cost; even unsupervised
  – requires WSD*, anaphora resolution; hard to pin down; tuned thresholds
* word sense disambiguation
DUC competitions
• Document Understanding Conferences (2000-2007)
  ➠ from 2008: Text Analysis Conference (TAC)
• provide participants with
  ➠ a task
  ➠ data
  ➠ manual and automatic evaluation
• increasing challenge in tasks: from generic single-document summarization to multi-document update summary (2008)
DUC competitions
Sample topic: D0740I
round-the-world balloon flight
Report on the planning, attempts and first successful balloon circumnavigation of the earth by Bertrand Piccard and his crew.
DUC competitions
<DOC>
<DOCNO> APW19981112.0453 </DOCNO>
<DOCTYPE> NEWS STORY </DOCTYPE>
<DATE_TIME> 11/12/1998 08:21:00 </DATE_TIME>
<HEADER> w1942 &Cx1f; wstm- r i &Cx13; &Cx11; BC-Switzerland-BalloonQu
11-12 0355 </HEADER>
<BODY>
<SLUG> BC-Switzerland-Balloon Quest </SLUG> <HEADLINE> Swiss challenger
prepares third attempt at global record </HEADLINE> &UR; AP Photos GEV
101-102 &QL; <TEXT> GENEVA (AP) _ Swiss balloon pilot Bertrand Piccard
and his new teammate, British flight engineer Tony Brown, said Thursday
they will be ready later this month for a new attempt to fly nonstop
round the world. Their new Breitling Orbiter 3 balloon will take off
from Chateau d’Oex, in the Swiss Alps, as soon after Nov. 25 as weather
conditions are favorable, they said. It will be Piccard’s third attempt
to become the first to pilot a balloon around the world. In February
the Swiss pilot, along with British flight engineer Andy Elson and
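Before any linguistic preprocessing, fields such as the document number, headline and body text have to be pulled out of this markup. The sketch below uses regular expressions rather than an XML parser, on the assumption that wire-service entities such as `&Cx1f;` may not be accepted by a strict parser. The `SAMPLE` string is a shortened copy of the document above, and its closing tags are assumed, because the slide cuts the document off mid-sentence; `field` is a helper name invented for this example.

```python
import re

# Shortened copy of the sample document; the closing tags are an assumption.
SAMPLE = """<DOC>
<DOCNO> APW19981112.0453 </DOCNO>
<DOCTYPE> NEWS STORY </DOCTYPE>
<DATE_TIME> 11/12/1998 08:21:00 </DATE_TIME>
<BODY>
<HEADLINE> Swiss challenger prepares third attempt at global record </HEADLINE>
<TEXT> GENEVA (AP) _ Swiss balloon pilot Bertrand Piccard and his new
teammate, British flight engineer Tony Brown, said Thursday they will be
ready later this month for a new attempt to fly nonstop round the world. </TEXT>
</BODY>
</DOC>"""

def field(doc, tag):
    """Text of the first <tag>...</tag> pair, whitespace-normalised, or None."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", doc, flags=re.DOTALL)
    return " ".join(match.group(1).split()) if match else None

record = {tag: field(SAMPLE, tag) for tag in ("DOCNO", "DATE_TIME", "HEADLINE", "TEXT")}
print(record["DOCNO"], "|", record["HEADLINE"])
```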
The EML NLP group at DUC 2007
Preprocessing: Annotation
• Sentence splitting
• Tokenization
• PoS tagging
• Chunking
• Named entity recognition
Preprocessing: Problems
• Sentence splitting
<sentence>At Pine Ridge, a scrolling marquee at Big Bat’s Texaco expressed both joy over Clinton’s visit and wariness of all the official attention: “Welcome President Clinton.</sentence> <sentence>Remember our treaties,” the sign read.</sentence>
• and cleaning
<sentence> PINE RIDGE, S.D.</sentence>
<sentence> (AP) - President Clinton turned the attention of his national poverty tour today to arguably the poorest, most forgotten U.S. citizens of them all: American Indians.</sentence>
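The first error above comes from splitting at a full stop inside an open quotation. The sketch below contrasts a naive punctuation-based splitter, which reproduces that error, with a variant that refuses to split while a double quotation mark is open. Both functions are illustrations written for this example, not the splitter used in the system, and neither handles abbreviations such as "S.D.", so the dateline problem in the second example remains.

```python
import re

def split_naive(text):
    """Split after ., ! or ? followed by whitespace; this reproduces the error above."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

def split_quote_aware(text):
    """Same rule, but never split while a double quotation mark is still open."""
    sentences, start, open_quote = [], 0, False
    for i, ch in enumerate(text):
        if ch in "“”\"":
            # Curly quotes say whether they open or close; a straight quote toggles
            open_quote = ch == "“" or (ch == '"' and not open_quote)
        elif ch in ".!?" and not open_quote and (i + 1 == len(text) or text[i + 1].isspace()):
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    if text[start:].strip():
        sentences.append(text[start:].strip())
    return sentences

text = ("The marquee expressed joy and wariness: “Welcome President Clinton. "
        "Remember our treaties,” the sign read. It was large.")
print(len(split_naive(text)), len(split_quote_aware(text)))
```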
Preprocessing: Document filtering
• Match topic with document extracts
• Pick the top 5 matching documents
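A minimal sketch of such a filter, assuming that "matching" means word overlap between the topic statement and a short extract of each document. The overlap measure, the toy stopword list and the function names are assumptions for this illustration; the slides do not specify how the match was computed.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "on", "and", "by", "his", "in", "to"}  # toy list

def content_words(text):
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def top_documents(topic, documents, k=5):
    """Rank document ids by the fraction of topic words found in their extract."""
    topic_words = content_words(topic)

    def overlap(doc_id):
        shared = topic_words & content_words(documents[doc_id])
        return len(shared) / max(len(topic_words), 1)

    # Ties are broken by document id so that the result is deterministic
    return sorted(documents, key=lambda d: (-overlap(d), d))[:k]

docs = {
    "d1": "Piccard prepares a balloon flight round the world",
    "d2": "Stock markets fell",
    "d3": "A balloon festival",
}
print(top_documents("round-the-world balloon flight", docs, k=2))
```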
Semantic analysis
• Filter topic
• Connect topic words with words in document sentences
• Compute sentence scores
  ➠ matching words
  ➠ matching word sequences
➠ ranked list of sentences
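The scoring step can be illustrated with exact matches only: count the words a sentence shares with the topic, count the two-word sequences it shares, and combine the two with weights. The weights and the restriction to bigrams are assumptions for this sketch; the topic filtering and the semantic connection between topic words and sentence words are not modelled.

```python
import re

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

def ngrams(seq, n):
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def sentence_score(topic, sentence, w_word=1.0, w_seq=2.0):
    """Weighted count of shared words and shared two-word sequences."""
    t, s = tokens(topic), tokens(sentence)
    shared_words = len(set(t) & set(s))
    shared_bigrams = len(ngrams(t, 2) & ngrams(s, 2))
    return w_word * shared_words + w_seq * shared_bigrams

def rank_sentences(topic, sentences):
    """Sentences sorted from best to worst match (stable for ties)."""
    return sorted(sentences, key=lambda s: -sentence_score(topic, s))

topic = "balloon flight around the world"
candidates = [
    "The flight of the balloon was long.",
    "Piccard completed the balloon flight.",
    "Nothing relevant here.",
]
print(rank_sentences(topic, candidates))
```

Giving word sequences a higher weight than single words rewards sentences that repeat a topic phrase intact, here "balloon flight".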
Extractive summary generation
• Rerank sentences
• Select the top non-redundant sentences (250 word limit)
• Re-arrange sentences
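The selection step can be sketched as a greedy walk down the ranked list that skips sentences too similar to those already chosen and stops adding once the word budget is used up. Jaccard overlap and the 0.5 cut-off are assumptions made for this illustration; only the 250-word limit comes from the slide, and the final re-arrangement step is not shown.

```python
import re

def words(text):
    return re.findall(r"\w+", text.lower())

def jaccard(a, b):
    a, b = set(words(a)), set(words(b))
    return len(a & b) / len(a | b) if a | b else 0.0

def select(ranked, word_limit=250, max_overlap=0.5):
    """Walk down the ranked list, skipping near-duplicates and over-budget sentences."""
    summary, used = [], 0
    for sentence in ranked:
        length = len(words(sentence))
        if used + length > word_limit:
            continue
        if any(jaccard(sentence, kept) > max_overlap for kept in summary):
            continue
        summary.append(sentence)
        used += length
    return summary

ranked = [
    "Piccard and Jones completed the flight.",
    "Piccard and Jones completed the flight today.",
    "The balloon landed in Egypt.",
]
print(select(ranked))
```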
A good summary
Round-the-world balloon flight: Report on the planning, attempts and first successful balloon circumnavigation of the earth by Bertrand Piccard and his crew.
Swiss balloon pilot Bertrand Piccard announced Wednesday that he has chosen Brian Jones as his teammate for his next attempt at circling the world in a balloon. Jones, 52, replaces fellow British flight engineer Tony Brown. Achieving what promoters called the last great milestone of aviation, Bertrand Piccard and Brian Jones joined legends like the Wright Brothers and Charles Lindbergh with Saturday’s completion of the first manned round-the-world balloon flight. At 4:54 a.m. EST Saturday, the two balloonists crossed the line of longitude from which they had departed on March 1 at Chateau D’Oex, Switzerland, ...
A bad summary
Angelina Jolie: What have been the most recent significant events in the life and career of actress Angelina Jolie?
Angelina Jolie’s win for best supporting actress for her role in “Girl, Interrupted” came 21 years after father Jon Voight was awarded best actor for “Coming Home.“ ANGELINA JOLIE’S LIFE ON THE EDGE After all, her career is in overdrive. But Jolie cautions that she’s still a serious actress. It’s not like I’m suddenly a better actress because I have awards or this box office clout,” she says. “I am secure in the fact that I do have something to offer as an actress,” Jolie says. ‘...
Evaluation
• automatic evaluation with ROUGE (Lin, 2004)
• manual evaluation with respect to
  ➠ responsiveness
  ➠ linguistic quality
    1. grammaticality
    2. non-redundancy
    3. referential clarity
    4. focus
    5. structure and coherence
• our system scored above the average, top 5 for non-redundancy and coherence (recall the document filtering stage)
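For the automatic part, ROUGE-N is an n-gram recall measure: the number of reference n-grams that also occur in the candidate summary, with counts clipped to the candidate, divided by the total number of reference n-grams. The sketch below implements only this core ratio; the tokenizer is an assumption, and options of the official toolkit such as stemming, stopword removal and jackknifing are left out.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    toks = re.findall(r"\w+", text.lower())
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def rouge_n(candidate, references, n=1):
    """N-gram recall of the candidate against one or more reference summaries."""
    cand = ngram_counts(candidate, n)
    matched = total = 0
    for ref in references:
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: an n-gram counts at most as often as it occurs in the candidate
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0

reference = "the balloon landed in egypt"
print(rouge_n("the balloon landed", [reference], n=1))
print(rouge_n("the balloon landed", [reference], n=2))
```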
Research directions
• as in information retrieval, query expansion is expected to improve recall
  ➠ WordNet (Fellbaum, 1998) for similarity
  ➠ Wikipedia for relatedness (Strube & Ponzetto, 2006)
  ➠ paraphrases
• coreference resolution is needed for preprocessing; otherwise, e.g., pronouns are filtered as stopwords
• relevance vs. redundancy issue: in multi-document summarization (MDS), how can we ensure non-redundancy of the summary? (Carbonell & Goldstein, 1998)
• sentence ordering for extractive MDS (Barzilay & Lapata, 2005)
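The relevance vs. redundancy trade-off is what Maximal Marginal Relevance (MMR) addresses: at each step it picks the candidate that maximises lambda times its similarity to the query minus (1 - lambda) times its highest similarity to anything already selected. The sketch below follows that scoring rule; the Jaccard similarity, the value of lambda and the example sentences (loosely modelled on the Syria example earlier in the slides) are assumptions for illustration.

```python
import re

def word_set(text):
    return set(re.findall(r"\w+", text.lower()))

def sim(a, b):
    """Jaccard overlap of word sets; a stand-in for any similarity measure."""
    a, b = word_set(a), word_set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def mmr(query, candidates, k=2, lam=0.7):
    """Greedily pick k sentences, trading query relevance against redundancy."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(s):
            redundancy = max((sim(s, t) for t in selected), default=0.0)
            return lam * sim(s, query) - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

query = "US raid Syria civilians"
candidates = [
    "US raid in Syria killed eight civilians",
    "US raid in Syria killed eight civilians today",
    "Lebanon denounced the raid on Syria",
]
print(mmr(query, candidates, k=2, lam=0.7))
```

With lam=1.0 the redundancy term vanishes and the two near-identical sentences are both selected, which is the behaviour MMR is meant to avoid.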
Directions of research
• abstractive summarization is a distant goal but there are ways to go beyond sentence extraction
  ➠ sentence compression
  ➠ sentence fusion
Sentence compression
This is true, regardless of the opinion that some people have of Syria, and of their unhappiness at Syria’s presence in Lebanon.
• summarization on the sentence level
• in principle, a compression can be different from the input (different wording and structure)
• to date, most systems use word deletion only
• a compression corpus is now available online: http://homepages.inf.ed.ac.uk/s0460084/data
• the performance can be evaluated automatically
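Under the word-deletion view, a compression is a subsequence of the source tokens, which makes simple automatic checks possible. The sketch below verifies the subsequence property, computes the compression rate, and scores a system output against a gold compression with bag-of-tokens F1. The F1 measure is a simple stand-in chosen for this example rather than the metric used with the corpus, and the compressed sentence is hypothetical, since the slide text does not preserve which words were deleted.

```python
import re
from collections import Counter

def toks(text):
    """Word tokens plus single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def is_deletion_compression(source, compressed):
    """True if the compression keeps a subsequence of the source tokens."""
    remaining = iter(toks(source))
    # 'tok in remaining' advances the iterator, so order is enforced
    return all(tok in remaining for tok in toks(compressed))

def compression_rate(source, compressed):
    """Fraction of source tokens that survive."""
    return len(toks(compressed)) / len(toks(source))

def token_f1(system, gold):
    """Bag-of-tokens F1 between a system compression and a gold compression."""
    s, g = Counter(toks(system)), Counter(toks(gold))
    overlap = sum((s & g).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / sum(s.values()), overlap / sum(g.values())
    return 2 * precision * recall / (precision + recall)

source = ("This is true, regardless of the opinion that some people have of Syria, "
          "and of their unhappiness at Syria's presence in Lebanon.")
compressed = "This is true, regardless of the opinion that some people have of Syria."  # hypothetical
print(is_deletion_compression(source, compressed), round(compression_rate(source, compressed), 2))
```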
Sentence fusion
1 John Smith, born November 15 1900, studied chemistry and physics at the University of London.
2 From 1917 Mr. Smith studied at the University of London and in 1921 he graduated with distinction.
➠ Mr. Smith studied chemistry and physics at the University of London from 1917.
• pieces of related sentences are used to generate a novel sentence
• can be seen as a middle ground between extractive and abstractive summarization
• addresses the incompleteness-redundancy problem
Thank you!
(FOR YOUR ATTENTION)
References
• R. Barzilay & M. Lapata, 2005: Modeling local coherence: An entity-based approach
• S. Brin & L. Page, 1998: The anatomy of a large-scale hypertextual web search engine
• J. G. Carbonell & J. Goldstein, 1998: The use of MMR, diversity-based reranking for reordering documents and producing summaries
• H. P. Edmundson, 1969: New methods in automatic extracting
• G. Erkan & D. Radev, 2004: LexRank: Graph-based lexical centrality as salience in text summarization
• C. Fellbaum, 1998: WordNet: An electronic lexical database
• K. Forbes, E. Miltsakaki, R. Prasad, A. Sarkar, A. Joshi, B. L. Webber, 2001: DLTAG system – discourse parsing with a Lexicalized Tree Adjoining Grammar
• M. Halliday & R. Hasan, 1996: Cohesion in text
• E. H. Hovy, 2003: Text summarization
• H. Kamp, 1981: A theory of truth and semantic representation
• C.-Y. Lin, 2004: Automatic evaluation of summaries using N-gram co-occurrence statistics
• H. P. Luhn, 1958: The automatic creation of literature abstracts
• I. Mani, 2001: Automatic summarization
• W. C. Mann & S. A. Thompson, 1988: Rhetorical structure theory. Towards a functional theory of text organization
• D. Marcu, 2000: The theory and practice of discourse parsing and summarization
• R. Mihalcea & P. Tarau, 2004: TextRank: Bringing order into text
• E. Skorochodko, 1972: Adaptive method of automatic abstracting and indexing
• C. Sporleder & M. Lapata, 2005: Discourse chunking and its application to sentence compression
• M. Strube & S. P. Ponzetto, 2006: WikiRelate! Computing semantic relatedness using Wikipedia
• B. L. Webber, M. Stone, A. Joshi, A. Knott, 2003: Anaphora and discourse structure
• F. Wolf & E. Gibson, 2005: Representing discourse coherence: A corpus-based study