
Automatic Summarization: A Tutorial
Presented at RANLP’2003

Inderjeet Mani

Georgetown University

Tuesday, September 9, 2003, 2-5:30 pm

@georgetown.edu

complingone.georgetown.edu/~linguist/inderjeet.html


AGENDA

14:10  I. Fundamentals (Definitions, Human Abstracting, Abstract Architecture)
14:40  II. Extraction (Shallow Features, Revision, Corpus-Based Methods)
15:30  Break
16:00  III. Abstraction (Template and Concept-Based)
16:30  IV. Evaluation
17:00  V. Research Areas (Multi-document, Multimedia, Multilingual Summarization)
17:30  Conclusion


Human Summarization is all around us

- Headlines: newspapers, Headline News
- Table of contents: of a book, magazine, etc.
- Preview: of a movie
- Digest: TV or cinema guide
- Highlights: meeting dialogue, email traffic
- Abstract: summary of a scientific paper
- Bulletin: weather forecast, stock market, ...
- Biography: resume, obituary, tombstone
- Abridgment: Shakespeare for kids
- Review: of a book, a CD, play, etc.
- Scale-downs: maps, thumbnails
- Sound bite/video clip: from speech, conversation, trial


Current Applications

Multimedia news summaries: watch the news and tell me what happened while I was away

Physicians' aids: summarize and compare the recommended treatments for this patient

Meeting summarization: find out what happened at that teleconference I missed

Search engine hits: summarize the information in hit lists retrieved by search engines

Intelligence gathering: create a 500-word biography of Osama bin Laden

Hand-held devices: create a screen-sized summary of a book

Aids for the Handicapped: compact the text and read it out for a blind person


Example BIOGEN Biographies

Vernon Jordan is a presidential friend and a Clinton adviser. He is 63 years old. He helped Ms. Lewinsky find a job. He testified that Ms. Monica Lewinsky said that she had conversations with the president, that she talked to the president. He has numerous acquaintances, including Susan Collins, Betty Currie, Pete Domenici, Bob Graham, James Jeffords and Linda Tripp.

Henry Hyde is a Republican chairman of House Judiciary Committee and a prosecutor in Senate impeachment trial. He will lead the Judiciary Committee's impeachment review. Hyde urged his colleagues to heed their consciences, “the voice that whispers in our ear, ‘duty, duty, duty.’”

Victor Polay is the Tupac Amaru rebels' top leader, founder and the organization's commander-and-chief. He was arrested again in 1992 and is serving a life sentence. His associates include Alberto Fujimori, Tupac Amaru Revolutionary, and Nestor Cerpa.


Columbia University’s Newsblaster

www.cs.columbia.edu/nlp/newsblaster/summaries/11_03_02_5.html


Michigan’s MEAD


Terms and Definitions

Text Summarization

- The process of distilling the most important information from a source (or sources) to produce an abridged version for a particular user (or users) and task (or tasks).

Extract vs. Abstract

- An extract is a summary consisting entirely of material copied from the input

- An abstract is a summary at least some of whose material is not present in the input, e.g., subject categories, paraphrase of content, etc.


Illustration of Extracts and Abstracts

25 Percent Extract of Gettysburg Address (sents 1, 2, 6)

Fourscore and seven years ago our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. The brave men, living and dead, who struggled here, have consecrated it far above our poor power to add or detract.

10 Percent Extract (sent 2)

Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.

15 Percent Abstract

This speech by Abraham Lincoln commemorates soldiers who laid down their lives in the Battle of Gettysburg. It offers an eloquent reminder to the troops that it is the future of freedom in America that they are fighting for.


Illustration of the power of human abstracts

President Calvin Coolidge, Grace Coolidge, and dog, Rob Roy, c.1925. Plymouth Notch, Vermont.

Mrs. Coolidge: What did the preacher discuss in his sermon?
President Coolidge: Sin.
Mrs. Coolidge: What did he say?
President Coolidge: He said he was against it.

- Bartlett’s Quotations (via Graeme Hirst)


Summary Function

Indicative summaries
- An indicative abstract provides a reference function for selecting documents for more in-depth reading.

Informative summaries
- An informative abstract covers all the salient information in the source at some level of detail.

Evaluative summaries
- A critical abstract evaluates the subject matter of the source, expressing the abstractor's views on the quality of the work of the author.

The indicative/informative distinction is a prescriptive distinction, intended to guide professional abstractors (e.g., ANSI 1996).


User-Oriented Summary Types

Generic summaries

- aimed at a particular - usually broad - readership community

Tailored summaries (aka user-focused, topic-focused, query-focused summaries)
- tailored to the requirements of a particular user or group of users
- User’s interests: full-blown user models; profiles recording subject area terms; a specific query
- A user-focused summary needs, of course, to take into account the influence of the user as well as the content of the document. A user-focused summarizer usually includes a parameter to influence this weighting.


Summarization Architecture

[Architecture diagram: Source(s) (characterized by span, genre, media, language) pass through Analysis, Transformation, and Synthesis to yield Summaries (characterized by audience, function, type (extract/abstract), coherence, compression).]


Characteristics of Summaries

Reduction of information content

- Compression Rate (also known as condensation rate or reduction rate), measured by summary length / source length, expressed as a percentage (0 < c < 100)

- Target Length

Informativeness

- Fidelity to Source

- Relevance to User’s Interests

Well-formedness/Coherence
- Syntactic and discourse-level
  - Extracts: need to avoid gaps, dangling anaphors, ravaged tables, lists, etc.
  - Abstracts: need to produce grammatical, plausible output


Relation of Summarization to Other Tasks

Document Retrieval & Filtering
- Similarities: relevance; extraction as passage retrieval
- Differences: condensation rate isn't a parameter (although output may avail of summarization)

Text Mining
- Similarities: discovery procedure (multi-source summarization)
- Differences: condensation rate isn't a parameter

Information Extraction
- Similarities: only if doc is mainly about extracted info
- Differences: condensation rate isn't a parameter (though it could be one); condensation rate isn't applicable if document isn't about template

Text Compression
- Similarities: leverages redundancy in a message to condense it
- Differences: for efficient storage and transmission of information, not for human consumption


One Text, Many Summaries (Evaluation preview)

25 Percent Leading Text Extract (first 3 sentences) - seems OK, too!

Four score and seven years ago our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met here on a great battlefield of that war.

15 Percent Synopsis by human (critical summary) - seems even better!

This speech by Abraham Lincoln commemorates soldiers who laid down their lives in the Battle of Gettysburg. It offers an eloquent reminder to the troops that it is the future of freedom in America that they are fighting for.

11 Percent Extract (by human, out of context) - is bad! (sents 5, 8)

It is altogether fitting and proper that we should do this. The world will little note, nor long remember, what we say here, but it can never forget what they did here.

We can usually tell when a summary is incoherent, but how do we evaluate summaries in general?


Studies of human summaries

Cremmins (1996) prescribed that abstractors
- use surface features: headings, key phrases, position
- use discourse features: overall text structure
- revise and edit abstracts

Liddy (1991)
- studied 276 abstracts structured in terms of background, purpose, methodology, results and conclusions

Endres-Niggemeyer et al. (1995, 1998) found abstractors
- use top-down strategy exploiting discourse structure
- build topic sentences, use beginnings/ends as relevant, prefer top-level segments, examine passages/paragraphs before individual sentences, exploit outlines, formatting ...


Endres-Niggemeyer et al. (1995, 1998)

Abstractors never attempt to read the document from start to finish.

Instead, they use the structural organization of the document, including formatting and layout (the scheme) to skim the document for relevant passages, which are fitted together into a discourse-level representation (the theme).

This representation uses discourse-level rhetorical relations to link relevant text elements capturing what the document is about.

They use a top-down strategy, exploiting document structure, and examining paragraphs and passages before individual sentences.

The skimming for relevant passages exploits specific shallow features such as:

- cue phrases (especially in-text summaries)

- location of information in particular structural positions (beginning of the document, beginning and end of paragraphs)

- information from the title and headings.


Stages of Abstracting: Cremmins (1996)

Cremmins recommends 12-20 mins to abstract an average scientific paper - much less time than it takes to really understand one.


Abstractors’ Editing Operations: Local Revision

Cremmins (1996) described two kinds of editing operations that abstractors carry out
- Local Revision - revises content within a sentence
- Global Revision - revises content across sentences

Local revisions include:
- drop vague or redundant terms
- reference adjustment
- wording prescriptions
- contextual lexical choice


AGENDA

14:10  I. Fundamentals (Definitions, Human Abstracting, Abstract Architecture)
14:40  II. Extraction (Shallow Features, Revision, Corpus-Based Methods)
15:30  Break
16:00  III. Abstraction (Template and Concept-Based)
16:30  IV. Evaluation
17:00  V. Research Areas (Multi-document, Multimedia, Multilingual Summarization)
17:30  Conclusion


Summarization Approaches

Shallower approaches - result in sentence extraction
- sentences may/will be extracted out of context
- synthesis here involves smoothing
  » include window of previous sentences
  » adjust references
- can be trained using a corpus

Deeper approaches - result in abstracts
- synthesis involves NL generation
- can be partly trained using a corpus
- requires some coding for a domain


Some Features used in Sentence Extraction Summaries

Location: position of term in document, position in paragraph/section, section depth, particular sections (e.g., title, introduction, conclusion)

Thematic: presence of statistically salient terms (tf.idf)- these are document-specific

Fixed phrases: in-text summary cue phrases (“in summary”, “our investigation shows”, “the purpose of this article is”, ...), emphasizers (“important”, “in particular”, ...)
- these are genre-specific

Cohesion: connectivity of text units based on proximity, repetition and synonymy, coreference, vocabulary overlap

Discourse Structure: rhetorical structure, topic structure, document format


Putting it Together: Linear Feature Combination

U is a text unit such as a sentence, Greek letters denote tuning parameters

Location: weight assigned to a text unit based on whether it occurs in initial, medial, or final position in a paragraph or the entire document, or whether it occurs in prominent sections such as the document’s intro or conclusion

FixedPhrase: weight assigned to a text unit in case fixed-phrase summary cues occur

ThematicTerm: weight assigned to a text unit due to the presence of thematic terms (e.g., tf.idf terms) in that unit

AddTerm: weight assigned to a text unit for terms in it that are also present in the title, headline, initial para, or the user’s profile or query

Weight(U) := α·Location(U) + β·FixedPhrase(U) + γ·ThematicTerm(U) + δ·AddTerm(U)
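A minimal sketch of this combiner in Python; the feature scorers below are toy stand-ins of mine for illustration, not the tutorial's implementations:

    def location(u, i, n):
        # reward units at the very start or end of the document
        return 1.0 if i == 0 or i == n - 1 else 0.0

    def fixed_phrase(u, cues=("in summary", "in conclusion", "the purpose of")):
        return 1.0 if any(c in u.lower() for c in cues) else 0.0

    def thematic_term(u, salient_terms):
        words = u.lower().split()
        return sum(1.0 for w in words if w in salient_terms) / max(len(words), 1)

    def add_term(u, bonus_terms):
        return sum(1.0 for w in u.lower().split() if w in bonus_terms)

    def weight(u, i, n, salient_terms, bonus_terms,
               alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
        # Weight(U) := a*Location + b*FixedPhrase + g*ThematicTerm + d*AddTerm
        return (alpha * location(u, i, n) + beta * fixed_phrase(u)
                + gamma * thematic_term(u, salient_terms)
                + delta * add_term(u, bonus_terms))

Sentences are then ranked by weight and the top ones kept until the compression target is reached.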


Shallow Approaches

[Architecture diagram: Source(s) → Analysis (feature extractors feeding a feature combiner, F1+F2+F3) → Transformation/Selection (sentence selector) → Synthesis/Smoothing (sentence revisor) → Summary.]


Revision as Repair

structured environments (tables, etc.)

- recognize and exclude

- **recognize and summarize

anaphors

- exclude sentences (which begin) with anaphors

- include a window of previous sentences

- **reference adjustment

gaps

- include low-ranked sentences immediately between two selected sentences

- add first sentence of para if second or third selected

- **model rhetorical structure of source
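A minimal sketch of two of these repairs (anaphor windows and gap filling); the pronoun test is a crude stand-in of mine for real anaphor detection:

    PRONOUNS = {"he", "she", "it", "they", "this", "these", "those", "such"}

    def repair(selected, sentences):
        # selected: indices chosen by the extractor; sentences: full source text
        fixed = set(selected)
        for i in selected:
            first = sentences[i].split()[0].lower().strip(",.")
            # include a window of one previous sentence for anaphor-initial picks
            if first in PRONOUNS and i > 0:
                fixed.add(i - 1)
        # fill one-sentence gaps between two selected sentences
        for i in list(fixed):
            if i + 2 in fixed and i + 1 not in fixed:
                fixed.add(i + 1)
        return [sentences[i] for i in sorted(fixed)]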


A Simple Text Revision Algorithm

Construct initial “sentence-extraction” draft from source by picking highest weighted sentences in source until compression target is reached

Revise draft

- Use syntactic trees (using a statistical parser) augmented with coreference classes

Procedure Revise(draft, non-draft, rules, target-compression):
  for each rule in rules
    while ((compression(draft) - target-compression) < ε)
      while (<x, y> := next-candidates(draft, non-draft))  # e.g., binary rule
        result := apply-rule(rule, x, y)  # returns first result which succeeds
        draft := draft ∪ result


Example of Sentence Revision

[Figure: a before/after sentence revision example, with deleted, salient, and aggregated material marked.]


Informativeness vs. Coherence in Sentence Revision

[Charts: informativeness (vocabulary-overlap score) and sentence complexity for I (initial draft), E (elimination), A (aggregation), and A+E.
For informativeness, higher is better: A > I, A+E > I, and A >* E, A+E >* E.
For sentence complexity, lower is better: A+E <* I, but A >* I.]

Mani, Gates, and Bloedorn (ACL’99): 630 summaries from 7 systems (of 90 documents) were revised and evaluated using a vocabulary overlap measure against TIPSTER answer keys. A: Aggregation, E: Elimination.


CORPUS-BASED SENTENCE EXTRACTION


The Need for Corpus-Based Sentence Extraction

Importance of particular features can vary with the genre of text
- e.g., location features: newspaper stories: leading text; scientific text: conclusion; TV news: previews

So, there is a need for summarization techniques that are adaptive, that can be trained for different genres of text


Learning Sentence Extraction Rules

Few corpora available; labeling can be non-trivial, requiring aligning each document unit (e.g., sentence) with abstract.

Learns to extract just individual sentences (though feature vectors can include contextual features).



Example 1: Kupiec et al. (1995)

Input
- Uses a corpus of 188 full-text/abstract pairs drawn from 21 different scientific collections
- Professionally written abstracts, 3 sentences long on average
- The algorithm takes each sentence and computes a probability that it should be included in a summary, based on how similar it is to the abstract
- Uses Bayesian classifier

Result
- About 87% (498) of all abstract sentences (568) could be matched to sentences in the source (79% direct matches, 3% direct joins, 5% incomplete joins)
- Location was best feature at 163/498 = 33%
- Para + fixed-phrase + sentence-length cutoff gave best sentence recall performance: 217/498 = 44%
- At compression rate = 25% (20 sentences), performance peaked at 84% sentence recall


Example 2: Mani & Bloedorn (1998)

cmp-lg corpus (xxx.lanl.gov/cmp-lg) of scientific texts, prepared in SGML form by Simone Teufel at U. Edinburgh
- 198 pairs of full-text sources and author-supplied abstracts
- Full-text sources vary in size from 4 to 10 pages, dating from 1994-6
- SGML tags include: paragraph, title, category, summary, headings and heading depth (figures, captions and tables have been removed)
- Abstract length averages about 5% (avg. 4.7 sentences) of source length

Processing
- Each sentence in full-text source converted to feature vector
- 27,803 feature-vectors (reduces to 903 unique vectors)
- Generated generic and user-focused summaries


Comparison of Learning Algorithms

20% compression, 10-fold cross-validation

Generic:
Method                     Pred. Accuracy   Precision   Recall   F-score
Naïve Bayes (discretized)  .69              .70         .65      .67
C4.5 Rules (pruned)        .69              .62         .70      .66
AQ                         .56              .54         .76      .63
SCDF                       .64              .66         .58      .62
Instance-Based, k=3        .61              .59         .60      .59

User-focused:
Method                     Pred. Accuracy   Precision   Recall   F-score
Naïve Bayes (discretized)  .90              .90         .90      .90
C4.5 Rules (pruned)        .89              .88         .91      .89
SCDF                       .88              .88         .89      .88
Instance-Based, k=3        .82              .80         .85      .82
AQ                         .76              .70         .92      .80


Example Rules

Generic summary rule, generated by C4.5Rules (20% compression)

If sentence is in the conclusion and it is a high tf.idf sentence

Then it is a summary sentence

User-focused rules, generated by AQ (20% compression)

If the sentence includes 15..20 keywords*

Then it is a summary sentence (163 total, 130 unique)

If the sentence is in the middle third of the paragraph and the paragraph is in the first third of the section

Then it is a summary sentence (110 total, 27 unique)

*keywords - terms occurring in sentences ranked as highly-relevant to query (abstract)


Issues in Learning Sentence Extraction Rules

Choice of corpus
- size of corpus
- availability of abstracts/extracts/judgments
- quality of abstracts/extracts/judgments: compression, representativeness, coherence, language, etc.

Choice of labeler to label a sentence as summary-worthy or not, based on a comparison between the source document sentence and the document's summary:
- Label a source sentence (number) as summary-worthy if it is found in the extract
- Compare summary sentence content with source sentence content (labeling by content similarity - L/CS)
- Create an extract from an abstract (e.g., by alignment - L/A->E)

Feature Representation, Learning Algorithm, Scoring


L/CS in KPC

To determine if s ∈ E, they use a content-based match (since the summaries don’t always lift sentences from the full-text).

They match the source sentence to each sentence in the abstract. Two varieties of matches:
- Direct sentence match: the summary sentence and source text sentence are identical or can be considered to have the same content (79% of matches)
- Direct join: two or more sentences from the source text (called joins) appear to have the same content as a single summary sentence (3% of matches)


L/CS in MB98: Generic Summaries

For each source text
- Represent abstract (list of sentences)
- Match source text sentences against abstract, giving a ranking for source sentences (i.e., abstract as “query”)
  - combined-match: compare source sentence against entire abstract (similarity based on content-word overlap + weight)
  - individual-match: compare source sentence against each sentence of abstract (similarity based on longest string match to any abstract sentence)
- Label top C% of the matched source sentences’ vectors as positive, C (Compression) = 5, 10, 15, 20, 25
  - e.g., C=10 => for a 100-sentence source text, 10 sentences will be labeled positive
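A minimal sketch of the combined-match labeling in Python, using plain content-word overlap as the similarity (stoplists and term weights are omitted; the helper names are mine):

    def label_positives(source_sents, abstract_sents, compression=10):
        abstract_words = set(w.lower() for s in abstract_sents for w in s.split())
        scored = []
        for idx, sent in enumerate(source_sents):
            words = set(w.lower() for w in sent.split())
            overlap = len(words & abstract_words)  # abstract as "query"
            scored.append((overlap, idx))
        scored.sort(reverse=True)
        k = max(1, len(source_sents) * compression // 100)
        positives = {idx for _, idx in scored[:k]}
        # one Boolean training label per source sentence
        return [i in positives for i in range(len(source_sents))]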


L/A->E in Jing et al. 98

[Diagram: words w1, w2, ... of the abstract are aligned by candidate mappings f1, f2, ... to positions in the source.]

Find the fr which maximizes P(fr(w1…wn)), i.e., using the Markov assumption:

P(fr(w1…wn)) ≈ Π_{i=1..n} P(fr(wi) | fr(wi−1))


Sentence Extraction as Bayesian Classification

P(s ∈ E | F1,…, Fn) = Π_{j=1..n} P(Fj | s ∈ E) · P(s ∈ E) / Π_{j=1..n} P(Fj)

- P(s ∈ E): compression rate c
- P(s ∈ E | F1,…, Fn): probability that sentence s is included in extract E, given the sentence’s feature-value pairs
- P(Fj): probability of feature-value pair occurring in a source sentence
- P(Fj | s ∈ E): probability of feature-value pair occurring in a source sentence which is also in the extract

The features are discretized into Boolean features, to simplify matters
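With Boolean features this reduces to counting; a minimal sketch of training and scoring in Python (the add-one smoothing is my addition to avoid zero probabilities, not part of the slide):

    from collections import Counter

    def train(corpus):
        # corpus: list of (features, in_extract) pairs, one per source sentence;
        # features is a tuple of Booleans, in_extract says whether s is in E
        in_counts, all_counts = Counter(), Counter()
        n_in = sum(1 for _, y in corpus if y)
        for feats, y in corpus:
            for j, f in enumerate(feats):
                all_counts[(j, f)] += 1
                if y:
                    in_counts[(j, f)] += 1
        return in_counts, all_counts, n_in, len(corpus)

    def score(feats, model):
        in_counts, all_counts, n_in, n = model
        p = n_in / n  # P(s in E), i.e., the compression rate
        for j, f in enumerate(feats):
            p *= (in_counts[(j, f)] + 1) / (n_in + 2)  # P(Fj | s in E), smoothed
            p /= (all_counts[(j, f)] + 1) / (n + 2)    # P(Fj), smoothed
        return p

Sentences with the highest scores are then selected up to the compression target.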


ADDING DISCOURSE-LEVEL FEATURES TO THE MIX


Cohesion

There are links in text, called ties, which express semantic relationships

Two classes of relationships:
- Grammatical cohesion: anaphora, ellipsis, conjunction
- Lexical cohesion: synonymy, hypernymy, repetition


Martian Weather with Grammatical and Lexical Cohesion Relations

With its distant orbit 50 percent farther from the sun than Earth and slim atmospheric blanket, Mars experiences frigid weather conditions. Surface temperatures typically average about −60 degrees Celsius (−76 degrees Fahrenheit) at the equator and […] can dip to −123 degrees C near the poles. Only the midday sun at tropical latitudes is warm enough to thaw ice on occasion, but any liquid water formed in this way would evaporate almost instantly because of the low atmospheric pressure. Although the atmosphere holds a small amount of water, and water ice clouds sometimes develop, most Martian weather involves blowing dust or carbon dioxide. Each winter, for example, a blizzard of frozen carbon dioxide rages over one pole, and a few meters of this dry ice snow accumulate as previously frozen carbon dioxide evaporates from the opposite polar cap. Yet even on the summer pole, where the sun remains in the sky all day long, temperatures never warm enough to melt frozen water.


Text Graphs based on Cohesion

Represent a text as a graph
- Nodes: words (or sentences)
- Links: cohesion links between nodes

Graph Connectivity Assumption:
- More highly connected nodes are likely to carry salient information.
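A minimal sketch of the connectivity assumption over sentence nodes, linking sentences that repeat enough vocabulary and ranking nodes by degree (the threshold and similarity are toy choices of mine):

    def salient_by_connectivity(sentences, min_shared=2, top_k=3):
        words = [set(s.lower().split()) for s in sentences]
        degree = [0] * len(sentences)
        for i in range(len(sentences)):
            for j in range(i + 1, len(sentences)):
                # a cohesion "tie": here, simple vocabulary repetition
                if len(words[i] & words[j]) >= min_shared:
                    degree[i] += 1
                    degree[j] += 1
        ranked = sorted(range(len(sentences)), key=lambda i: -degree[i])
        return sorted(ranked[:top_k])  # restore document order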


Cohesion-based Graphs

Skorochodhko 1972
- Node: sentence; Link: related
- Method: node centrality and topology (text topologies: chain, ring, monolith, piecewise)

Salton et al. 1994
- Node: paragraph; Link: cosine similarity
- Method: local segmentation, then node centrality

Mani & Bloedorn 1997
- Node: words/phrases; Link: lexical/grammatical cohesion
- Method: node centrality discovered by spreading activation (see also clustering using lexical chains)

[Figure: example paragraph graph (nodes P3 ... P24); links between nodes more than 5 apart ignored; best 30 paragraph links at density 2.00, seg_csim 0.26; one segment covers facts about an issue, another the legality of an issue.]


Coherence

Coherence is the modeling of discourse relations using different sources of evidence, e.g.,
- Document format: layout in terms of sections, chapters, etc.; page layout
- Topic structure: TextTiling (Hearst)
- Rhetorical structure: RST (Mann & Matthiessen); Text Grammars (van Dijk, Longacre); genre-specific rhetorical structures (Methodology, Results, Evaluation, etc.) (Liddy, Swales, Teufel & Moens, Saggion & Lapalme, etc.)
- Narrative structure


Using a Coherence-based Discourse Model in Summarization

Choose a theory of discourse structure

Parse text into a labeled tree of discourse segments, whose leaves are sentences or clauses
- Leaves typically need not have associated semantics

Weight nodes in tree, based on node promotion and clause prominence

Select leaves based on weight

Print out selected leaves for summary synthesis
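A minimal sketch of node-promotion weighting over a binary discourse tree, under the simplifying assumption that a leaf's salience is the number of satellite links on its path from the root (an approximation in the spirit of Marcu's scoring, not his exact algorithm):

    def promote(node, level, scores):
        # node: a clause index (leaf) or a (relation, nucleus, satellite) tuple
        if isinstance(node, int):
            scores[node] = level  # level = number of satellite links above this leaf
            return
        _, nucleus, satellite = node
        promote(nucleus, level, scores)        # nucleus leaves keep their level
        promote(satellite, level + 1, scores)  # satellite leaves are demoted

    def rank_clauses(tree):
        scores = {}
        promote(tree, 0, scores)
        return sorted(scores, key=lambda c: scores[c])  # most salient first

    rank_clauses(("evidence", ("cause", 2, 1), 3))  # -> [2, 1, 3]

Selecting the top-weighted leaves up to the target length yields partial rankings of clauses like the one in the Martian weather example that follows.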


Martian Weather Summarized Using Marcu’s Algorithm (target length = 4 sentences)

[With its distant orbit {– 50 percent farther from the sun than Earth –} and slim atmospheric blanket,1] [Mars experiences frigid weather conditions.2] [Surface temperatures typically average about –60 degrees Celsius (–76 degrees Fahrenheit) at the equator and can dip to –123 degrees C near the poles.3] [Only the midday sun at tropical latitudes is warm enough to thaw ice on occasion,4] [but any liquid water formed that way would evaporate almost instantly5] [because of the low atmospheric pressure.6] [Although the atmosphere holds a small amount of water, and water-ice clouds sometimes develop,7] [most Martian weather involves blowing dust or carbon dioxide.8] [Each winter, for example, a blizzard of frozen carbon dioxide rages over one pole, and a few meters of this dry-ice snow accumulate as previously frozen carbon dioxide evaporates from the opposite polar cap.9] [Yet even on the summer pole, {where the sun remains in the sky all day long,} temperatures never warm enough to melt frozen water.10]

2 > 8 > {3, 10} > {1, 4, 5, 7, 9}


Illustration of Node Promotion (Marcu)

[Figure: discourse tree. Nodes: relations; leaves: clauses. Nucleus: square boxes; satellite: dotted boxes.]


Detailed Evaluation of Marcu’s Method

                                              Recall   Precision   Size of Expt.
Clause Segmentation                           81.3     90.3        3 texts, 3 judges
Discourse Marker ID                           80.8     89.5        3 texts, 3 judges
Salience Weighting (Machine-Generated Trees)  65.0     67.0        5 texts, 3 judges
Salience Weighting (Human-Generated Trees)    67.0     78.0        5 texts, 3 judges

Issues
- How well can humans construct trees?
  - Discourse Segmentation: .77 Kappa (30 news texts, 3 coders)
  - Relations: .61 Kappa (ditto)
- How well can machines construct trees?
  - Machine trees show poor correlation with human trees, but shape and nucleus/satellite assignment very similar


AGENDA

14:10  I. Fundamentals (Definitions, Human Abstracting, Abstract Architecture)
14:40  II. Extraction (Shallow Features, Revision, Corpus-Based Methods)
15:30  Break
16:00  III. Abstraction (Template and Concept-Based)
16:30  IV. Evaluation
17:00  V. Research Areas (Multi-document, Multimedia, Multilingual Summarization)
17:30  Conclusion


Abstracts Require Deep Methods

An abstract is a summary at least some of whose material is not present in the input.

Abstracts involve inferences made about the content of the text; they can reference background concepts, i.e., those not mentioned explicitly in the text.

Abstracts can result in summarization at a much higher degree of compression than extracts

Human abstractors make inferences in producing abstracts, but are instructed “not to invent anything”

So, “degree of abstraction” knob important. Could control extent of generalization, degree of lexical substitution, aggregation, etc.


Template Extraction

[Diagram: Source (Wall Street Journal, 06/15/88) → Analysis → Templates → Transformation → Synthesis → summary.]

Template (excerpt):
<TEMPLATE-8806150049-1> :=
  DOC NR: 8806150049
  CONTENT: <TIE_UP_RELATIONSHIP-8806150049-1>
  DATE TEMPLATE COMPLETED: 311292
  EXTRACTION TIME: 0

Generated summary: MAXICARE HEALTH PLANS INC and UNIVERSAL HEALTH SERVICES INC have dissolved a joint venture which provided health services.


Template Example (Paice and Jones 1993)

Concept              Definition
SPECIES              the crop species concerned
CULTIVAR             the varieties used
HIGH-LEVEL PROPERTY  the property being investigated, e.g., yield, growth rate
PEST                 any pest which infests the crop
AGENT                chemical or biological agent applied
INFLUENCE            e.g., drought, cold, grazing, cultivation system
LOCALITY             where the study was performed
TIME                 years when the study was conducted
SOIL                 description of soil

Canned Text Patterns
- “This paper studies the effect the pest PEST has on the PROPERTY of SPECIES.”
- “An experiment in TIME at LOCALITY was undertaken.”

Output:
This paper studies the effect the pest G. pallida has on the yield of potato.
An experiment in 1985 and 1986 at York, Lincoln and Peterbourgh, England was undertaken.
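A minimal sketch of synthesis with canned text patterns, substituting slot fillers into the patterns (the fillers are the slide's example values; the helper is mine):

    def fill(pattern, slots):
        # replace each slot name occurring in the pattern with its filler;
        # longest names first, to avoid clobbering overlapping names
        out = pattern
        for name, value in sorted(slots.items(), key=lambda kv: -len(kv[0])):
            out = out.replace(name, value)
        return out

    slots = {"PEST": "G. pallida", "PROPERTY": "yield", "SPECIES": "potato",
             "TIME": "1985 and 1986",
             "LOCALITY": "York, Lincoln and Peterbourgh, England"}
    print(fill("This paper studies the effect the pest PEST has on the PROPERTY of SPECIES.", slots))
    print(fill("An experiment in TIME at LOCALITY was undertaken.", slots))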


Templates Can Get Complex! (MUC-5)

<TEMPLATE-8806150049-1> :=
  DOC NR: 8806150049
  CONTENT: <TIE_UP_RELATIONSHIP-8806150049-1>
  DATE TEMPLATE COMPLETED: 311292
  EXTRACTION TIME: 0
<TIE_UP_RELATIONSHIP-8806150049-1> :=
  TIE-UP STATUS: DISSOLVED
  ENTITY: <ENTITY-8806150049-1> <ENTITY-8806150049-2>
  JOINT VENTURE CO: <ENTITY-8806150049-3>
  OWNERSHIP: <OWNERSHIP-8806150049-1> <OWNERSHIP-8806150049-2>
  ACTIVITY: <ACTIVITY-8806150049-1>
<ENTITY-8806150049-1> :=
  NAME: Maxicare Health Plans INC
  ALIASES: "Maxicare"
  LOCATION: Los Angeles (CITY 4) California (PROVINCE 1) United States (COUNTRY)
  TYPE: COMPANY
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8806150049-1>
<ENTITY-8806150049-2> :=
  NAME: Universal Health Services INC
  ALIASES: "Universal Health"
  LOCATION: King of Prussia (CITY) Pennsylvania (PROVINCE 1) United States (COUNTRY)
  TYPE: COMPANY
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8806150049-1>
<ACTIVITY-8806150049-1> :=
  INDUSTRY: <INDUSTRY-8806150049-1>
  ACTIVITY-SITE: (<FACILITY-8806150049-1> <ENTITY-8806150049-3>)
<INDUSTRY-8806150049-1> :=
  INDUSTRY-TYPE: SERVICE
  PRODUCT/SERVICE: (80 "a joint venture Nevada health maintenance [organization]")


Assessment of Template Method

Characteristics:
- Templates can be simple or complex, and there may be multiple templates (e.g., multi-incident document)
- Templates (and sets of them) benefit from aggregation and elimination operations to pinpoint key summary information
- Salience is pre-determined based on slots, or computed (e.g., event frequencies)

Advantages:

- Provides a useful capability for abstracting semantic content

- Steady progress in information extraction, based on machine learning from large corpora

Limitations:

- Requires customization for specific types of input data

- Only summarizes that type of input data


Concept Abstraction Method

Captures the content of a document in terms of abstract categories

Abstract categories can be
- sets of terms from the document
- topics from labeled collections or background knowledge (e.g., a thesaurus or knowledge base)

To leverage background knowledge
- Obtain an appropriate concept hierarchy
- Mark concepts in hierarchy with their frequency of reference in the text (requires word-sense disambiguation)
- Find the most specific generalizations of concepts referenced in the text
- Use the generalizations in an abstract


Concept Abstraction Example

Counting Concept and Instance Links (Hahn & Reimer ’99):
Salient(C) iff the activated instances ActInsts(C) make up a sufficiently large share of all instances Insts(C) of C

Counting Concept and Subclass Links (Lin & Hovy ’99):
W(C) = freq(C) + Σ_{C1 ∈ children(C)} W(C1)

Most Specific Generalization: traverse downwards until you find a C whose children contribute equally to its weight.

Example: “The department is buying a Sun Workstation, a HP 3690, and a Toshiba machine. The IBM ThinkPad will not be bought from next year onwards.”
[Figure: concept hierarchy in which Sun Workstation, HP 3690, Toshiba machine, and IBM ThinkPad all generalize to Workstation.]
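A minimal sketch of the weight propagation and the most-specific-generalization walk; the dominance threshold used to decide that children contribute "equally" is an assumption of mine:

    def weigh(node, freq, children):
        # W(C) = freq(C) + sum of W(C1) over children C1 of C
        return freq.get(node, 0) + sum(weigh(c, freq, children)
                                       for c in children.get(node, []))

    def most_specific_generalization(node, freq, children, balance=0.6):
        kids = children.get(node, [])
        if not kids:
            return node
        weights = {c: weigh(c, freq, children) for c in kids}
        total = sum(weights.values())
        top = max(weights, key=weights.get)
        # stop where no single child dominates: children contribute roughly equally
        if total == 0 or weights[top] / total < balance:
            return node
        return most_specific_generalization(top, freq, children, balance)

    children = {"Workstation": ["Sun Workstation", "HP 3690",
                                "Toshiba machine", "IBM ThinkPad"]}
    freq = {"Sun Workstation": 1, "HP 3690": 1,
            "Toshiba machine": 1, "IBM ThinkPad": 1}
    most_specific_generalization("Workstation", freq, children)  # -> "Workstation"

Since all four machines contribute equally, the walk stops at Workstation, the generalization an abstract would use.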


Assessment of Concept Abstraction

Allows for generalization based on links (instance, subclass, part-of, etc.)

Some efforts at controlling extent of generalization

Hierarchy needs to be available, and contain domain (senses of) words
- Generic hierarchies may contain other senses of word
- Constructing a hierarchy by hand for each domain is prohibitively expensive

Result of generalization needs to be readable by human (e.g., generation, visualization)
- So, useful mainly in transformation phase


Generation (Statistical) of Headlines

Shows how statistical methods can be used to generate abstracts (Banko et al. 2000)

H* = argmax_H [ Σ_{i=1..n} log P(wi ∈ H | wi ∈ D) + Σ_{i=2..n} log P(wi | wi−1) + log P(len(H) = n) ]

where H = headline, D = doc.

- First term: select doc words that occur frequently in example headlines
- Second term: order words based on pair co-occurrences
- Third term: length of headline
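A greedy sketch of this scorer: repeatedly pick the word maximizing the content-selection and bigram log-probabilities, for a fixed target length (the greedy search and toy probability tables are simplifications of mine; Banko et al. search over whole candidate headlines):

    import math

    def greedy_headline(doc_words, p_in_headline, bigram, length):
        # p_in_headline[w] ~ P(w in H | w in D); bigram[(u, v)] ~ P(v | u)
        headline, prev = [], "<s>"
        candidates = set(doc_words)
        for _ in range(length):
            if not candidates:
                break
            def score(w):
                return (math.log(p_in_headline.get(w, 1e-6))
                        + math.log(bigram.get((prev, w), 1e-6)))
            best = max(candidates, key=score)
            headline.append(best)
            candidates.discard(best)
            prev = best
        return " ".join(headline)

Fixing the length up front plays the role of the log P(len(H) = n) term in this simplified setting.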


AGENDA

14:10  I. Fundamentals (Definitions, Human Abstracting, Abstract Architecture)
14:40  II. Extraction (Shallow Features, Revision, Corpus-Based Methods)
15:30  Break
16:00  III. Abstraction (Template and Concept-Based)
16:30  IV. Evaluation
17:00  V. Research Areas (Multi-document, Multimedia, Multilingual Summarization)
17:30  Conclusion


Summarization Evaluation: Intrinsic and Extrinsic Methods

Intrinsic methods test the system in itself
- Criteria: coherence, informativeness
- Methods: comparison against reference output; comparison against summary input

Extrinsic methods test the system in relation to some other task
- time to perform tasks, accuracy of tasks, ease of use
- expert assessment of usefulness in task


Coherence: How does a summary read?

Humans can judge this by subjective grading (e.g., 1-3 scale) on specific criteria

- General readability criteria: spelling, grammar, clarity, impersonal style, conciseness, readability and understandability, acronym expansion, etc. (Saggion and Lapalme 2000)

- Criteria can also be specific to extracts (dangling anaphors, gaps, etc.) or abstracts (ill-formed sentences, inappropriate terms, etc.)

When subjects assess summaries for coherence, the scores can be compared against scores for reference summaries, scores for source docs, or against scores for other summarization systems

Automatic scoring has a limited role to play here


Informativeness: Is the content preserved?

Measure the extent to which summary preserves information from a source or a reference summary

Humans can judge this by subjective grading (e.g., 1-3 scale) on specific criteria

When subjects assess summaries for informativeness, the scores can be compared against scores for reference summaries, scores for source docs, or against scores for other summarization systems

[Diagram: machine summaries are compared against the source document or a human (reference) summary; the comparison method can be manual or automatic.]
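A minimal sketch of one automatic comparison method, scoring a machine summary by content-word overlap with a reference (a crude recall-style measure in the spirit of the vocabulary-overlap scoring mentioned later; stopword handling omitted):

    def vocabulary_overlap_recall(machine_summary, reference_summary):
        machine = set(machine_summary.lower().split())
        reference = set(reference_summary.lower().split())
        if not reference:
            return 0.0
        # fraction of reference vocabulary recovered by the machine summary
        return len(machine & reference) / len(reference)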


Human Agreement in Reference Extracts

Previous studies, most of which have focused on extracts, have shown evidence of low agreement among humans

Source               #docs   #subjects   % agreement   Cite
Scientific American  10      6           8%            Rath et al. 61
Funk and Wagnall’s   50      2           46%           Mitra et al. 97

However, there is also evidence that judges may agree more on the most important sentences to include (Jing et al. 99), (Marcu 99)

When subjects disagree, system can be compared against majority opinion, most similar human summary (‘optimistic’) or least similar human summary (‘pessimistic’) (Mitra et al. 97)


Intrinsic Evaluation: SUMMAC Q&A Results

[Chart: Average Answer Recall (ARA) vs. compression for CGI/CMU, Cornell/SabIR, GE, ISI, NMSU, Penn, SRA, TextWise, and Modsumm.]

Highest recall associated with the least reduction of the source: informativeness ratio of accuracy to compression of about 1.5.

Content-based automatic scoring (vocabulary overlap) correlates very well with human scoring (passage/answer recall).


Intrinsic Evaluation: Japanese Text Summarization Challenge (2000)

At each compression, systems outperformed Lead and TF baselines in content overlap with human summaries

Subjective grading of coherence and informativeness showed that human abstracts > human extracts > systems and baselines

[Charts: content overlap against extracts and against abstracts, and subjective grading results.]

(Fukusima and Okumura 2001)


DUC’2001 Summarization Evaluation http://www-nlpir.nist.gov/projects/duc/

Intrinsic evaluation of single- and multiple-document English summaries by comparison against reference summaries
- 60 reference sets: 30 training, 30 test, each with an average of 10 documents
- a single 100-word summary for each document (sds)
- four multi-document summaries (400, 200, 100, and 50-word) for each set (mds)

www.isi.edu/~cyl/SEE


DUC’2001 Setup

doc sets are on
- A single event with causes and consequences
- Multiple distinct events of a single type (e.g., solar eclipses)
- Subject (discuss a single subject)
- One of the above in the domain of natural disasters (e.g., Hurricane Andrew)
- Biographical (discuss a single person)
- Opinion (different opinions about the same subject, e.g., welfare reform)

400-word mds used to build 50, 100, and 200-word mds

Baselines
- sds: first 100 words
- mds: first 50/100/200/400 words of the most recent document; or 1st sentence in 1st, 2nd, ..., nth doc, then 2nd sentence, ..., until 50/100/200/400 words


Eval Criteria

Informativeness (Completeness)

- Recall of reference summary units

Coherence (1-5 scales)

- Grammar: “Do the sentences, clauses, phrases, etc. follow the basic rules of English? Don’t worry here about style or the ideas. Concentrate on grammar.”

- Cohesion: “Do the sentences fit in as they should with the surrounding sentences? Don’t worry about the overall structure of the ideas. Concentrate on whether each sentence naturally follows the preceding one and leads into the next.”

- Organization: “Is the content expressed and arranged in an effective manner? Concentrate here on the high-level arrangement of the ideas.”


Assessment

Phase 1: assessor judged system summary against her own reference summary

Phase 2: assessor judged system summary against 2 others’ reference summaries

System summaries divided into automatically determined sentences (called PUs)

Reference summaries divided into automatically determined EDUs (called MUs), which were then lightly edited by humans


Results: Coherence

Grammar

- Baseline < System < Humans (means 3.23, 3.53, 3.79)

- Most baselines contained a sentence fragment

- Grammar (esp. ‘All’) too sensitive to low-level formatting

Cohesion

- Baseline = system = humans = 3 (sds medians)

- Baseline = 2 = system < humans = 3 (mds medians)

Organization

- Baseline = 3 = system < humans = 4 (sds)

- Baseline = 2 = system < humans = 3 (mds)

Cohesion/Organization

- Cohesion and Organization didn’t make sense for very short summaries

- Cohesion hard to distinguish from Organization

Overall, except for grammar, system summaries were no better than baselines.


Informativeness (Completeness) Measure

For each MU:

“The marked PUs, taken together, express [All, Most, Some, Hardly any, or None] of the meaning expressed by the MU”
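A minimal sketch of how these categorical judgments can be mapped to numeric scores and averaged, anticipating the [0..4] Average Coverage measure on the next slide; the exact category-to-score mapping is an assumption consistent with that scale:

    # Sketch: turn per-MU completeness judgments into [0..4] scores
    # and average them. The mapping is an assumption consistent with
    # the [0..4] scale used for Average Coverage below.
    COMPLETENESS = {"None": 0, "Hardly any": 1, "Some": 2, "Most": 3, "All": 4}

    def average_coverage(judgments):
        """Average per-MU completeness for one peer summary."""
        scores = [COMPLETENESS[j] for j in judgments]
        return sum(scores) / len(scores) if scores else 0.0

    # e.g. average_coverage(["All", "Some", "None", "Most"]) -> 2.25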


Results: Informativeness

Average Coverage: average of the per-MU completeness judgments [0..4] for a peer summary

- Baselines = .5 <= systems = .6 < humans = 1.3 (overall medians)

- Lots of outliers

- Relatively lower baseline and system performance on mds

- Small improvements in mds as size increases

Even for simple sentences/EDUs, determining shared meaning was very hard!


DUC’2003 (NIST slide)

[Diagram: the four DUC’2003 tasks, each over 30 document clusters]

- Task 1: very short (10-word) single-document summaries, over the TDT, TREC, and Novelty documents

- Task 2: short (100-word) multi-document summary per cluster, given TDT documents plus a TDT topic (30 clusters)

- Task 3: short (100-word) multi-document summary per cluster, given TREC documents plus a viewpoint (30 clusters)

- Task 4: short (100-word) multi-document summary per cluster, given Novelty documents plus a TREC Novelty topic and its relevant/novel sentences (30 clusters)


DUC’2003 Metrics & Results

Coherence: Quality (Tasks 2-4)

- Systems < Baseline <= Manual

Informativeness:

- Coverage (Tasks 1-4) = avg(per-MU completeness judgments for a peer summary) * target length / actual length

  Systems < Manual; most systems indistinguishable

- ‘Usefulness’ (Task 1): “Grade each summary according to how useful you think it would be in getting you to choose the document”

  Manual summaries distinct from systems; tracks coverage closely

- ‘Responsiveness’ (Task 4): “Read the topic/question and all the summaries. Consult the relevant sentences as needed. Grade each summary according to how responsive it is in form and content to the question.”

  Manual summaries distinct from systems/baselines; tracks coverage generally
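A one-function sketch of the length-adjusted Coverage formula above, assuming per-MU scores already on the [0..4] scale; all names are illustrative:

    def adjusted_coverage(mu_scores, target_len, actual_len):
        """DUC'2003-style Coverage: mean per-MU completeness scaled by
        the ratio of target summary length to actual summary length."""
        if not mu_scores or actual_len == 0:
            return 0.0
        return (sum(mu_scores) / len(mu_scores)) * (target_len / actual_len)

    # e.g. a summary that overshoots its 100-word target with 125 words
    # has its mean completeness discounted by a factor of 0.8.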


Baseline summaries etc. (NIST slide)

NIST (Nega Alemayehu) created baseline summaries

- Baselines 2-5: automatic, based roughly on algorithms suggested by Daniel Marcu

- No truncation of sentences, so some baseline summaries went over the limit (by up to 15 words) and some were shorter than required

Baseline 1 (task 1): original author’s headline

- Use the document’s own “headline” element

Baseline 2 (tasks 2, 3) - see the sketch after this list

- Take the 1st 100 words in the most recent document.

Baseline 3 (tasks 2, 3)

- Take the 1st sentence in the 1st, 2nd, 3rd, ... document in chronological sequence until you have 100 words.

Baseline 4 (task 4)

- Take the 1st 100 words from the 1st n relevant sentences in the 1st document in the set. (Documents ordered by relevance ranking given with the topic.)

Baseline 5 (task 4)

- Take the 1st relevant sentence from the 1st, 2nd, 3rd, ... document until you have 100 words. (Documents ordered by relevance ranking given with the topic.)
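A minimal sketch of Baselines 2 and 3. Whitespace tokenization and pre-segmented sentence lists are simplifying assumptions; like NIST’s versions, the final sentence is not truncated:

    def baseline2(most_recent_doc, limit=100):
        """Baseline 2: first `limit` words of the most recent document."""
        return " ".join(most_recent_doc.split()[:limit])

    def baseline3(docs_in_chron_order, limit=100):
        """Baseline 3: first sentence of the 1st, 2nd, ... document
        (in chronological order) until the word budget is reached."""
        summary, count = [], 0
        for doc in docs_in_chron_order:   # doc: list of sentence strings
            if not doc:
                continue
            first = doc[0]
            summary.append(first)
            count += len(first.split())
            if count >= limit:            # no truncation of the last sentence
                break
        return " ".join(summary)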


Extrinsic Methods: Usefulness of Summary in Task

It is possible to:

measure the efficiency in executing the instructions, if the summary involves instructions of some kind

measure the summary's usefulness with respect to some information need or goal, such as

- finding documents relevant to one's need from a large collection, routing documents

- extracting facts from sources

- producing an effective report or presentation using a summary

- etc.

assess the impact of a summarizer on the system in which it is embedded, e.g., how much does summarization help the question answering system?

measure the amount of effort required to post-edit the summary output to bring it to some acceptable, task-dependent state

... (there is an unlimited number of tasks to which summarization could be applied)


SUMMAC Time and Accuracy (adhoc task, 21 subjects)

Conclusion - Adhoc: S2’s save time by 50% without impairing accuracy!

S2’s (23% of source on avg.) roughly halved decision time relative to F (full-text).

All F-score and Recall differences are significant except between F & S2.

All time differences are significant except between B & S1.




Multi-Document Summarization

Extension of single-document summarization to collections of related documents

- but the naïve “concatenate each summary” extension faces repetition of information across documents

Requires fusion of information across documents

- elimination, aggregation, and generalization operations carried out on the collection instead of individual documents

Collections can vary considerably in size

- different methods for different ranges (e.g., cluster first if > n documents)

Higher compression rates usually needed

- perhaps where abstraction is really critical

NL Generation and Visualization have an obvious role to play here


Example MDS Problems

Timothy James McVeigh, 27, was formally charged on Friday with the bombing of a federal building in Oklahoma City which killed at least 65 people, the Justice Department said.

The first suspect, Gulf War veteran Timothy McVeigh, 27, was charged with the bombing Friday after being arrested for a traffic violation shortly after Wednesday's blast.

Federal agents have arrested suspect in the Oklahoma City bombing Timothy James McVeigh, 27. McVeigh was formally charged on Friday with the bombing.

Timothy McVeigh, the man charged in the Oklahoma City bombing, had correspondence in his car vowing revenge for the 1993 federal raid on the Branch Davidian compound in Waco, Texas, the Dallas Morning News said Monday.

Eighteen decapitated bodies have been found in a mass grave in northern Algeria, press reports said Thursday.

Algerian newspapers have reported on Thursday that 18 decapitated bodies have been found by the authorities.


Multi-Document Summarization Methods

Shallow Approaches

- passage extraction and comparison: removes redundancy by vocabulary overlap comparisons

Deep Approaches

- template extraction and comparison: removes redundancy by aggregation and generalization operators

- syntactic and semantic passage comparison


Passage Extraction and Summarization

Maximal Marginal Relevance (MMR)

Example: 100 hits - the 1st 20 describe the same event, but hits 36, 41, 68 are very different, although marginally less relevant

As a post-retrieval filter on relevance-ranked hits, MMR offers a reranking parameter λ which allows you to slide between relevance to the query and diversity from the hits you have seen so far:

MMR(Q, R, S) = argmax_{Di in R\S} [ λ·sim1(Di, Q) − (1−λ)·max_{Dj in S} sim2(Di, Dj) ]

where Q is the query, R is the retrieved set, and S is the already-scanned subset of R

Example:

R = {D1, D2, D3}; S = {D1}; λ = 0 (pure diversity)

Di = D2 => −(1−λ)·sim2(D2, D1) = −.4

Di = D3 => −(1−λ)·sim2(D3, D1) = −.2, so pick D3
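A minimal greedy-selection sketch of MMR under these definitions. The similarity functions sim1 and sim2 are parameters (cosine over bag-of-words vectors is a typical choice), and all names are illustrative:

    def mmr_rerank(query, retrieved, sim1, sim2, lam=0.3, k=10):
        """Greedy MMR: repeatedly pick the candidate maximizing
        lam * sim1(d, query) - (1 - lam) * max over s in selected of sim2(d, s)."""
        selected = []                    # S: hits chosen/seen so far
        candidates = list(retrieved)     # remaining members of R
        while candidates and len(selected) < k:
            def mmr_score(d):
                novelty_penalty = max((sim2(d, s) for s in selected), default=0.0)
                return lam * sim1(d, query) - (1 - lam) * novelty_penalty
            best = max(candidates, key=mmr_score)
            selected.append(best)
            candidates.remove(best)
        return selected

Setting λ = 1 reduces to pure relevance ranking; λ = 0 gives pure diversity, as in the worked example above.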

Cohesion-Based Approaches Across Documents

- Salton’s Text Maps

- User-Focused Passage Alignment

[Diagram: query Q linked to documents D1, D2, D3]


User-Focused Passage Alignment


Template Comparison Method (McKeown and Radev 1995)

Contradiction operator: applies to template pairs which have the same incident location but which originate from different sources (provided at least one other slot differs in value)

- If the value of the number of victims is lowered across two reports from the same source, this suggests the old information is incorrect; if it goes up, the first report had incomplete information

The afternoon of Feb 26, 1993, Reuters reported that a suspected bomb killed at least five people in the World Trade Center. However, Associated Press announced that exactly five people were killed in the blast.

Refinement operator: applies to template pairs where the second’s slot value is a specialization of the first’s for a particular slot (e.g., terrorist group identified by country in first template, and by name in later template)

Other operators: perspective change, agreement, addition, superset, trend, etc.
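A minimal sketch of the contradiction check described above, treating templates as attribute-value dictionaries; the slot names here are illustrative, not the actual template schema:

    def contradiction(t1, t2):
        """Fire when two templates describe the same incident location,
        come from different sources, and disagree on at least one other
        slot (e.g., the victim count). Slot names are illustrative."""
        if t1["location"] != t2["location"]:
            return False
        if t1["source"] == t2["source"]:
            return False
        shared = (set(t1) & set(t2)) - {"location", "source"}
        return any(t1[s] != t2[s] for s in shared)

    reuters = {"source": "Reuters", "location": "World Trade Center",
               "victims": "at least 5"}
    ap = {"source": "AP", "location": "World Trade Center",
          "victims": "exactly 5"}
    # contradiction(reuters, ap) -> True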


Syntactic Passage Comparison (MultiGen)

Example theme for syntactic comparison:

Timothy James McVeigh, 27, was formally charged on Friday with the bombing of a federal building in Oklahoma City which killed at least 65 people, the Justice Department said.

The first suspect, Gulf War veteran Timothy McVeigh, 27, was charged with the bombing Friday after being arrested for a traffic violation shortly after Wednesday's blast.

Federal agents have arrested suspect in the Oklahoma City bombing Timothy James McVeigh, 27. McVeigh was formally charged on Friday with the bombing.

Timothy McVeigh, the man charged in the Oklahoma City bombing, had correspondence in his car vowing revenge for the 1993 federal raid on the Branch Davidian compound in Waco, Texas, the Dallas Morning News said Monday.

Assumes very tight clustering of documents. Similar to revision-based methods.


Lexical Semantic Merging: BIOGEN

Vernon Jordan is a presidential friend and a Clinton adviser. He helped Ms. Lewinsky find a job. He testified that Ms. Monica Lewinsky said that she had conversations with the president, that she talked to the president.

Henry Hyde is a Republican chairman of House Judiciary Committee and a prosecutor in Senate impeachment trial. He will lead the Judiciary Committee's impeachment review. Hyde urged his colleagues to heed their consciences, “the voice that whispers in our ear, ‘duty, duty, duty.’”

• Given 1,300 news docs
• 707,000 words in collection
• 607 sentences which mention “Jordan” by name
• 78 appositive phrases which fall (using WordNet) into 2 semantic groups: “friend”, “adviser”
• 65 sentences with “Jordan” as logical subject, filtered based on verbs which are strongly associated in a background corpus with “friend” or “adviser”, e.g., “testify”, “plead”, “greet”
• 3-sentence summary

For details, see Mani et al. ACL’2001


Appositive Merging Examples

[Diagram: merging rules for appositives; “mf” marks the more frequent head/modifier for the name in the collection]

- “Wisconsin Democrat” (mf) + “senior Democrat” → merged on the shared head “Democrat”

- “a lawyer for the defendant” + “an attorney for Paula Jones” → merged via synonymy: “lawyer” (mf) and “attorney” are synonyms under “person”

- “Chairman of the Budget Committee” + “Budget Committee Chairman” → merged as equivalent heads (A = B)

- “Senator” (mf) + “Democrat” → generalized to a common hypernym X with A, B < X < person (e.g., “politician”, “leader”)

mf: more frequent head/modifier for name in collection

Page 92: Automatic Summarization: A Tutorial Presented at RANLP’2003 Inderjeet Mani Georgetown University Tuesday, September 9, 2003 2-5:30 pm @georgetown.edu complingone.georgetown.edu/~linguist/inderjeet.html

RANLP’2003Page 92

Copyright © 2003 Inderjeet Mani. All rights reserved.

Verb-subject associations for appositive head nouns

  executive          police             politician
  reprimand  16.36   shoot       17.37  clamor     16.94
  conceal    17.46   raid        17.65  jockey     17.53
  bank       18.27   arrest      17.96  wrangle    17.59
  foresee    18.85   detain      18.04  woo        18.92
  conspire   18.91   disperse    18.14  exploit    19.57
  convene    19.69   interrogate 18.36  brand      19.65
  plead      19.83   swoop       18.44  behave     19.72
  sue        19.85   evict       18.46  dare       19.73
  answer     20.02   bundle      18.50  sway       19.77
  commit     20.04   manhandle   18.59  criticize  19.78
  worry      20.04   search      18.60  flank      19.87
  accompany  20.11   confiscate  18.63  proclaim   19.91
  own        20.22   apprehend   18.71  annul      19.91
  witness    20.28   round       18.78  favor      19.92
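The slides do not state how these association strengths are computed; a common choice for scoring verb-subject associations in a background corpus is (pointwise) mutual information, sketched below with illustrative names — the actual BIOGEN scores above may use a different measure:

    # Sketch: pointwise mutual information over parsed (verb, subject-head)
    # pairs from a background corpus. An assumed measure, not necessarily
    # the one behind the table above.
    import math
    from collections import Counter

    def pmi_table(pairs):
        """PMI for each (verb, noun) pair; `pairs` is an iterable of
        (verb, subject_head) tuples extracted from a parsed corpus."""
        pair_counts = Counter(pairs)
        total = sum(pair_counts.values())
        verb_counts, noun_counts = Counter(), Counter()
        for (v, n), c in pair_counts.items():
            verb_counts[v] += c
            noun_counts[n] += c
        return {
            (v, n): math.log2((c / total) /
                              ((verb_counts[v] / total) * (noun_counts[n] / total)))
            for (v, n), c in pair_counts.items()
        }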


MULTIMEDIA SUMMARIZATION


Broadcast News Navigator Example

Internet: query terms constructed from NEs; hits are then summarized

Sentence extraction from closed captions (cc), plus a list of NEs


BNN Summary: Story Skim*


BNN Story Details*: text summary, topics, named entities


Identification: Precision vs. Time (with Recall Comparison)

[Figure: average precision (0.7-1.0) vs. average time in minutes (0-8) for the presentation conditions 3 Named Entities, All Named Entities, Full Details, Key Frame, Skim Story Details, Summary, Text, and Topic Video; regions A, B, C are annotated “Also High Recall” and “Lower Recall, High Precision”, with an IDEAL corner at high precision and low time.]

Results:

- Less is better (in time and precision)

- Mixed-media summaries better than single media

E.g., What stories are about Sonny Bono?


CMU Meeting Summarization (Zechner 2001)

S1: well um I think we should discuss this you know with her

S1: That’s true I suggest

S1: you talk to him

S1: yeah well now get this we might go to live in switzerland

S2: oh really

S1: yeah because they’ve made him a job offer there and at first thinking nah he wasn’t going to take it but now he’s like

S1: when are we meeting?

S2: you mean tomorrow?

S1: yes

S2: at 4 pm

Summarizes audio transcriptions from multi-party dialogs

Integrated with meeting browser

Detects disfluencies: filled pauses, repairs, restarts, false starts

Identifies sentence boundaries

Identifies question-answer pairs

Then does sentence ranking using MMR

When run on automatically transcribed audio, biases summary towards words the recognizer is confident of
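A minimal sketch of the filled-pause side of disfluency detection; the pattern list is illustrative, and detecting repairs and false starts requires more machinery than a regular expression:

    # Sketch: strip filled pauses / discourse markers before ranking.
    # The pattern list is illustrative, not Zechner's actual detector.
    import re

    FILLED_PAUSES = re.compile(r"\b(um+|uh+|er+|you know|well|like)\b\s*",
                               re.IGNORECASE)

    def strip_filled_pauses(utterance):
        """Remove filled pauses from one dialog turn."""
        return FILLED_PAUSES.sub("", utterance).strip()

    # strip_filled_pauses("well um I think we should discuss this you know with her")
    # -> "I think we should discuss this with her"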


Event Visualization and Summarization: Geospatial News on Demand Env. (GeoNODE)

Automated cross-document, multilingual topic cluster detection and tracking

Geospatial and temporal display of events extracted from the corpus

Event frequency by source

VCR-like controls support exploration of the corpus


Multilingual Summarization (ISI)

[Screenshot: Indonesian hits → Machine Translation → Summary]


Conclusion

Automatic Summarization is alive and well!

As we interact with the massive information universes of today and tomorrow, summarization in some form is indispensable.

Areas for the future:

- multi-document summarization

- multimedia summarization

- summarization for hand-held displays

- temporal summarization

- etc.


Resources

Books

- Mani, I. and Maybury, M. (eds.) 1999. Advances in Automatic Text Summarization. MIT Press, Cambridge.

- Mani, I. 2001. Automated Text Summarization. John Benjamins, Amsterdam.

Journals

- Mani, I. and Hahn, U. Nov 2000. Summarization Tutorial. IEEE Computer.

Conferences/Workshops

- Dagstuhl Seminar, 1993 (Karen Spärck Jones, Brigitte Endres-Niggemeyer) www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract

- ACL/EACL Workshop on Intelligent Scalable Text Summarization, Madrid, 1997 (Inderjeet Mani, Mark Maybury) www.cs.columbia.edu/~radev/ists97/program.html

- AAAI Spring Symposium on Intelligent Text Summarization, Stanford, 1998 (Dragomir Radev, Eduard Hovy) www.cs.columbia.edu/~radev/aaai-sss98-its

- ANLP/NAACL Summarization Workshop, Seattle, 2000 (Udo Hahn, Chin-Yew Lin, Inderjeet Mani, Dragomir Radev) www.isi.edu/~cyl/was-anlp2000.html

- NAACL Summarization Workshop, Pittsburgh, 2001


Web References

On-line Summarization Tutorials

- www.si.umich.edu/~radev/summarization/radev-summtutorial00.ppt

- www.isi.edu/~marcu/coling-acl98-tutorial.html

Bibliographies

- www.si.umich.edu/~radev/summarization/

- www.cs.columbia.edu/~jing/summarization.html

- www.dcs.shef.ac.uk/~gael/alphalist.html

- www.csi.uottawa.ca/tanka/ts.html

Survey: “State of the Art in Human Language Technology” (cslu.cse.ogi.edu/HLTsurvey)

Government initiatives

- DUC Multi-document Summarization Evaluation (www-nlpir.nist.gov/projects/duc)

- DARPA’s Translingual Information Detection Extraction and Summarization (TIDES) Program (tides.nist.gov, www.darpa.mil/ito/research/tides/projlist.html)

- European Intelligent Information Interfaces program (www.i3net.org)
