Text Summarization for Review and Feedback, by Aman Sadhwani, 6/18/22

TEXT SUMMARIZATION

Page 1: TEXT SUMMARIZATION

Saturday, April 15, 2023

Text Summarization for Review and Feedback

By: Aman Sadhwani

Page 2: TEXT SUMMARIZATION

What is Text Summarization? And why do we need it?

• A summary is a text that reflects the main and most important sentences of the original text. In text summarization, the summary is generated by a computer.

• In recent years, the amount of textual information has been growing rapidly, day by day. It becomes harder for users to read all of it, and they lose interest. Text summarization was introduced to address this problem.

Page 3: TEXT SUMMARIZATION

Types of Text Summarization

1) Extraction: In extractive text summarization, the summary is generated by selecting a set of words, phrases, sentences, or paragraphs from the original document.

2) Abstraction: Abstractive methods build a semantic representation of the text and then use natural language processing techniques to generate a summary that is closer to one written by a human. This kind of summary may contain words that are not found in the original document. Research on this method is ongoing, and demand for it is higher.

Page 4: TEXT SUMMARIZATION

Proposed System

We have developed and compared two text summarization techniques:

1) Reduction based

2) Intersection based

Page 5: TEXT SUMMARIZATION

How the Reduction Algorithm Works

Step 1 - Takes a text as input.

Step 2 - Splits it into one or more paragraphs.

Step 3 - Splits each paragraph into one or more sentences.

Step 4 - Splits each sentence into one or more words.

Step 5 - Gives each sentence a weight (a floating-point value) by comparing its words to a predefined dictionary called "stopWords.txt".

If a word of a sentence matches any word in the predefined dictionary, that word is considered low weighted.

Page 6: TEXT SUMMARIZATION

Cont..

Step 6 - An ordered list of weighted sentences is then prepared (relatively high-weighted sentences come first and low-weighted sentences come last).

Step 7 - With the ordered list of weighted sentences in hand, it stores each sentence in the output variable (a list) until it reaches the reduction ratio (it uses a formula to determine the maximum number of sentences to put in the output list).

Step 8 - The output list is then returned.
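
Steps 1 to 8 can be sketched in Python roughly as follows. This is only an illustrative sketch, not the project's actual code: the small `STOP_WORDS` set stands in for the contents of "stopWords.txt", the weight (fraction of words that are not stop words) is an assumed scoring rule, and `max(1, int(n * ratio))` is an assumed stand-in for the reduction-ratio formula, which the slides do not state.

```python
import re

# Hypothetical stand-in for the words listed in "stopWords.txt".
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "it", "on", "for"}


def split_sentences(text):
    """Steps 2-3: split text into paragraphs, then paragraphs into sentences."""
    sentences = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if sentence:
                sentences.append(sentence)
    return sentences


def sentence_weight(sentence, stop_words=STOP_WORDS):
    """Steps 4-5: assumed weight = share of words that are not stop words."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    if not words:
        return 0.0
    content_words = [w for w in words if w not in stop_words]
    return len(content_words) / len(words)


def reduce_text(text, ratio=0.5):
    """Steps 6-8: rank sentences by weight and keep the top share."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    ranked = sorted(sentences, key=sentence_weight, reverse=True)
    # Assumed formula for the maximum number of output sentences.
    limit = max(1, int(len(sentences) * ratio))
    return ranked[:limit]
```

For example, `reduce_text(document, ratio=0.3)` would keep roughly the top 30% of sentences, highest weight first. Note that the output follows weight order, as in Step 6, not the original document order.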

Page 7: TEXT SUMMARIZATION

How the Intersection Algorithm Works

1. Split the input text into paragraphs.

2. Split each paragraph into sentences.

3. Split each sentence into words.

4. Calculate the intersection between two sentences.

5. Remove non-alphabetic characters from each sentence.

6. Convert the content into a dictionary.

7. Build the sentence dictionary.

8. Return the best sentence in each paragraph.

9. Get the best sentences according to the dictionary.
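
The steps above can be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the project's actual code: all function names are hypothetical, the intersection score (shared words divided by the average sentence length in distinct words) is one common normalization that the slides do not confirm, and skipping paragraphs with fewer than two sentences is an assumed rule.

```python
import re


def split_paragraphs(text):
    """Step 1: split the input text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Step 2: split a paragraph into sentences."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def intersection_score(sent1, sent2):
    """Steps 3-4: shared words, normalized by the average sentence size (assumed)."""
    words1 = set(re.findall(r"[a-z]+", sent1.lower()))
    words2 = set(re.findall(r"[a-z]+", sent2.lower()))
    if not words1 or not words2:
        return 0.0
    return len(words1 & words2) / ((len(words1) + len(words2)) / 2)


def format_key(sentence):
    """Step 5: strip non-alphabetic characters so the sentence can serve as a key."""
    return re.sub(r"[^A-Za-z]+", "", sentence)


def build_sentence_dictionary(text):
    """Steps 6-7: map each sentence key to its total intersection with all others."""
    sentences = [s for p in split_paragraphs(text) for s in split_sentences(p)]
    scores = {}
    for i, current in enumerate(sentences):
        scores[format_key(current)] = sum(
            intersection_score(current, other)
            for j, other in enumerate(sentences)
            if i != j
        )
    return scores


def best_sentence(paragraph, scores):
    """Step 8: pick the highest-scoring sentence in a paragraph."""
    sentences = split_sentences(paragraph)
    if len(sentences) < 2:  # assumed rule: very short paragraphs are skipped
        return ""
    return max(sentences, key=lambda s: scores.get(format_key(s), 0.0))


def summarize(text):
    """Step 9: collect the best sentence of each paragraph."""
    scores = build_sentence_dictionary(text)
    picks = [best_sentence(p, scores) for p in split_paragraphs(text)]
    return [s for s in picks if s]
```

Because this sketch returns at most one sentence per paragraph, a document with few paragraphs yields a very short summary.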

Page 8: TEXT SUMMARIZATION

Flow Chart

Page 9: TEXT SUMMARIZATION

Screen shots

Page 15: TEXT SUMMARIZATION

Conclusion

Page 16: TEXT SUMMARIZATION

Cont…

From the last table, we can say that intersection is faster than reduction,

but reduction creates a better summary than intersection.

Intersection works fine on some documents but generates only a 1- or 2-line summary on others.

This is because intersection is the most basic algorithm for text summarization. Unlike reduction, it does not use any NLP libraries.

Page 17: TEXT SUMMARIZATION

Hardware & Software requirement

Minimum Hardware Requirements

Processor: Intel Pentium II or higher

RAM: 128 MB or higher

Monitor, keyboard, mouse

Printer (optional)

Hard disk: 20 GB or higher

Software Requirements

OS: Windows XP or higher

Java installed on machine

Python 2.7 installed on machine

Page 18: TEXT SUMMARIZATION

Tools used

NetBeans

Python 2.7 IDLE

Page 20: TEXT SUMMARIZATION

Future enhancement

Support for summarizing multiple file types.

User-wise document management.

Multi document summarization.

Improved summarization algorithms.

Page 21: TEXT SUMMARIZATION

THANK YOU