
Optimization of Line Segmentation Techniques for Thai Handwritten Document


Page 1: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Optimization of Line Segmentation Techniques for Thai Handwritten Document

Olarik Surinta

Mahasarakham University, Thailand

Page 2: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Introduction

• In handwriting recognition, line segmentation is an essential step.

• Inaccurate line segmentation causes errors in the character segmentation that follows.

• Most line segmentation techniques are based on the horizontal projection profile.

10/21/2009 SNLP2009 2

Page 3: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Introduction (cont)

• The text in most document images is aligned along horizontal lines.

• Projection-profile-based techniques may be among the most successful top-down algorithms.

Page 4: Optimization of Line Segmentation Techniques for Thai Handwritten Document

The characteristics of Thai characters

Character type: Characters

Consonants: ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ ฐ ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ ภ ม ย ร ฤ ล ฦ ว ศ ษ ส ห ฬ อ ฮ

Vowels: อั อะ อา อิ อี อึ อื อุ อู เอ โอ ใอ ไอ ๆ อ็ อ์ อํ

Tones: อ่ อ้ อ๊ อ๋

Page 5: Optimization of Line Segmentation Techniques for Thai Handwritten Document

The characteristics of Thai characters (cont)

Figure: Thai sentence structure

Page 6: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Based on the horizontal projection profile

• The horizontal projection profile is used to divide the text image into character lines.
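
As a rough illustration of the horizontal projection profile, the sketch below counts the black pixels in each row of a binary image. The function name and the 1 = black, 0 = white encoding are assumptions made for this example, not details from the slides.

```python
def horizontal_projection_profile(image):
    """Count black pixels per row of a binary image.

    Assumption for this sketch: the image is a list of rows,
    with 1 for a black (ink) pixel and 0 for a white pixel.
    """
    return [sum(row) for row in image]


# Tiny hand-made image: two inked bands separated by an empty row.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
profile = horizontal_projection_profile(image)
```

Rows whose count is zero are candidate gaps between character lines.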

Page 7: Optimization of Line Segmentation Techniques for Thai Handwritten Document

The line segmentation techniques in this research

• 1. The horizontal projection technique
• 2. The stripe technique
• 3. The comparing Thai character technique
• 4. The sorting and distinguishing technique
• (All of the techniques are based on the horizontal projection profile.)

Page 8: Optimization of Line Segmentation Techniques for Thai Handwritten Document

1. The horizontal projection profile

Figure: image document, horizontal histogram, and result
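
One simple way to turn a horizontal histogram into character lines is to treat every run of non-empty rows as one line. This is only a sketch under that assumption; `segment_lines` and its `threshold` parameter are hypothetical names, and the slides do not state the exact splitting rule.

```python
def segment_lines(profile, threshold=0):
    """Return (start, end) row ranges, end exclusive, where the
    profile stays above `threshold` (an assumed cut-off)."""
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y                      # a text line begins
        elif count <= threshold and start is not None:
            lines.append((start, y))       # the text line ends
            start = None
    if start is not None:                  # line touches the bottom edge
        lines.append((start, len(profile)))
    return lines


profile = [0, 3, 4, 0, 0, 2, 5, 1, 0]
lines = segment_lines(profile)
```

On handwriting, vowels and tones above or below the consonants often fill these gaps, which is consistent with the low accuracy this technique shows in the experiments.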

Page 9: Optimization of Line Segmentation Techniques for Thai Handwritten Document

2. The stripe technique

• First, the stripe technique divides the image into stripes (narrow columns).

• Then the horizontal projection profile is used to divide the text image into character lines.
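
The two steps of the stripe technique can be sketched as follows, assuming that one horizontal projection profile is computed per stripe. The name `stripe_profiles` and the fixed `stripe_width` are assumptions for this example; the slides do not give the stripe width.

```python
def stripe_profiles(image, stripe_width):
    """Split a binary image (1 = black) into vertical stripes and
    return one horizontal projection profile per stripe."""
    width = len(image[0])
    profiles = []
    for x0 in range(0, width, stripe_width):
        # Count black pixels per row, restricted to this stripe's columns.
        profiles.append([sum(row[x0:x0 + stripe_width]) for row in image])
    return profiles


image = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
profiles = stripe_profiles(image, stripe_width=2)
```

Each stripe can then be split into lines independently, which tolerates text lines that slope across the page.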

Page 10: Optimization of Line Segmentation Techniques for Thai Handwritten Document

2. The stripe technique (cont)

Figure: the result of the stripe technique with the horizontal projection profile

Page 11: Optimization of Line Segmentation Techniques for Thai Handwritten Document

3. Technique for comparing Thai character

• This technique uses the difference in character size to distinguish consonants from the group of small vowels and tones.

Figure: comparison between a consonant and the group of small vowels and tones

Page 12: Optimization of Line Segmentation Techniques for Thai Handwritten Document

3. Technique for comparing Thai character (cont)

• First step
  • The groups of black pixels are divided into two groups (upper and lower zone). The higher group is then used to define the lines in the image document.
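
A possible reading of this first step is sketched below: runs of inked rows are split by height, and the taller runs (consonant bodies) are kept to define the lines. The `ratio` cut-off, the function name, and the interpretation of "higher group" as "taller group" are all assumptions; the slides do not specify how the two groups are separated.

```python
def split_by_height(runs, ratio=0.5):
    """Split (start, end) row runs, end exclusive, into tall and short groups.

    Assumption: a run counts as tall when its height is at least
    `ratio` times the tallest run in the document.
    """
    tallest = max(end - start for start, end in runs)
    tall, short = [], []
    for start, end in runs:
        target = tall if (end - start) >= ratio * tallest else short
        target.append((start, end))
    return tall, short


# Short runs stand in for small vowels and tones, tall runs for consonants.
runs = [(2, 4), (5, 15), (16, 18), (22, 31), (33, 35)]
tall, short = split_by_height(runs)
```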

Page 13: Optimization of Line Segmentation Techniques for Thai Handwritten Document

3. Technique for comparing Thai character (cont)

Figure: the result of the first step of the comparing Thai character technique

Page 14: Optimization of Line Segmentation Techniques for Thai Handwritten Document

3. Technique for comparing Thai character (cont)

• Second step
  • Consider the high white-pixel values between the line markers and choose a new line marker.
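
One way to read this second step: between two existing line markers, pick the row with the most white pixels as the new marker. Because every row has the same width, that is the row with the fewest black pixels. The helper name and the inclusive row range are assumptions for this sketch.

```python
def whitest_row_between(profile, top, bottom):
    """Return the row index in [top, bottom] with the fewest black
    pixels, i.e. the most white pixels for a fixed image width.

    `profile` is a horizontal projection profile of black-pixel counts.
    """
    return min(range(top, bottom + 1), key=lambda y: profile[y])


profile = [6, 9, 4, 1, 3, 8, 7]
marker = whitest_row_between(profile, 1, 5)
```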

Page 15: Optimization of Line Segmentation Techniques for Thai Handwritten Document

3. Technique for comparing Thai character (cont)

• This technique is complex, as there are many steps to be proved.

Figure: the result of the comparing Thai character technique

Page 16: Optimization of Line Segmentation Techniques for Thai Handwritten Document

4. The new technique for sorting and distinguishing

• This technique is not complicated and is suitable for Thai characters.

• First
  • Use the histogram of the horizontal projection profile to sort the groups of black pixels from the minimum to the maximum number of black pixels.
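
The sorting step can be sketched as below, assuming a "group of black pixels" is a run of consecutive non-empty rows in the histogram and that its value is the total number of black pixels in the run. Both are assumptions for illustration, as is the function name.

```python
def black_pixel_groups(profile):
    """Group consecutive non-empty rows of a projection profile.

    Returns (start, end, total_black) tuples with `end` exclusive.
    """
    groups, start = [], None
    for y, count in enumerate(profile + [0]):  # trailing 0 closes the last group
        if count > 0 and start is None:
            start = y
        elif count == 0 and start is not None:
            groups.append((start, y, sum(profile[start:y])))
            start = None
    return groups


profile = [0, 2, 0, 7, 9, 8, 0, 1, 0, 6, 9, 0]
groups = black_pixel_groups(profile)
# Sort from the smallest to the largest amount of black pixels.
sorted_groups = sorted(groups, key=lambda g: g[2])
```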

Page 17: Optimization of Line Segmentation Techniques for Thai Handwritten Document

4. The new technique for sorting and distinguishing (cont)

Figure: sorting the groups of black pixels

Page 18: Optimization of Line Segmentation Techniques for Thai Handwritten Document

4. The new technique for sorting and distinguishing (cont)

• Second
  • Find the maximum difference between two groups of black pixels.
  • The line marker is placed in the middle of a group of black pixels when the maximum difference value is less than the value of that group of black pixels.
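
Read literally, this step could look like the sketch below: sort the group values, take the largest jump between neighbouring values, and mark the middle row of every group whose value exceeds that jump. The tuple layout and function name are assumptions, and the slides may intend a different comparison.

```python
def place_markers(groups):
    """groups: (start, end, total_black) tuples, end exclusive.

    Returns the middle row of each group whose black-pixel total is
    greater than the maximum difference between neighbouring sorted
    totals. Needs at least two groups.
    """
    values = sorted(total for _, _, total in groups)
    max_diff = max(b - a for a, b in zip(values, values[1:]))
    return [(start + end - 1) // 2
            for start, end, total in groups
            if total > max_diff]


groups = [(1, 2, 2), (3, 6, 24), (7, 8, 1), (9, 11, 15)]
markers = place_markers(groups)
```

In this toy data the small groups (stand-ins for vowels and tones) fall below the largest jump, so only the two large groups receive a marker.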

Page 19: Optimization of Line Segmentation Techniques for Thai Handwritten Document

4. The new technique for sorting and distinguishing (cont)

Figure: the result of the second step of the sorting and distinguishing technique

Page 20: Optimization of Line Segmentation Techniques for Thai Handwritten Document

4. The new technique for sorting and distinguishing (cont)

• Finally
  • A new line marker is placed midway between every two consecutive line markers.
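
The final step is a midpoint computation. A minimal sketch, assuming markers are row indices and that integer rounding down is acceptable:

```python
def midpoints(markers):
    """Return the row midway between each pair of neighbouring markers."""
    ordered = sorted(markers)
    return [(a + b) // 2 for a, b in zip(ordered, ordered[1:])]


markers = [4, 19, 35]       # one marker per detected text line
separators = midpoints(markers)
```

The resulting rows lie between text lines and can serve as the cut positions.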

Page 21: Optimization of Line Segmentation Techniques for Thai Handwritten Document

4. The new technique for sorting and distinguishing


Page 22: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Experimental result

• The Thai document images were written by different people.

• The data sets contain a variety of writing styles and are limited to single-column Thai document images.

Page 23: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Experimental result (cont)

Figure: a single-column document and a document that is not single-column

Page 24: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Experimental result (cont)

• The line marker is used to define the character line.

• Line segmentation is complete when the line markers pass through the image document without crossing any group of black pixels.
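
The completeness criterion can be checked mechanically. The sketch below assumes horizontal markers given as row indices and a binary image with 1 = black; the function name is hypothetical.

```python
def is_complete(image, markers):
    """True when no marker row touches a black pixel (1 = black)."""
    return not any(1 in image[row] for row in markers)


image = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
complete = is_complete(image, [1, 3])     # both rows are empty
incomplete = is_complete(image, [1, 2])   # row 2 crosses ink
```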

Page 25: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Experimental result (cont)

Figure: complete line segmentation

Page 26: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Experimental result (cont)

Figure: incomplete line segmentation

Page 27: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Experimental result (cont)

Accuracy (percentage) by number of lines on the image documents

Lines     T1      T2      T3      T4
4         46      35      92      100
5         32      26      94      99
6         26      15      88      100
7         31      24      91      96
8         21      32      90      97
9         15      11      85      95
10        18      9       88      97
11        23      12      90      94
12        7       11      88      96
Average   24.33   19.44   89.55   97.11

T1 is the horizontal projection technique
T2 is the stripe technique
T3 is the comparing Thai character technique
T4 is the sorting and distinguishing technique
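
The averages in the last row can be re-derived from the nine rows above. The numbers below are copied from the table; the variable names are only for this sketch.

```python
# Accuracy (%) per number of lines: (T1, T2, T3, T4), copied from the table.
results = {
    4: (46, 35, 92, 100),
    5: (32, 26, 94, 99),
    6: (26, 15, 88, 100),
    7: (31, 24, 91, 96),
    8: (21, 32, 90, 97),
    9: (15, 11, 85, 95),
    10: (18, 9, 88, 97),
    11: (23, 12, 90, 94),
    12: (7, 11, 88, 96),
}

# Column-wise mean over the nine rows.
averages = [sum(column) / len(results) for column in zip(*results.values())]
```

The computed means agree with the reported averages to within 0.01; the T3 mean is 806 / 9, which the table appears to list truncated as 89.55.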

Page 28: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Conclusion

• I have presented four techniques for the line segmentation of Thai handwritten documents:
  – Horizontal projection profile
  – Stripe
  – Comparing Thai character
  – Sorting and distinguishing

• All four techniques are based on the horizontal projection profile.

Page 29: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Conclusion (cont)

• The accuracy of the techniques is:
  – The horizontal projection technique: 24.33%
  – The stripe technique: 19.44% (suitable for English characters and Oriya text)
  – The comparing Thai character technique: 65.25% (complex, many steps to be proved)
  – The sorting and distinguishing technique: 97.11%

Page 30: Optimization of Line Segmentation Techniques for Thai Handwritten Document

Thank you

Question & Answer
