
Ch8. Mixed Text and Images

Page 1: Ch8. Mixed Text and Images

2003. 1. 28

임성신

Managing Gigabytes

Ch8. Mixed Text and Images

Page 2: Ch8. Mixed Text and Images

Artificial Intelligence Laboratory

Introduction (1/2)

This chapter examines the problem of how to separate a document image into text, line drawings (graphics), and halftones, so that each component can be compressed effectively.

Three steps in the process:

• Orientation (8.1): examine the orientation of the page and correct it for skew
• Segmentation (8.2): segment the document into visually distinct regions
• Classification (8.3): classify the regions into text, line drawing, or halftone

Page 3: Ch8. Mixed Text and Images

Introduction (2/2)

Page 4: Ch8. Mixed Text and Images

8.1 Orientation

Three approaches have been proposed for determining the orientation, or skew angle, of a document image.

• The first is to examine the left margin of the text.
• The second is to look down the page for the gaps between lines of text (the leading).
• The third is to look at the slopes of imaginary lines joining pairs of marks on the page.

Hough transform

• Assists in locating straight lines (or curves) in an image.

Page 5: Ch8. Mixed Text and Images

Orientation - Detecting straight lines using the Hough transform

The Hough transform is a line-to-point transformation.

• It is used to detect pixels that lie on a straight line.

Finding points that lie on the same straight line in an image of n pixels

A direct method computes the line joining every pair of pixels, then compares the lines with one another and groups those that are similar or identical into subsets, each representing one straight line.

• Number of lines to compare: n(n-1)/2 ≈ n²/2
• This method is too computationally expensive.

Page 6: Ch8. Mixed Text and Images

Orientation - Detecting straight lines using the Hough transform

Finding points that lie on the same straight line in an image of n pixels

Hough Transform 1

• Uses the line equation y = ax + b and extracts the lines in the image through the parameters a and b.
• Line through pixel (x_i, y_i): y_i = a x_i + b
• Line through pixel (x_j, y_j): y_j = a x_j + b
• Solve these two equations for a and b.
• Any pixel (x_k, y_k) that satisfies y_k = a x_k + b lies on the line through the two points above.
• It follows that, writing b = y_k - a x_k and varying a, the pixels that yield the same b for a given a can be selected as lying on one line.
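The slope-intercept voting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hough_slope_intercept`, the sample points, and the choice of integer candidate slopes are all hypothetical and do not come from the slides. Each pixel votes for b = y - a*x at every candidate slope a, and the (a, b) cell with the most votes identifies the line shared by the most pixels.

```python
from collections import Counter


def hough_slope_intercept(points, slopes):
    """Accumulate votes in (a, b) space for the line model y = a*x + b.

    For every pixel (x, y) and every candidate slope a, the intercept
    b = y - a*x is computed and the cell (a, b) receives one vote.
    The intercept is rounded so that nearby values share a cell.
    """
    votes = Counter()
    for x, y in points:
        for a in slopes:
            b = round(y - a * x)
            votes[(a, b)] += 1
    return votes


# Hypothetical data: four pixels on y = 2x + 1 plus one stray pixel.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (2, 0)]
slopes = range(-4, 5)  # assumed quantisation of the slope axis

votes = hough_slope_intercept(points, slopes)
(best_a, best_b), best_count = votes.most_common(1)[0]
print(best_a, best_b, best_count)
```

The cost is proportional to the number of pixels times the number of candidate slopes, rather than to the number of pixel pairs.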

Page 7: Ch8. Mixed Text and Images

Orientation - Detecting straight lines using the Hough transform

Finding points that lie on the same straight line in an image of n pixels

Hough Transform 2

• In Hough Transform 1 the parameters a and b approach infinity as the line becomes parallel to the y-axis. To solve this problem, the second algorithm rewrites the line equation as
• x cos θ + y sin θ = ρ, which represents the line by two parameters, the angle θ and the distance ρ.
• As in Hough Transform 1, a two-dimensional array is built, indexed by θ and ρ instead of a and b, with each axis quantized at a suitable interval. For example, θ is stepped by 1 degree over the range 1 to 180 degrees, and ρ is stepped by 1 over a range large enough to cover the whole image, or only the range to be searched. The algorithm fixes the angle θ, substitutes every pixel to obtain ρ, and increments the array entry A[θ][ρ] by 1 whenever ρ is within range. It then increases θ within its range and repeats.
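The accumulator loop for the (θ, ρ) form can be sketched as follows. It is a minimal illustration, not a definitive implementation: the function names `hough_rho_theta` and `strongest_line`, the `max_rho` parameter, and the sample pixels are assumptions made for this example. Because ρ can be negative, the sketch shifts the ρ index by `max_rho`.

```python
import math


def hough_rho_theta(points, max_rho):
    """Fill an accumulator A[theta][rho] for x*cos(theta) + y*sin(theta) = rho.

    theta runs over 1..180 degrees in 1-degree steps and rho is rounded
    to the nearest integer. rho may be negative, so it is stored at
    column rho + max_rho.
    """
    thetas = range(1, 181)
    acc = [[0] * (2 * max_rho + 1) for _ in thetas]
    for ti, theta_deg in enumerate(thetas):
        t = math.radians(theta_deg)
        c, s = math.cos(t), math.sin(t)
        for x, y in points:
            rho = round(x * c + y * s)
            if -max_rho <= rho <= max_rho:
                acc[ti][rho + max_rho] += 1
    return acc


def strongest_line(acc, max_rho):
    """Return (theta in degrees, rho, votes) for the fullest accumulator cell."""
    votes, ti, ri = max(
        (v, ti, ri) for ti, row in enumerate(acc) for ri, v in enumerate(row)
    )
    return ti + 1, ri - max_rho, votes


# Hypothetical data: ten pixels on the horizontal line y = 5.
points = [(x, 5) for x in range(0, 100, 10)]
acc = hough_rho_theta(points, max_rho=100)
result = strongest_line(acc, max_rho=100)
print(result)
```

A horizontal line has its normal pointing along the y-axis, so the peak is expected at θ = 90 degrees with ρ equal to the line's height.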

Page 8: Ch8. Mixed Text and Images

Orientation - Detecting straight lines using the Hough transform

Page 9: Ch8. Mixed Text and Images

Orientation – Left-margin search

We return to our main problem: detecting the orientation, or skew angle, of a document image.

Most printed documents have an easily identifiable vertical left margin.

Unfortunately, this method is not robust:

• sometimes the left margin is broken up by illustrations
• the text may not be left-justified

Page 10: Ch8. Mixed Text and Images

Orientation – The projection profile

Moving down the vertical axis, count the number of black pixels in each scan line and examine the result as a histogram.

If the document is oriented correctly, valleys will occur in the histogram between the lines of text.
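A projection profile is simple to compute. The sketch below assumes the page is given as a list of rows with 1 for a black pixel and 0 for white. The function name `projection_profile` and the tiny sample page are hypothetical.

```python
def projection_profile(image):
    """Count the black pixels (value 1) in each scan line of a binary image."""
    return [sum(row) for row in image]


# Hypothetical 7-row page: two bands of "text" separated by white rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

profile = projection_profile(page)
print(profile)  # [0, 5, 4, 0, 0, 5, 0]
```

The zero entries are the valleys between text lines. On a skewed page the black pixels of one text line spread over several scan lines, so the valleys become shallower.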

Page 11: Ch8. Mixed Text and Images

Orientation – The projection profile

Autocorrelation function of the histogram

• The sharpness of the valleys can be quantified by the autocorrelation function of the histogram:

$$\phi(k) = \frac{\sum_{n} h(n)\,h(n+k)}{\sum_{n} h(n)^{2}}$$

h(n) : the value of the histogram at vertical position n
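The autocorrelation of a projection profile can be sketched as below. The sketch assumes the normalized form, the sum of h(n)·h(n+k) divided by the sum of h(n)², and treats terms that run past the end of the histogram as zero. The function name `autocorrelation` and the sample histogram are hypothetical.

```python
def autocorrelation(h, k):
    """Normalised autocorrelation of histogram h at lag k.

    Computes sum_n h(n) * h(n + k) / sum_n h(n)^2, where products whose
    second index falls outside the histogram are skipped (an assumption
    made for this sketch).
    """
    energy = sum(v * v for v in h)
    if energy == 0:
        return 0.0
    overlap = sum(h[n] * h[n + k] for n in range(len(h) - k))
    return overlap / energy


# Hypothetical profile of a well-aligned page: text lines every 2 rows.
h = [0, 5, 0, 5, 0, 5]
print(autocorrelation(h, 0), autocorrelation(h, 1), autocorrelation(h, 2))
```

For a correctly oriented page the value is high at lags equal to the line spacing and low in between. A skewed page gives a flatter curve, so the angle that maximizes this contrast is a candidate for the skew correction.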

Page 12: Ch8. Mixed Text and Images

Orientation – From slope histogram to docstrum

Histogram-based techniques that reflect the pairwise relationships between marks in the image.
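A slope histogram over pairs of marks can be sketched as follows. The sketch assumes each mark is reduced to a single (x, y) point such as its centroid, considers every pair of marks, and bins the angle of the joining line to the nearest degree in the range 0 to 179. The function name `slope_histogram` and the sample marks are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def slope_histogram(marks):
    """Histogram of the angles (in whole degrees, 0..179) of the lines
    joining every pair of marks."""
    hist = Counter()
    for (x1, y1), (x2, y2) in combinations(marks, 2):
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        hist[round(angle) % 180] += 1
    return hist


# Hypothetical marks: three text lines, each rising 1 unit per 10 units
# of x, which corresponds to a skew of about 5.7 degrees.
marks = [(x, 0.1 * x + c) for c in (0, 20, 40) for x in (0, 10, 20, 30)]

hist = slope_histogram(marks)
peak_angle, peak_count = hist.most_common(1)[0]
print(peak_angle, peak_count)
```

Marks on the same text line all contribute to the same bin, so the peak of the histogram estimates the skew angle. Considering all pairs costs on the order of n² operations for n marks.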

Page 13: Ch8. Mixed Text and Images

Orientation – From slope histogram to docstrum

Page 14: Ch8. Mixed Text and Images

8.2 Segmentation

The goal is to divide the document image into regions that contain either text, graphics, or a halftone picture.

Three critical issues:

• Regions
• Scale
• Prior information

Page 15: Ch8. Mixed Text and Images

Segmentation – Bottom-up segmentation methods

• Run-length smoothing algorithm: blurring, smearing
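Horizontal run-length smoothing can be sketched as below. The sketch assumes a common formulation in which a run of white pixels is filled with black when it lies between two black pixels and is no longer than a threshold. The function names `smear_row` and `smear_image`, the threshold value, and the sample row are hypothetical.

```python
def smear_row(row, threshold):
    """Fill white gaps (0s) of length <= threshold that lie between two
    black pixels (1s) in one scan line."""
    out = list(row)
    last_black = None
    for i, pixel in enumerate(row):
        if pixel == 1:
            gap = i - last_black - 1 if last_black is not None else 0
            if 0 < gap <= threshold:
                for j in range(last_black + 1, i):
                    out[j] = 1
            last_black = i
    return out


def smear_image(image, threshold):
    """Apply horizontal smearing to every row of a binary image."""
    return [smear_row(row, threshold) for row in image]


row = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
print(smear_row(row, threshold=2))
# [0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

Small gaps, such as those between characters, merge into solid blocks, while wider gaps, such as those between columns, survive. The resulting connected blocks are the candidate regions.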

Page 16: Ch8. Mixed Text and Images

Segmentation – Top-down and combined segmentation methods

Page 17: Ch8. Mixed Text and Images

Segmentation – Mark-based segmentation

Page 18: Ch8. Mixed Text and Images

Segmentation – Segmentation using a document grammar

Page 19: Ch8. Mixed Text and Images

8.3 Classification

The last stage of analyzing the layout is to classify the regions as text, line drawings, or halftone images.