Ch8. Mixed Text and Images

2003. 1. 28

임성신

Managing Gigabytes

Artificial Intelligence Laboratory

Introduction (1/2)

This chapter examines how to separate a document image into its components:

• Text

• Line drawing (graphic)

• Halftone

so that each component can be compressed effectively.

Three steps in the process:

Orientation (8.1)

• Examine the orientation and correct it for skew

Segmentation (8.2)

• Segment the document into visually distinct regions

Classification (8.3)

• Classify the regions into text, line drawing, halftone


Introduction (2/2)


8.1 Orientation

Three approaches have been proposed for determining the orientation, or skew angle, of a document image.

• The first is to examine the left margin of the text

• The second is to look down the page for the gaps between lines (the leading)

• The third is to look at the slopes of imaginary lines joining pairs of marks on the page and see …

Hough transform

• Assists in locating straight lines (or curves) in an image


Orientation - Detecting straight lines using the Hough transform

The Hough transform is a line-to-point transformation

• Used to detect pixels that lie on a straight line

Finding the points that lie on the same straight line in an image of n points

• Compute the line joining every pair of pixels, then compare the lines with one another and group those that are similar or identical into subsets, each representing one line

• Number of lines to compare: n(n-1)/2 ≈ n²/2

• This method is far too computationally expensive.


Orientation - Detecting straight lines using the Hough transform

Finding the points that lie on the same straight line in an image of n points: Hough Transform 1

• Uses the line equation y = ax + b and extracts the lines in the image through the parameters a and b.

• A line through pixel (x_i, y_i): y_i = a x_i + b

• A line through pixel (x_j, y_j): y_j = a x_j + b

• Solve these two equations for a and b.

• Any (x_k, y_k) that satisfies y_k = a x_k + b lies on the line through the two points above.

• By this principle, computing b = y_k - a x_k as a varies lets us pick out the pixels that share the same b.
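The voting idea behind Hough Transform 1 can be sketched as follows. This is a minimal illustration, not the slides' implementation: the function name `hough_slope_intercept`, the grid of candidate slopes, and the bin width `b_step` are all assumptions made for the example.

```python
from collections import Counter

def hough_slope_intercept(points, slopes, b_step=1.0):
    """Each pixel votes for the cell (a, b) with b = y - a*x, for every candidate slope a."""
    votes = Counter()
    for x, y in points:
        for a in slopes:
            b = y - a * x
            b_bin = round(b / b_step)          # quantize b into bins of width b_step
            votes[(a, b_bin * b_step)] += 1
    return votes

# Four pixels on y = 2x + 1, plus one stray pixel.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (5, 0)]
slopes = [s / 2 for s in range(-8, 9)]         # assumed grid: -4.0 .. 4.0 in steps of 0.5
votes = hough_slope_intercept(points, slopes)
(best_a, best_b), count = votes.most_common(1)[0]
print(best_a, best_b, count)                   # 2.0 1.0 4
```

The cell with the most votes identifies the line shared by the most pixels. Because the slope grid must be bounded, near-vertical lines cannot be represented in this parameterization.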


Orientation - Detecting straight lines using the Hough transform

Finding the points that lie on the same straight line in an image of n points: Hough Transform 2

• In Hough Transform 1, the parameters a and b approach infinity as the line becomes parallel to the y-axis. To solve this problem, the second algorithm rewrites the line equation as

• x cos θ + y sin θ = ρ, representing the line by two parameters: the angle θ and the distance ρ.

• As in Hough Transform 1, a two-dimensional array is built, indexed by θ and ρ instead of a and b, with a suitable step for each. For example, θ is stepped by 1 degree over the range 1 to 180 degrees, and ρ is stepped by 1 over a range large enough to cover the whole image, or as large as the range to be searched. The algorithm fixes the angle θ, substitutes every pixel to compute ρ, and, if ρ is within range, increments the array entry A[θ][ρ] by 1. It then increases θ within its range and repeats.
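The accumulator loop described above can be sketched like this. It is an illustrative sketch under assumptions: the names `hough_theta_rho` and `strongest_line` are hypothetical, ρ is rounded to the nearest integer, and the array is offset by `rho_max` so that negative distances fit.

```python
import math

def hough_theta_rho(points, rho_max, theta_degrees=range(1, 181)):
    """Accumulate votes A[theta][rho] for x*cos(theta) + y*sin(theta) = rho."""
    # rho may be negative, so column (rho + rho_max) holds the votes for distance rho.
    acc = {theta: [0] * (2 * rho_max + 1) for theta in theta_degrees}
    for theta in theta_degrees:                 # fix the angle ...
        c = math.cos(math.radians(theta))
        s = math.sin(math.radians(theta))
        for x, y in points:                     # ... and substitute every pixel
            rho = round(x * c + y * s)
            if -rho_max <= rho <= rho_max:      # only count rho inside the array range
                acc[theta][rho + rho_max] += 1
    return acc

def strongest_line(acc, rho_max):
    """Return (theta, rho, votes) for the accumulator cell with the most votes."""
    votes, theta, rho = max(
        (v, theta, col - rho_max)
        for theta, row in acc.items()
        for col, v in enumerate(row)
    )
    return theta, rho, votes

# 100 pixels on the horizontal line y = 3.
points = [(x, 3) for x in range(100)]
acc = hough_theta_rho(points, rho_max=100)
print(strongest_line(acc, rho_max=100))         # (90, 3, 100)
```

A horizontal line has its normal along the y-axis, so the peak lands at θ = 90 degrees with ρ equal to the row index.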


Orientation - Detecting straight lines using the Hough transform


Orientation – Left-margin search

Return to our main problem: detecting the orientation, or skew angle, of a document image.

Most printed documents have an easily identifiable vertical left margin.

Unfortunately, this method is not robust:

• sometimes the left margin is broken up by an illustration

• the text may not be left-justified
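One possible way to turn the left-margin idea into a skew estimate is sketched below. The slides do not say how the margin is measured, so everything here is an assumption: the image is a list of rows with 1 for black, the margin is taken as the leftmost black pixel of each non-empty row, and a least-squares line is fitted through those points. The function name `left_margin_skew` is hypothetical.

```python
import math

def left_margin_skew(image):
    """Estimate skew (degrees) from the leftmost black pixel of each non-empty row."""
    pts = [(y, row.index(1)) for y, row in enumerate(image) if 1 in row]
    if len(pts) < 2:
        raise ValueError("need at least two rows containing black pixels")
    n = len(pts)
    mean_y = sum(y for y, _ in pts) / n
    mean_x = sum(x for _, x in pts) / n
    sxy = sum((y - mean_y) * (x - mean_x) for y, x in pts)
    syy = sum((y - mean_y) ** 2 for y, _ in pts)
    slope = sxy / syy                  # margin drift in pixels per row
    # Positive angle: the margin moves to the right going down the page.
    return math.degrees(math.atan(slope))

upright = [[0, 0, 0, 1, 1, 0] for _ in range(5)]                       # margin always at x = 3
skewed = [[1 if x >= y else 0 for x in range(6)] for y in range(5)]    # margin at x = y
print(left_margin_skew(upright), left_margin_skew(skewed))
```

The weaknesses listed above show up directly in this sketch: an illustration or a ragged left edge puts outliers into `pts` and pulls the fitted line away from the true margin.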


Orientation – The projection profile

Count the number of black pixels in each row along the vertical axis, and examine the result as a histogram.

If the document is oriented correctly, valleys will occur in the histogram between the lines of text.
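A projection profile is simple to compute. The sketch below assumes a binary image stored as a list of rows with 1 for a black pixel; the function name `projection_profile` and the tiny test image are made up for illustration.

```python
def projection_profile(image):
    """Number of black pixels (1s) in each row of a binary image."""
    return [sum(row) for row in image]

# Two short "text lines" separated by blank rows (1 = black pixel).
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
profile = projection_profile(image)
print(profile)  # [0, 5, 4, 0, 0, 5, 4, 0]
```

The zeros in the profile are the valleys between the text lines. On a skewed page each row would cut across several text lines, and the valleys would fill in.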


Orientation – The projection profile

Autocorrelation function of the histogram

• The sharpness of the valleys can be quantified

$$\phi(k) = \frac{\sum_{n} h(n)\, h(n+k)}{\sum_{n} h(n)^{2}}$$

h(n) : the value of the histogram at vertical position n
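To see how an autocorrelation can separate a sharp profile from a smeared one, here is a small sketch. It assumes the normalized form "sum of h(n)·h(n+k) divided by sum of h(n)²" with k as the lag, and it truncates the sum at the end of the histogram. Both example profiles are invented for illustration.

```python
def autocorrelation(h, k):
    """Normalized autocorrelation of histogram h at lag k (sum truncated at the end of h)."""
    num = sum(h[n] * h[n + k] for n in range(len(h) - k))
    den = sum(v * v for v in h)
    return num / den if den else 0.0

sharp = [0, 5, 5, 0, 0, 5, 5, 0, 0, 5, 5, 0]   # well oriented: deep valleys, period 4
flat = [3, 2, 3, 2, 3, 3, 2, 3, 2, 3, 3, 2]    # skewed: profile smeared out

for name, h in (("sharp", sharp), ("flat", flat)):
    print(name, [round(autocorrelation(h, k), 3) for k in range(5)])
```

For the sharp profile the value drops to zero at a lag of half the line spacing and rises again at the full line spacing. For the smeared profile it varies much less with the lag.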


Orientation – From slope histogram to docstrum

Histogram-based techniques that reflect the pairwise relationships between marks in the image.
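A slope histogram over pairs of marks might look like the following sketch. It is an assumption-laden illustration: each mark is reduced to a centroid (x, y), every pair of marks is used, and angles are binned to whole degrees. The name `slope_histogram` is hypothetical. The docstrum, as it is usually described, considers only nearest-neighbour pairs; this sketch does not do that.

```python
import math
from collections import Counter
from itertools import combinations

def slope_histogram(marks, bin_degrees=1):
    """Histogram of the angles (0..179 degrees) of lines joining every pair of marks."""
    hist = Counter()
    for (x1, y1), (x2, y2) in combinations(marks, 2):
        # Fold into [0, 180): a line has a direction but no sense.
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180
        hist[round(angle / bin_degrees) * bin_degrees % 180] += 1
    return hist

# Two "text lines" of four marks each, both rising 1 pixel per 10 (about 5.7 degrees).
marks = [(0, 0), (10, 1), (20, 2), (30, 3),
         (0, 20), (10, 21), (20, 22), (30, 23)]
hist = slope_histogram(marks)
print(hist.most_common(1))  # [(6, 12)]
```

The 12 pairs that lie within a text line all fall in the same bin, while the pairs that cross between lines are spread over many angles, so the peak indicates the skew.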


Orientation – From slope histogram to docstrum


8.2 Segmentation

To divide the document image into regions that contain either text, graphics, or a halftone picture.

Three critical issues

• Regions

• Scale

• Prior information


Segmentation – Bottom-up segmentation methods

• Run-length smoothing algorithm: blurring, smearing
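The smearing step of run-length smoothing can be sketched for a single row as follows. This is a minimal sketch under assumptions: pixels are 1 for black and 0 for white, only white runs enclosed by black pixels on both sides are filled, and the `threshold` value is chosen arbitrarily. The function name `smooth_row` is hypothetical.

```python
def smooth_row(row, threshold):
    """Fill white runs of length <= threshold that lie between two black pixels."""
    out = list(row)
    run_start = None      # index where the current white run began
    seen_black = False    # white runs touching the left edge are left alone
    for i, pixel in enumerate(row):
        if pixel == 1:
            if seen_black and run_start is not None and i - run_start <= threshold:
                out[run_start:i] = [1] * (i - run_start)
            seen_black = True
            run_start = None
        elif run_start is None:
            run_start = i
    return out

row = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
print(smooth_row(row, threshold=2))  # [0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

Applied to every row (and, in a second pass, to every column), short gaps between characters close up so that words and lines merge into solid blocks, while wider gaps between regions stay open.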


Segmentation – Top-down and combined segmentation methods


Segmentation – Mark-based segmentation


Segmentation – Segmentation using a document grammar


8.3 Classification

The final step in analyzing the layout is to classify the regions as text, line drawings, or halftone images.
