Wavelets & Wavelet Algorithms
Blur Detection, Text Skew Angle Detection, Texture Recognition with
2D Haar Wavelets
Vladimir Kulyukin
www.vkedco.blogspot.com
Outline
● Blur Detection
● Text Skew Angle Detection
● Texture Recognition
Blur Detection
with
2D Haar Wavelet Transform
Theoretical Foundations
● Mallat & Hwang [1] argue that signals carry information via irregularities
● They show that the local maxima of the wavelet transform detect the locations of these irregularities
● For example, 2D HWT maxima indicate possible locations of edges in images
Indirect & Direct Blur Detection Methods
● Tong et al. [2] classify image blur detection methods into direct and indirect
● Indirect methods characterize image blur with a linear model I_B = B * I_O + N, where I_O is the original image, B is an unknown blur function, N is a noise function, and I_B is the resulting image after blur and noise are introduced
● Direct methods are based on detecting distinct features computed directly in the images, e.g., corners, edges, color histograms, etc.
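As an illustration of the indirect model (a toy sketch, not code from [2]; the kernel, image, and function names are hypothetical), blurring a step edge with a 3 x 3 box kernel B spreads the edge over neighboring pixels:

```python
import numpy as np

def blur_model(i_o, b, noise_sigma=0.0, rng=None):
    """Toy illustration of the indirect model I_B = B * I_O + N:
    convolve the original image I_O with blur kernel B, then add noise N."""
    rng = rng or np.random.default_rng(0)
    h, w = i_o.shape
    kh, kw = b.shape
    padded = np.pad(i_o, ((kh // 2,), (kw // 2,)), mode="edge")
    i_b = np.zeros_like(i_o, dtype=float)
    for r in range(h):              # plain nested-loop convolution for clarity
        for c in range(w):
            i_b[r, c] = np.sum(padded[r:r + kh, c:c + kw] * b)
    return i_b + rng.normal(0.0, noise_sigma, i_o.shape)

box = np.full((3, 3), 1.0 / 9.0)   # 3 x 3 box blur kernel as B
sharp = np.zeros((8, 8))
sharp[:, 4:] = 1.0                 # sharp vertical step edge
blurred = blur_model(sharp, box)   # the step becomes a gradual ramp
```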
Edge Classification
Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur detection for digital images using wavelet transform." In Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, pp. 27-30, June 2004. doi: 10.1109/ICME.2004.1394114.
Edge-Based Blur Detection
● Tong et al. [2] propose a direct method based on 2D Haar Wavelet Transform
● The main assumption of their research is that the introduction of blur affects the four main types of edges differently
● In blurred images, Dirac and A-Step edges are absent whereas G-Step and Roof edges lose their sharpness
● Images are classified as blurred on the basis of presence/absence of Dirac & A-Step edges
Blur Detection
Haar Tile Clustering
Another 2D HWT-Based Blur Estimation Method
● Another 2D HWT-based blur estimation method is presented in [4]
● This method is based on the hypothesis that it may not be necessary to detect explicit features such as corners or edges
● Rather, it may be possible to detect regions with pronounced changes without explicitly computing the causes of those changes
● After those regions are detected, they can be combined into larger segments
● Those larger segments can be used to classify images as blurred or sharp
Theory
● Sharp images have many easily distinguishable features, e.g., edges, corners, textures
● Blurred images have fewer easily distinguishable features because distinctions among image regions are less pronounced
● Easily distinguishable features consist of regions with pronounced changes
Finding Regions with Pronounced Changes
● 2D HWT can be used to find image regions with pronounced horizontal, vertical, or diagonal changes
● A captured frame is divided into N x N windows (aka tiles) where N is an integral power of 2
● Border pixels at the right and bottom margins are discarded when frame dimensions are not divisible by N
Finding Regions with Pronounced Changes
Image is split into 64 x 64 blocks (aka tiles)
Tile Processing
● Each tile is processed by four iterations of the 2D HWT
● The number of iterations is a parameter and can be increased or decreased
● Each tile is represented by three 2D Haar wavelet values: horizontal change (HC), vertical change (VC), and diagonal change (DC)
● These values can be thresholded to retain only the tiles with large changes
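One iteration of the tile transform can be sketched as follows (a minimal version; the plain average/difference normalization is an assumption, since conventions vary):

```python
import numpy as np

def haar_2d_step(tile):
    """One iteration of the 2D Haar transform on an N x N tile (N even).
    Returns the averages plus the horizontal (HC), vertical (VC), and
    diagonal (DC) change coefficients, each N/2 x N/2."""
    a = tile[0::2, 0::2]   # top-left pixel of each 2 x 2 block
    b = tile[0::2, 1::2]   # top-right
    c = tile[1::2, 0::2]   # bottom-left
    d = tile[1::2, 1::2]   # bottom-right
    avg = (a + b + c + d) / 4.0
    hc = (a - b + c - d) / 4.0   # change across columns (vertical edges)
    vc = (a + b - c - d) / 4.0   # change across rows (horizontal edges)
    dc = (a - b - c + d) / 4.0   # diagonal change
    return avg, hc, vc, dc

tile = np.zeros((4, 4))
tile[:, 1:] = 8.0                      # sharp vertical edge inside the tile
avg, hc, vc, dc = haar_2d_step(tile)   # hc flags the edge; vc and dc stay 0
```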
Applying 2D HWT to Image
● Image on the right shows two iterations of ordered 2D HWT
● The whiteness of each pixel is proportional to the magnitude of the wavelet in the corresponding array cell
Representation of 64 x 64 Blocks with 2D Wavelets
Each region is represented by three wavelet values: vertical (first double), horizontal (second double), and diagonal (third double). Ignore the numbers after "|": they are used for debugging
Tile Clustering
● After the tiles with pronounced changes are found, depth-first search (DFS) is used to combine them into larger tile clusters
● DFS starts with an unmarked tile with a pronounced change and connects to it its immediate horizontal, vertical, and diagonal neighbors if they also have pronounced changes
● If such tiles are found, they are marked with the same cluster number (shown with a specific color on the subsequent slides) and the search continues recursively
● After all tiles reachable from the current tile are found, the algorithm looks for another unmarked tile
● The algorithm terminates when no more unmarked tiles are found
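The clustering step can be sketched with DFS (recursion replaced by an explicit stack; names and the grid below are illustrative):

```python
def cluster_tiles(marked):
    """Group tiles flagged with pronounced changes (True cells of a 2D
    boolean grid) into clusters via DFS over the 8 immediate horizontal,
    vertical, and diagonal neighbors. Returns a grid of cluster ids
    (0 = no cluster) and the number of clusters found."""
    rows, cols = len(marked), len(marked[0])
    cluster = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if marked[r0][c0] and cluster[r0][c0] == 0:
                next_id += 1                      # start a new cluster
                cluster[r0][c0] = next_id
                stack = [(r0, c0)]
                while stack:                      # DFS with explicit stack
                    r, c = stack.pop()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            nr, nc = r + dr, c + dc
                            if (0 <= nr < rows and 0 <= nc < cols
                                    and marked[nr][nc] and cluster[nr][nc] == 0):
                                cluster[nr][nc] = next_id
                                stack.append((nr, nc))
    return cluster, next_id

grid = [[True, True, False],
        [False, False, False],
        [False, False, True]]
labels, n = cluster_tiles(grid)   # two separate clusters
```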
Tile Cluster Buildup with DFS
● Let us assume that we have found all tiles with pronounced changes, as shown in the image on the right
● Now we can run DFS to find all tile clusters
Tile Clustering in Sharp Images
Eventually we run out of unmarked tiles with sufficiently pronounced changes, as shown in the image on the right
Tile Clustering in Blurred Images
Tile clusters shown in the right image are found by DFS run on tiles with pronounced changes found in the left image
Tile Cluster Filtering
● After the tile clusters are found, two cluster-related rules are used to classify a whole image as sharp or blurred
● The 1st rule uses the percentage of the total image area covered by the found tile clusters
● The 2nd rule uses the number of tiles in each cluster to discard small clusters
● The 1st rule captures the intuition that sharper images have many tiles with pronounced changes
● The 2nd rule captures the intuition that small clusters should be discarded as irrelevant
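A minimal sketch of the two filtering rules (the thresholds min_cluster_size and min_area_pct are illustrative placeholders, not the published values):

```python
def classify_image(cluster_sizes, total_tiles, min_cluster_size=4, min_area_pct=15.0):
    """Apply the two rules: drop clusters smaller than min_cluster_size
    (2nd rule), then call the image SHARP if the surviving clusters cover
    at least min_area_pct percent of all tiles (1st rule)."""
    kept = [size for size in cluster_sizes if size >= min_cluster_size]
    coverage_pct = 100.0 * sum(kept) / total_tiles
    return "SHARP" if coverage_pct >= min_area_pct else "BLURRED"

print(classify_image([12, 3, 9], total_tiles=100))   # the 3-tile cluster is dropped
```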
Haar Tile Clustering: Algorithmic Chain

Take Image -> Find Tiles -> Find Tile Clusters -> Filter Tile Clusters -> BLURRED / SHARP
Haar Tile Clustering: Pseudocode
Lines 2 – 8: Split image into N x N blocks and run 2D HWT on each block
Lines 9 – 13: Find all tile clusters with DFS
Lines 15 – 20: Eliminate smaller clusters & compute overall area of larger clusters
Lines 21 – 22: Classify image as blurred or sharp
Blur Detection
Evaluation of Haar Tile Clustering
Image Sample
● 500 random RGB images* were selected from a set of 506 smartphone video recordings of common grocery products
● Three human volunteers were recruited to classify each image as blurred or sharp
● An image was classified as sharp if at least two of the three volunteers classified it as sharp – this is the ground truth
*Images are available at https://app.box.com/s/n4s2ve0dajz5gkzqx9vpm1f6fzhw5upz
Three Evaluated Algorithms
● Algorithm 1: Kulyukin, V. & Andhavarapu. S. “Image Blur Detection with 2D Haar Wavelet Transform and Its Effect on Skewed Barcode Scanning.” To appear in Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA
● Algorithm 2: Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur detection for digital images using wavelet transform," In Proceedings of the IEEE International Conference on Multimedia and Expo, vol.1, pp. 27-30, June 2004.doi: 10.1109/ICME.2004.1394114
● Algorithm 3: Crete, F., Dolmiere, T., Ladret, P., Nicolas, M. "The blur effect: perception and estimation with a new no-reference perceptual blur metric." In Proceedings of SPIE 6492, Human Vision and Electronic Imaging XII, 64920I, San Jose, CA, USA, January 28, 2007. doi: 10.1117/12.702790
True & False Positives on Blurred and Sharp Images
Algorithm       TP (Blurred)   FP (Blurred)   TP (Sharp)   FP (Sharp)
Algorithm 1     163            4              254          79
Algorithm 2     167            0              183          150
Algorithm 3     81             86             268          65
Ground Truth    167            0              333          0
Relative Difference Table
Algorithm       Relative Difference (Blurred)   Relative Difference (Sharp)
Algorithm 1     2.39                            23.72
Algorithm 2     0.00                            45.05
Algorithm 3     51.50                           19.52
RelativeDifference(A, G) = 100 * |A - G| / max(A, G), where A is the number of blurred/sharp images found by the algorithm and G is the ground truth number of blurred/sharp images
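A direct transcription of the measure (reproducing, e.g., Algorithm 1's table entries: A = 163 found vs. G = 167 ground-truth blurred images, and A = 254 vs. G = 333 sharp images):

```python
def relative_difference(a, g):
    """100 * |A - G| / max(A, G), where A is the algorithm's count and
    G is the ground-truth count of blurred/sharp images."""
    return 100.0 * abs(a - g) / max(a, g)

print(relative_difference(163, 167))   # Algorithm 1, blurred images
print(relative_difference(254, 333))   # Algorithm 1, sharp images
```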
Observations
● On blurred images, Algorithm 1 & Algorithm 2 barely deviate from the ground truth (2.39% and 0.00%); Algorithm 3 shows a significant deviation (51.50%)
● On sharp images, Algorithm 1 & Algorithm 3 deviate from the ground truth by roughly 20% (23.72% and 19.52%), whereas Algorithm 2 deviates by 45.05%
Text Skew Angle Detection
with
2D Haar Wavelet Transform
Text Skew Angle Detection
● OCR engines have considerable difficulty with skewed text
● If the text skew angle is known, the image can be rotated and then OCRed
● Or, which is cooler and faster, the image can be OCRed in place without any rotation
Text Skew Angle Detection Algorithms
● A variety of algorithms have been developed to determine text skew angle
● Many of these algorithms use horizontal & vertical project profiles
● A horizontal projection profile is a 1D array whose size is equal to the number of rows in the image
● A vertical projection profile is a 1D array whose size is equal to the number of columns in the image
Horizontal & Vertical Projections
[Figure: example binary image with its vertical projection (1 4 3 2 2 3 6 0) and its horizontal projection]
1) Vertical projection records count of black pixels in each column
2) Horizontal projection records count of black pixels in each row
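For a binary image (1 = black pixel), the two profiles are simple axis sums; a minimal sketch:

```python
import numpy as np

def projections(binary):
    """Compute projection profiles of a binary image (1 = black pixel):
    horizontal profile = black-pixel count per row, vertical profile =
    black-pixel count per column."""
    hp = binary.sum(axis=1)   # one entry per row
    vp = binary.sum(axis=0)   # one entry per column
    return hp, vp

img = np.array([[0, 1, 0],
                [1, 1, 1]])
hp, vp = projections(img)   # hp = [1, 3], vp = [1, 2, 1]
```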
Computing Horizontal & Vertical Projections
● The image of every character from a given alphabet is rotated and horizontal & vertical projections are computed for every rotation
● In the example on the right the image of A is rotated by 90 degrees four times & two projections are computed for each angle
● Such projections are filed away and used at run time for text skew angle detection
[Figure: HP and VP pairs computed for rotations of 0, 90, 180, and 270 degrees: HP0/VP0, HP90/VP90, HP180/VP180, HP270/VP270]
Text Skew Angle Determination with Horizontal & Vertical Projections
● At run time, text is segmented into characters (this is not a trivial task, and is error-prone)
● A horizontal & vertical projection is computed for each character (or for selected characters)
● The computed projections are matched against the pre-computed horizontal & vertical projections
● The closest match determines the possible angle
Two Projection-Based Algorithms
● There are two seminal text skew angle detection algorithms: Postl's [7] & Hull's [8]
● Postl's algorithm calculates horizontal projection profiles for every character in the alphabet in small increments (e.g., in increments of 5 degrees) and uses the sum of squared differences for projection matching to determine the skew angle
● Hull's algorithm also uses projection profiles but rotates only black pixels instead of entire images
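Projection matching by sum of squared differences can be sketched as follows (the stored profiles and angles are made-up toy values, not data from [7] or [8]):

```python
import numpy as np

def best_angle(profile, stored_profiles):
    """Postl-style matching sketch: compare a run-time projection profile
    against pre-computed per-angle profiles by the sum of squared
    differences and return the angle of the closest match."""
    return min(stored_profiles,
               key=lambda angle: np.sum((profile - stored_profiles[angle]) ** 2))

stored = {0: np.array([5.0, 9.0, 5.0]),
          15: np.array([4.0, 7.0, 8.0])}
print(best_angle(np.array([4.0, 8.0, 7.0]), stored))
```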
Computing Horizontal, Vertical, & Diagonal Wavelets
1) 2D HWT is applied to the image a given number of times (twice in this case)
2) Each application of 2D HWT returns four n x n matrices [AVRG, HC, VC, DC] (e.g., if the input image is 1024 x 1024, two iterations return 256 x 256 matrices)
3) AVRG is the matrix of averages; HC is the matrix of horizontal wavelets; VC is the matrix of vertical wavelets; DC is the matrix of diagonal wavelets
Binarizing Horizontal, Vertical, & Diagonal Wavelets
HC, VC, DC matrices are binarized to eliminate irrelevant wavelets
Combining Horizontal, Vertical, & Diagonal Wavelets
Binarized HC, VC, DC matrices are combined into one matrix using the following formula:

C(r, c) = HC(r, c) OR VC(r, c) OR DC(r, c),

where C(r, c) is the result matrix and HC(r, c), VC(r, c), DC(r, c) are the binarized horizontal, vertical, and diagonal wavelet matrices, respectively
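The binarize-and-combine step might look like this in Python (the threshold value is illustrative, and OR-combination is assumed since the matrices are binary):

```python
import numpy as np

def binarize(m, threshold):
    """Keep only wavelets whose magnitude exceeds the threshold."""
    return (np.abs(m) > threshold).astype(np.uint8)

def combine(hc, vc, dc, threshold):
    """C(r, c) = HC(r, c) OR VC(r, c) OR DC(r, c) over binarized matrices."""
    return binarize(hc, threshold) | binarize(vc, threshold) | binarize(dc, threshold)

hc = np.array([[3.0, 0.1], [0.0, 0.2]])
vc = np.array([[0.0, 2.5], [0.1, 0.0]])
dc = np.array([[0.2, 0.0], [0.0, 4.0]])
c = combine(hc, vc, dc, threshold=1.0)   # -> [[1, 1], [0, 1]]
```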
Computing Text Skew Angle
The Convex Hull algorithm is applied to the combined matrix C to find the smallest rectangle around the text area; the rectangle is used to determine the text skew angle
Text Skew Angle Detection: Algorithmic Chain

Take Image -> Apply 2D HWT -> Binarize -> Combine -> Find Rectangle -> Text Skew Angle
Text Skew Angle Computation: Pseudocode
Lines 1 – 4: Overall algorithm
Lines 5 – 14: Binarize Wavelets
Lines 16 – 26: Combine wavelet matrices; threshold combined pixel values; bind thresholded values with a rectangle; compute text skew angle
Image Sample
● 607 random RGB images* were selected from a set of 607 smartphone video recordings of common grocery products
● Two human volunteers were recruited to determine the text skew angle with an open source protractor program**
● The text skew angles determined are the ground truth
*Images are available at https://usu.app.box.com/s/9zk660t5h1g0dmw4pjj1x1yp6r7zovp3
**Open source protractor program http://sourceforge.net/projects/osprotractor/
Ground Truth with Open Source Protractor
Image on the right shows a human evaluator using OpenSourceProtractor to estimate the text skew angle
Three Evaluated Algorithms
● Algorithm 1: T. Zaman, V. Kulyukin. "Text Skew Angle Detection in Vision-Based Scanning of Nutrition Labels." To appear in Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA
● Algorithm 2: Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986
● Algorithm 3: Hull, J.J. "Document image skew detection: survey and annotated bibliography," In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64
Error Dispersion Plots
Algorithm 1 Algorithm 2 Algorithm 3
X-axis shows image numbers from 0 to 606; Y-axis shows the text skew angle error relative to the ground truth (0 is the ground truth)
Performance Comparison Table
Table I. Processing time in milliseconds

              Algorithm 1   Algorithm 2   Algorithm 3
Time (ms)     341.37        6253.02       5908.18

Table II. Median error in text skew angle estimation

              Algorithm 1   Algorithm 2   Algorithm 3
Median Error  4.62          68.85         20.92
Observations
● Algorithm 1 has an average processing time of 341.37 ms, significantly faster than Algorithm 2 (Postl) and Algorithm 3 (Hull), because Algorithm 1 does not rotate images; in fairness, Algorithms 2 & 3 were originally designed for document scanners with smaller text skew angles and ideal lighting conditions
● Algorithm 1 has a lower median text skew angle error than either Algorithm 2 or Algorithm 3
● The error dispersion plots show that Algorithm 1 clusters more closely around the 0 line (the ground truth) than either Algorithm 2 or Algorithm 3
Texture Recognition
with
2D Haar Wavelet Transform
Hypothesis
● Different materials have different textures
● Different textures may have different 2D wavelet coefficients
● If 2D wavelet coefficients for specific textures are known, they can be used at run time to find the corresponding materials in images
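One simple way to act on this hypothesis is nearest-neighbor matching of wavelet-coefficient signatures (a sketch, not the exact published method; the signature vectors below reuse a few HC coefficients from the slides that follow):

```python
import numpy as np

def nearest_texture(sample, signatures):
    """Match a vector of 2D Haar wavelet coefficients against stored
    per-texture signature vectors by Euclidean distance and return the
    label of the closest one."""
    return min(signatures,
               key=lambda label: np.linalg.norm(sample - signatures[label]))

signatures = {"rice": np.array([-3.48, 0.11, -0.77, 13.50]),
              "apple_sauce": np.array([-0.55, 1.45, 0.27, 1.72])}
print(nearest_texture(np.array([-3.0, 0.5, -0.5, 12.0]), signatures))
```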
Recognition of Food Items in Nutrition Images
Problem: Is it possible to recognize all food items and their approximate regions in images like the one above?
Recognition of Food Items in Nutrition Images
Possible Ideal Solution
Computing Wavelet Coefficients for Different Textures
● Let us manually crop 256 x 256 regions of three different textures (apple sauce, rice, and meat) and compute their Haar wavelet coefficients
● Let us also compute them at two different levels: 128 x 128 (7 iterations) and 64 x 64 (6 iterations)
● The first one (128 x 128) is coarser than the second one (64 x 64)
256 x 256 Images of Apple Sauce, Rice, & Meat
Apple Sauce Rice Meat
128 x 128 Apple Sauce Wavelets

      Apple Sauce 1    Apple Sauce 2    Apple Sauce 3
HC    -0.55  1.45      -0.06 -0.79       1.00 -0.16
       0.27  1.72      -0.18  2.36      -1.65  0.67
VC    -1.20  1.02      -0.05 -0.36       2.25 -1.39
      -1.01 -2.05      -0.24 -1.93      -0.77  2.37
DC     0.84 -0.39       0.67 -1.52       2.26 -0.11
       0.29  1.89      -0.92  0.22      -0.81 -0.88
128 x 128 Rice Wavelets

      Rice 1            Rice 2           Rice 3
HC    -3.48   0.11       7.50 -7.44      -2.94  -3.40
      -0.77  13.50       5.95  0.21       2.54  -4.09
VC    -5.93  -6.56      -4.31 -5.53       3.39 -10.65
       0.49  -6.64      -4.09  8.60      -3.83  -8.50
DC   -10.63  -6.51       3.05  3.44       4.23  -2.93
      -3.35   3.40      -0.06 -8.03     -14.00   3.21
128 x 128 Meat Wavelets

      Meat 1           Meat 2           Meat 3
HC    -0.71  2.95       0.21 -0.76       1.76  1.40
       0.38 -0.69      -0.05  3.00      -1.92  1.76
VC     0.80  2.18       3.54  1.16       1.39  2.50
      -2.31 -0.78       0.33 -0.62       3.31  1.75
DC    -0.88  1.93       0.71 -0.45       0.03  0.98
      -2.06  0.87       0.76 -2.69      -0.35  2.14
128 x 128 Wavelets Side by Side: 1

      Apple Sauce 1     Rice 1            Meat 1
HC    -0.55  1.45       -3.48   0.11      -0.71  2.95
       0.27  1.72       -0.77  13.50       0.38 -0.69
VC    -1.20  1.02       -5.93  -6.56       0.80  2.18
      -1.01 -2.05        0.49  -6.64      -2.31 -0.78
DC     0.84 -0.39      -10.63  -6.51      -0.88  1.93
       0.29  1.89       -3.35   3.40      -2.06  0.87
128 x 128 Wavelets Side by Side: 2

      Apple Sauce 2     Rice 2           Meat 2
HC    -0.06 -0.79        7.50 -7.44       0.21 -0.76
      -0.18  2.36        5.95  0.21      -0.05  3.00
VC    -0.05 -0.36       -4.31 -5.53       3.54  1.16
      -0.24 -1.93       -4.09  8.60       0.33 -0.62
DC     0.67 -1.52        3.05  3.44       0.71 -0.45
      -0.92  0.22       -0.06 -8.03       0.76 -2.69
128 x 128 Wavelets Side by Side: 3

      Apple Sauce 3     Rice 3            Meat 3
HC     2.25 -1.39        3.39 -10.65       1.39  2.50
      -0.77  2.37       -3.83  -8.50       3.31  1.75
VC     1.00 -0.16       -2.94  -3.40       1.76  1.40
      -1.65  0.67        2.54  -4.09      -1.92  1.76
DC     2.26 -0.11        4.23  -2.93       0.03  0.98
      -0.81 -0.88      -14.00   3.21      -0.35  2.14
64 x 64 Wavelets Side by Side

Apple Sauce 1
HC:   0.95 -1.38 -1.04  1.53
     -0.78  2.29  1.50  5.18
     -0.36  0.41  0.00  6.53
      0.00 -1.22  4.05 -0.41
VC:   2.87 -1.06 -0.91 -0.40
     -1.50 -1.77  0.72  3.01
     -0.20 -0.41 -0.40  0.69
      0.15 -0.40  0.21  2.75
DC:  -0.35  0.91 -0.59 -0.71
     -2.16 -2.25  1.16 -2.16
      0.32  0.79 -1.39  2.84
     -0.48  0.74 -1.87 -0.55

Rice 1
HC:  18.28 -6.43 11.50 -4.14
     -8.31  3.19  1.29 -4.95
     -7.60  3.21 -3.92  2.91
      8.75 -5.58  2.86 -1.78
VC:   0.88 -2.58 -1.20  3.80
     -2.73  4.24  0.10  1.43
     -0.32  2.81  2.97  0.30
     -2.90  4.28 -0.52 -4.83
DC:   0.63  1.77 13.11  0.58
     -2.57 -8.22 -5.58  1.37
     13.38 -0.31  5.07  5.26
     -3.30 10.13  7.08 -1.18

Meat 1
HC: -11.11  1.88  3.52  1.85
     -1.57 -8.64 -9.23  6.73
     -1.18 -4.43 -6.10 -4.09
     -8.42  0.20 -2.50  9.80
VC:   2.94  2.06  0.22  1.42
      3.57 -2.29 -1.19 -5.15
      0.40  3.90  3.40 -0.80
     -0.98  3.17 -1.37  1.73
DC:   0.63 -1.03 -1.77 -0.20
     -1.00 -1.22 -0.96 -0.03
      1.11  0.65  3.77  0.08
     -7.55  3.96  0.77 -0.23
Book References
● Y. Nievergelt. "Wavelets Made Easy." Birkhauser, 1999.
● C. S. Burrus, R. A. Gopinath, H. Guo. "Introduction to Wavelets and Wavelet Transforms: A Primer." Prentice Hall, 1998.
● G. P. Tolstov. “Fourier Series.” Dover Publications, Inc. 1962.
Paper & Code References
[1] Mallat, S. and Hwang, W. L. "Singularity detection and processing with wavelets." IEEE Transactions on Information Theory, vol. 38, no. 2, March 1992, pp. 617-643.
[2] Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur detection for digital images using wavelet transform." In Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, pp. 27-30, June 2004. doi: 10.1109/ICME.2004.1394114.
[3] Crete, F., Dolmiere, T., Ladret, P., Nicolas, M. "The blur effect: perception and estimation with a new no-reference perceptual blur metric." In Proceedings of SPIE 6492, Human Vision and Electronic Imaging XII, 64920I, San Jose, CA, USA, January 28, 2007. doi: 10.1117/12.702790.
[4] Kulyukin, V. & Andhavarapu. S. “Image Blur Detection with 2D Haar Wavelet Transform and Its Effect on Skewed Barcode Scanning.” To appear in Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA
[5] Python implementation of the blur detection algorithm proposed in reference [2] is available at https://github.com/VKEDCO/PYPL/blob/master/haar_blur
[6] MATLAB implementation of blur detection algorithm proposed in reference [3] is
available at http://www.mathworks.com/matlabcentral/fileexchange/24676-image-blur-metric
[7] Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986.
[8] Hull, J.J. "Document image skew detection: survey and annotated bibliography." In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64.