Joint Depth Map and Color Consistency Estimation for Stereo Images with Different Illuminations and Cameras Yong Seok Heo, Kyoung Mu Lee and Sang Uk Lee IEEE Transactions on Pattern Analysis and Machine Intelligence 2012


Page 1: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Joint Depth Map and Color Consistency Estimation for Stereo Images with Different Illuminations and Cameras

Yong Seok Heo, Kyoung Mu Lee and Sang Uk Lee

IEEE Transactions on Pattern Analysis and Machine Intelligence 2012

Page 2: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Overview

• Introduction
• Related Work
• Algorithm
• Experimental Results

Page 3: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Introduction (1/3)

• Image color values are easily affected by radiometric variations, including global and local intensity changes (caused by varying light, vignetting, and non-Lambertian surfaces), and by noise.

• For stereo matching, most algorithms assume radiometrically calibrated images.

• However, in many practical situations and challenging applications, radiometric variations between stereo images are inevitable.

Page 4: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Introduction (2/3)

• Examples include 3D reconstruction from aerial images [1], general multiview stereo [4], 3D modeling with internet photos (e.g., Photo Tourism [5] and Photosynth [6]), and PhotoModeler [7].

Page 5: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Introduction (3/3)

• In general, color consistency and stereo matching form a chicken-and-egg problem.

• Color consistency can enhance the performance of stereo matching, while accurate disparity maps can improve color consistency (or constancy).

• This paper proposes a new iterative framework that infers both accurate disparity maps and color-consistent images for radiometrically varying stereo images.

Page 6: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Related Work

• Stereo Matching:
– Census transform (7 × 7) [41]
– Mutual Information (MI) [22]
– Adaptive Normalized Cross Correlation (ANCC) [15]

• Color Consistency:
– Color Histogram Equalization (CHE) [39]

Page 7: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Non-parametric Local Transforms for Computing Visual Correspondence [41]

• Both transforms are computed over local windows.
• Rank Transform:
– A non-parametric measure of local intensity.
– Defined as the number of pixels in the local region whose intensity is less than that of the center pixel.
• Census Transform:
– A non-parametric summary of local image structure.
– Defined as a bit string that records which neighboring pixels have an intensity less than that of the center pixel.

R. Zabih and J. Woodfill, in Proc. European Conference on Computer Vision, 1994.
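
The two transforms can be illustrated with a minimal sketch in plain Python. The function names, the 3 × 3 default window (`radius=1`), the raster bit order, and the choice to leave border pixels at 0 are assumptions made for this example, not details taken from [41].

```python
def _neighbors(img, y, x, radius):
    """Yield the window values around (y, x) in raster order, skipping the center."""
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy != 0 or dx != 0:
                yield img[y + dy][x + dx]


def rank_transform(img, radius=1):
    """Count, for each interior pixel, the neighbors darker than the center."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]  # border pixels stay 0 (assumption)
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            center = img[y][x]
            out[y][x] = sum(1 for v in _neighbors(img, y, x, radius) if v < center)
    return out


def census_transform(img, radius=1):
    """Encode, for each interior pixel, which neighbors are darker than the center."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]  # border pixels stay 0 (assumption)
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            center = img[y][x]
            code = 0
            for v in _neighbors(img, y, x, radius):
                code = (code << 1) | (1 if v < center else 0)
            out[y][x] = code
    return out


def hamming_distance(a, b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(a ^ b).count("1")
```

Because both transforms depend only on the ordering of intensities inside the window, any monotonically increasing change of the intensities (for example a gain and offset) leaves the output unchanged.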

Page 8: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Stereo Processing by Semiglobal Matching and Mutual Information [22]

• MI is computed from the joint probability distribution of corresponding intensities in the two images.

H. Hirschmuller, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 328–341, 2008.
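
As a rough illustration of how mutual information follows from a joint distribution, the sketch below computes MI = H(A) + H(B) − H(A, B) from two lists of corresponding intensities. It is a plain textbook MI on raw histograms, not the pixelwise, smoothed MI cost used in [22]; the function name and the list-based input are assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) of two equally long intensity sequences."""
    n = len(a)

    def entropy(counts):
        # Shannon entropy of the empirical distribution given by the counts.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    joint = Counter(zip(a, b))  # joint histogram of corresponding values
    return entropy(Counter(a)) + entropy(Counter(b)) - entropy(joint)
```

MI is high when one image's intensities predict the other's, even if the mapping between them is not the identity, which is why it tolerates global radiometric changes.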

Page 9: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Robust Stereo Matching Using Adaptive Normalized Cross-Correlation [15]

• NCC uses all of the window information around the matching pixels to compute the mean and standard deviation.

• ANCC weights the information around the matching pixels using the bilateral filter.

Y. S. Heo, K. M. Lee, and S. U. Lee, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 33, no. 4, pp. 807–822, 2011.

Page 10: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Log-chromaticity Color Space [15] (1/2)

The radiometric model has three factors:
• a brightness factor
• an illuminant color factor for each channel k
• a gamma correction factor

[15] Y. S. Heo, K. M. Lee, and S. U. Lee, “Robust stereo matching using adaptive normalized cross-correlation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 33, no. 4, pp. 807–822, 2011.

Page 11: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Log-chromaticity Color Space (2/2)

• Two of the terms in the transformed color value are constant for each channel k.
• The remaining term is an invariant color value for pixel p under radiometric variations.

• M: multiplication factor, set to 500.
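
A small sketch can make the invariance concrete. It assumes the model suggested by the factors above, in which each channel value is raised to a gamma and multiplied by a per-pixel brightness factor and a per-channel illuminant factor; the function names are hypothetical, and the multiplication factor M is left out because the slide does not show where it enters.

```python
import math


def log_chromaticity(pixel):
    """Log of each channel minus the mean log over channels (pixel values > 0)."""
    logs = [math.log(v) for v in pixel]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def radiometric_change(pixel, brightness, illuminant, gamma):
    """Assumed model: brightness * illuminant_k * value_k ** gamma per channel."""
    return [brightness * a * v ** gamma for a, v in zip(illuminant, pixel)]
```

Under this assumed model, subtracting the mean log cancels the brightness factor, so the log-chromaticity value after the change equals gamma times the value before it plus an offset that depends only on the channel. That per-channel linear relationship is what makes a line fit between the left and right values meaningful.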

Page 12: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

SIFT

• A SIFT descriptor is computed for each pixel from the log-chromaticity color value of each channel and from the intensity (gray) value of the original image, giving two sets of SIFT descriptors.

Page 13: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Joint Probability Density Function (1/2)

• The joint probability density function (pdf) represents the statistical relationship between the left and right image color values.

• The joint pdf is computed by means of the SIFT descriptors rather than raw pixel values, so that it encodes spatial gradient information.

Page 14: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Joint Probability Density Function (2/2)

• A normalization constant scales the joint pdf.
• A SIFT-weighting factor is defined from the Euclidean distance between SIFT descriptors.
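
The idea of a SIFT-weighted joint pdf can be sketched as a weighted joint histogram. The exponential form of the weight, the `sigma` parameter, and the function name are assumptions made for illustration; the slide's actual formula was not recoverable.

```python
import math
from collections import defaultdict


def sift_weighted_joint_pdf(left_vals, right_vals, left_desc, right_desc, sigma=1.0):
    """Joint pdf of (left, right) values, each pair weighted by descriptor similarity."""
    hist = defaultdict(float)
    for lv, rv, ld, rd in zip(left_vals, right_vals, left_desc, right_desc):
        # Assumed weight: pairs with close descriptors contribute more.
        weight = math.exp(-math.dist(ld, rd) / sigma)
        hist[(lv, rv)] += weight
    total = sum(hist.values())  # normalization constant
    return {pair: w / total for pair, w in hist.items()}
```

Pairs whose descriptors disagree are probably wrong correspondences, so down-weighting them keeps them from polluting the estimated color relationship.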

Page 15: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Linear Function Estimation

• Because the log-chromaticity color space is linear, a linear function can be fitted to the joint pdf.

• To find the linear functions, we use the Huber distance function ρ(r):

ρ(r) = r²/2 if r < C, and C·(r − C/2) otherwise

• r: the distance between the line and the point
• C: the constant threshold parameter
• The fit uses the OpenCV function ‘cvFitLine()’.
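
To show what a Huber-based line fit does without depending on OpenCV, here is a minimal stand-in that uses iteratively reweighted least squares. It is not ‘cvFitLine()’: it fits y = m·x + b with vertical residuals rather than point-to-line distances, and the default threshold `c=1.345` is a commonly used value assumed here, since the slide's value was lost.

```python
def huber(r, c=1.345):
    """Huber distance: quadratic near zero, linear beyond the threshold c."""
    r = abs(r)
    return 0.5 * r * r if r < c else c * (r - 0.5 * c)


def fit_line_huber(xs, ys, c=1.345, iters=50):
    """Fit y = m*x + b by iteratively reweighted least squares with Huber weights."""
    weights = [1.0] * len(xs)
    m, b = 0.0, 0.0
    for _ in range(iters):
        sw = sum(weights)
        mx = sum(w * x for w, x in zip(weights, xs)) / sw
        my = sum(w * y for w, y in zip(weights, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
        m = sxy / sxx
        b = my - m * mx
        # Huber weights: 1 inside the threshold, c/|r| outside it.
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        weights = [1.0 if abs(r) < c else c / abs(r) for r in residuals]
    return m, b
```

The linear tail of the Huber function bounds the pull of any single outlier, so a few wrong correspondences in the joint pdf shift the fitted line only slightly.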

Page 16: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (1/9)

• The total energy in (8) is the sum of a data energy and a smoothness energy.

• N: local four-neighborhood system
• Data cost: encodes the penalty for the dissimilarity of corresponding pixels
• Smoothness cost: penalizes the discontinuity of disparities between neighboring pixels

[34] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.
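
The structure of such an energy can be sketched as follows. The function names and the truncated-linear smoothness cost are assumptions for illustration; the slide does not show the exact costs, and this code only evaluates the energy of a given disparity map rather than minimizing it.

```python
def total_energy(disp, data_cost, smooth_cost):
    """Data energy over all pixels plus smoothness energy over 4-neighbor pairs."""
    h, w = len(disp), len(disp[0])
    e_data = sum(data_cost(x, y, disp[y][x]) for y in range(h) for x in range(w))
    e_smooth = 0.0
    for y in range(h):
        for x in range(w):
            # Visit each neighboring pair once: right and bottom neighbors.
            if x + 1 < w:
                e_smooth += smooth_cost(disp[y][x], disp[y][x + 1])
            if y + 1 < h:
                e_smooth += smooth_cost(disp[y][x], disp[y + 1][x])
    return e_data + e_smooth


def truncated_linear(d1, d2, lam=1.0, tau=2):
    """Assumed smoothness cost: linear in the disparity jump, capped at tau."""
    return lam * min(abs(d1 - d2), tau)
```

For example, `total_energy([[0, 0], [0, 3]], lambda x, y, d: 1.0, truncated_linear)` adds four unit data costs to two capped jumps of cost 2 each.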

Page 17: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (2/9)

• We combine the mutual information and the SIFT descriptor in our data cost [11], [12].

• More weight is given to the log-chromaticity term, to further emphasize features in the log-chromaticity color values.

• The SIFT term plays an important role at the first iteration.

Page 18: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (3/9)

• The data cost contains mutual information (MI) terms for the log-chromaticity color images and the original gray images.

Page 19: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (4/9)

• In (10), an adaptive weighting factor for each pixel p balances the two terms of the data cost.

[35] X. Hu and P. Mordohai, “Evaluation of stereo confidence indoors and outdoors,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2010.

Page 20: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (5/9)

Page 21: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (6/9)

• MI tends to over-smooth some regions, while SIFT can blur some boundaries and is weak in textureless regions.

• We incorporate segment-based plane-fitting constraints [36], [37] to produce a sharper and more accurate disparity map.

• In this framework, the disparity f is parameterized by a 3-D plane for each segment.
• Before extracting the 3-D plane parameters for each segment, the mean-shift segmentation method [38] is applied to the left and right original color images independently.

[36] L. Hong and G. Chen, “Segment-based stereo matching using graph cuts,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2004.
[37] J. Sun, Y. Li, S. B. Kang, and H.-Y. Shum, “Symmetric stereo matching for occlusion handling,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2005.
[38] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.

Page 22: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (7/9)

• In the first step, initial plane-fitting is performed using only the reliable pixels of each segment.

• This solution is sensitive to outliers. Hence, using the estimated plane parameters, we re-compute the disparity value of each pixel within a restricted range.
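
A plain least-squares plane fit illustrates the initial step and why it is sensitive to outliers. The parameterization d = a·x + b·y + c and the function name are assumptions made for this sketch; the slide's own formula was not recoverable.

```python
def fit_plane(points):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) samples of one segment."""
    sxx = sxy = sx = syy = sy = n = sxd = syd = sd = 0.0
    for x, y, d in points:
        sxx += x * x; sxy += x * y; sx += x
        syy += y * y; sy += y; n += 1.0
        sxd += x * d; syd += y * d; sd += d

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    # Normal equations A @ [a, b, c] = rhs, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxd, syd, sd]
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    params = []
    for col in range(3):
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][col] = rhs[r]
        params.append(det3(Ai) / det)
    return tuple(params)
```

Every sample enters the sums with equal weight, so a single badly wrong disparity tilts the plane, which is consistent with restricting the fit to reliable pixels and refining it afterwards.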

Page 23: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (8/9)

• After the initial plane-fitting, we perform a refined plane-fitting scheme to find more accurate plane parameters for each segment s.

• The refinement uses two counts for each segment: the number of reliable pixels, and the number of pixels that have the same disparities as the disparity map obtained from (8).

Page 24: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Disparity Map Estimation (9/9)

• Parameters: a weighting factor and a constant occlusion cost.

• Using (17) and (18), the total energy in (8) is minimized by the graph-cuts expansion algorithm [34].

[34] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.

Page 25: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Fast Approximate Energy Minimization via Graph Cuts [34]

• The minimum cut problem is to find the cut with the smallest cost. There are numerous algorithms for this problem with low-order polynomial complexity.

• Two kinds of moves are used to refine the labeling: swap and expansion.

Y. Boykov, O. Veksler, and R. Zabih, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.

Page 26: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Occlusion Map Estimation [37]

• We define a binary occlusion map for the left image.

• It is computed by warping the right disparity map to the left image, assigning ‘0’ to visited pixels and ‘1’ to non-visited pixels.

• Two weighting factors are set to 4.0 and 1.4.

[37] J. Sun, Y. Li, S. B. Kang, and H.-Y. Shum, “Symmetric stereo matching for occlusion handling,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2005.
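
The warping step can be sketched as below. The sign convention (a right pixel at column x with disparity d lands on left column x + d), the integer disparities, and the function name are assumptions made for illustration.

```python
def left_occlusion_map(right_disp):
    """Binary map for the left image: 0 = visited by a warped right pixel, 1 = not."""
    h, w = len(right_disp), len(right_disp[0])
    occ = [[1] * w for _ in range(h)]  # start with every left pixel non-visited
    for y in range(h):
        for xr in range(w):
            xl = xr + right_disp[y][xr]  # assumed warping convention
            if 0 <= xl < w:
                occ[y][xl] = 0
    return occ
```

For a single row with right disparities `[0, 0, 1, 1]`, the right pixels land on left columns 0, 1, 3 and 4 (the last is out of bounds), so left column 2 is never visited and is marked as occluded.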

Page 27: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Color Histogram Equalization (CHE)

• The color histogram equalization (CHE) method was proposed in [39].

• The transform uses the total number of pixels in the image and the maximum color value, 255.
• If the cumulative histogram value of a pixel is invariant under any illumination change, then its equalized value is also invariant.
• This method is stable only under global radiometric changes such as exposure or gamma changes.
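
A per-channel histogram equalization can be sketched in a few lines. The exact formula in [39] was not recoverable from the slide, so this uses the usual cumulative-count form scaled by the maximum color value; the function name and the rounding are assumptions.

```python
from collections import Counter


def equalize_channel(values, max_value=255):
    """Map each value to max_value * (number of pixels <= value) / (total pixels)."""
    n = len(values)
    counts = Counter(values)
    cumulative = {}
    running = 0
    for v in sorted(counts):
        running += counts[v]
        cumulative[v] = running  # pixels with a value less than or equal to v
    return [round(max_value * cumulative[v] / n) for v in values]
```

The output depends only on the rank order of the values, so a global monotonic change such as an exposure or gamma change gives the same equalized channel, while local changes that reorder pixels do not.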

Page 28: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Stereo Color Histogram Equalization (SCHE)

• The equalization uses the transformed color value obtained by applying the linear function estimated from the joint pdf.

Page 29: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Boosting the Disparity Map Estimation Using SCHE Images

• The color consistency of the SCHE images is based on the accurate disparity map estimation.

• Conversely, the estimation of the disparity map can benefit from the color-consistent SCHE stereo images.

Page 30: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Experimental Results

• Test images with ground-truth disparity maps are used, such as the Aloe, Dolls, Moebius, Art, Laundry, Reindeer, Rocks1, and Cloth4 datasets in [40], which have different radiometric variations.

• Each dataset has three different camera exposures (0–2) and three different configurations of the light source (1–3).

• The total running time of our method for the Aloe images (size: 427 × 370, disparity range: 0–70), for example, is about 4 minutes on a PC with an Intel(R) Core(TM) i7-2600K 4.5GHz CPU.

[40] “http://vision.middlebury.edu/stereo/,” 2012.

Page 31: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Experiments for SCHE

Page 32: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee
Page 33: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Color Consistency Performance

• RMSE is computed for the images produced by [39], which is performed individually on the left and right original images.

• ‘CHE1’ denotes the RMSE of the stereo images after performing CHE individually on the original input stereo images.

• ‘CHE2’ denotes the RMSE of the stereo images after performing CHE individually on the stereo images in the log-chromaticity color space.

[39] G. Finlayson, S. Hordley, G. Schaefer, and G. Y. Tian, “Illuminant and device invariant colour using histogram equalisation,” Pattern Recognition, vol. 38, no. 2, pp. 179–190, 2005.

Page 34: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Effects of SIFT Weight in MI Computation

• We turned off both the SIFT term in (10) and the plane-fitting constraint in (17) in our data cost.

Page 35: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Effects of Adaptive Weight

• We compared the results of the preliminary data cost in (10) while varying the weight, which is fixed to the same value for all pixels.

Page 36: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Stereo Matching Performance Comparison

• Census transform [41]

• Mutual Information (MI) [22]

• Normalized Cross Correlation (NCC) and Adaptive Normalized Cross Correlation (ANCC) [15].

• To evaluate the effects of exposure changes, we only changed the index of exposure.

Page 37: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Stereo Matching Performance Comparison

Page 38: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee
Page 39: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Different Configurations of the Light Source

• Only changed the index of the light configuration while fixing the exposure.

Page 40: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee
Page 41: Yong  Seok Heo , Kyoung  Mu  Lee and  Sang  Uk Lee

Tests for Scenes With Different Cameras

• The left images were taken by Canon IXUS 870 IS, and the right images were taken by Sony Cyber-shot DSC-W570.

• The left images were taken with flash, while the right images were taken without flash.

• The exposure times of the left and right images were set to 1/60 sec and 1/100 sec, respectively.