
Improvements of image fusion methods

Yotam Ben-Shoshan and Yitzhak Yitzhaky*
Ben-Gurion University, Department of Electro-Optics Engineering, Beer-Sheva 84105, POB 653, Israel

Abstract. Fusion of images from different imaging modalities, obtained by conventional fusion methods, may cause artifacts, including destructive superposition and brightness irregularities, in certain cases. This paper proposes two methods for improving multimodal image fusion quality. Based on the finding that a better fusion can be achieved when the images have a more positive correlation, the first method is a decision algorithm that runs at the preprocessing fusion stage and determines whether a complementary gray level of one of the input images should be used instead of the original one. The second method is suitable for multiresolution fusion, and it suggests choosing only one image from the lowest-frequency sub-bands in the pyramids, instead of combining values from both sub-bands. Experimental results indicate that the proposed fusion enhancement can reduce fusion artifacts. Quantitative fusion quality measures that support this conclusion are shown. © 2014 SPIE and IS&T [DOI: 10.1117/1.JEI.23.2.023021]

Keywords: image fusion; multisensor fusion; multiresolution decomposition; image fusion evaluation.

Paper 13465P received Aug. 21, 2013; revised manuscript received Feb. 9, 2014; accepted for publication Mar. 24, 2014; published online Apr. 25, 2014.

1 Introduction

Techniques that can effectively combine information from different sources have become a need due to the growing types and number of sensors on imaging systems, and because it is hard for humans to effectively integrate visual information by viewing multiple displays separately.1 Hence, there is a need to combine the streams of information into one coherent representation in order to cope with the data overload. Fusion implementations are broadly common in medical,2–4 security,5 aviation,6,7 geoscience,8 vehicle,9 and other applications.10 For example, heavy fog would block a pilot's vision, but by fusing a laser imaging sensor image with the pilot's view, we could dramatically improve the pilot's vision. The same could be implemented in cars, boats,6 and so on.

Conventional fusion methods, either arithmetic or multiresolution, may suffer from artifacts stemming from the fusion process. As will be detailed in the next section, these artifacts include destructive superposition and brightness irregularities in the fused image.

In this paper, we suggest two techniques to decrease fusion artifacts when using fusion methods such as arithmetic or multiresolution fusion. The idea is to adapt parameters of the fusion process according to characteristics of the source images and their corresponding properties. In the first method, we propose to determine whether a change of polarity of one of the images (preferably the nonvisual one) can improve the fusion result,11 where a change of polarity is defined as turning every gray-level pixel into its complementary value (i.e., bright becomes dark and vice versa) in order to increase the positive intensity correlation between the images. The second method relates to multiresolution fusion techniques and decides which of the lowest-frequency sub-bands from the two multiresolution decompositions will be used in the fusion reconstruction process. The lowest-frequency sub-band will be referred to as the base image.

The two techniques do not necessarily constitute a single framework, and each of the methods can be implemented separately.

The idea of inverting polarity in thermal imagery to better resemble the visible band is well known. However, an automatic method is suggested here for this aim. Previous methods that performed fusion of thermal and visual data (for various applications, such as in Refs. 12, 13, and 14) did not consider an automatic polarity control.

In general, methods to evaluate image quality can be roughly divided into two categories: reference-based methods, such as the peak signal-to-noise ratio, root mean square error, and structural similarity,15 and no-reference methods, which are based on measures such as standard deviation, entropy, cross entropy, mutual information,16 and more. In multimodal image fusion, the reference-based methods noted above are not appropriate because an ideal reference fused image usually does not exist. In order to assess our enhancement of the fusion methods also with an objective quantitative evaluation, we employed two popular existing algorithms that aim for objective fusion evaluation. Although the ideal reference fused image does not exist, specific features in the input images, such as edges or any other salient features, should largely be preserved. Therefore, when such features are used for the evaluation, they can be considered reduced-reference measures. We chose Xydeas and Petrovic's quality measure,17 which is based on an edge preservation property, and Wang and Bovik's quality measure,18 which compares salient structural characteristics of the input and the fused images. Both of these methods have been used in the literature for evaluation of fused images.19

The rest of this paper is organized as follows. Section 2 gives a brief review of image fusion approaches and their potential reconstruction artifacts, where the first artifact (destructive superposition) is explained in Sec. 2.1.1 and the second artifact (brightness irregularities) is explained in Sec. 2.2.1. In Sec. 3, the algorithm that determines the polarity of the images is described. Section 4 describes how to choose the base images in multiresolution fusion methods. Fusion results and discussion are in Sec. 5. Conclusions are given in Sec. 6.

*Address all correspondence to: Yitzhak Yitzhaky, E-mail: [email protected]

0091-3286/2014/$25.00 © 2014 SPIE and IS&T

2 Main Image Fusion Approaches and Potential Reconstruction Artifacts

In general, the objective of image fusion is to combine information from multiple input images of the same scene while preserving the salient features from each image. Image fusion can be classified into three processing levels:20 pixel-level, feature-level, and decision-level. Image fusion at the pixel level means that the actual measured physical data are fused on an individual pixel basis. Fusion at the feature level requires extraction of objects recognized in the various data sources; features correspond to characteristics extracted from the initial images, which depend on their environment. Fusion at the decision level represents a method that uses value-added data, where the input images are processed individually for information extraction. The obtained information is then combined by applying decision rules to reinforce a common interpretation, resolve differences, and furnish a better understanding of the observed objects. Pixel-based methods have the advantage over feature- or decision (information)-level fusion of using the most original data. They avoid a loss of information that may occur during a feature extraction process. However, as described below, fusion artifacts may occur when the input images have certain properties. This paper deals with pixel-level image fusion, where such artifacts may occur.

Two main pixel-level approaches for image fusion are arithmetic fusion and multiresolution fusion. These methods and their possible artifacts are described below.

2.1 Arithmetic Fusion Methods

Arithmetic fusion algorithms10,21 produce the fused image as an arithmetic combination of corresponding pixels in the input images. Arithmetic fusion can be summarized by the expression

F(j, k) = W_1 \cdot I_1(j, k) + W_2 \cdot I_2(j, k) + C, \qquad (1)

where (j, k) are the image indexes, I_1 and I_2 are the input images, F is the fused image, W_1 and W_2 are the relative weights of each pixel in the input images, and C is the mean offset. For instance, W_1 = 0.5, W_2 = 0.5, and C = 0 produces the average of the two images. Arithmetic methods are usually simpler; hence, they are computationally efficient. In addition, they do not suffer from the reconstruction artifacts discussed in Sec. 2.2.1.
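As a concrete illustration, the following is a minimal sketch of Eq. (1) in Python with NumPy. The function name and the default parameter values are our own choices for the example, not part of the paper.

```python
import numpy as np

def arithmetic_fusion(i1, i2, w1=0.5, w2=0.5, c=0.0):
    """Pixel-wise arithmetic fusion, Eq. (1): F = W1*I1 + W2*I2 + C."""
    fused = w1 * i1.astype(np.float64) + w2 * i2.astype(np.float64) + c
    # Clip back to the 8-bit gray-level range for display.
    return np.clip(fused, 0, 255).astype(np.uint8)
```

With the default values above, the function returns the plain average of the two registered input images.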

2.1.1 Arithmetic fusion drawback

The main drawback of these methods is destructive superposition, which is worst when two corresponding image features have a reverse polarity relationship. The outcome is a loss of contrast. Figure 1 shows an example of a destructive superposition fusion. It can be seen that the marked object has a reverse polarity relationship between the two source images [i.e., in the regular visual image in Fig. 1(a) it appears bright, while in the thermal image in Fig. 1(b) it appears dark]. Hence, the result of the superposition between the two modalities creates a poor-contrast image [Fig. 1(c)].

2.2 Multiresolution Fusion Methods

Multiresolution fusion methods are generally based on decomposition of the input images into a series of sub-band signals that are organized into pyramid levels of decreasing resolutions; each stage represents a different part of the spatial spectrum of the image. The high stages of the pyramids represent coarse-resolution information, while the lower stages represent finer resolution. Fusing images by multiresolution methods enables the algorithm to take into account features from different scales (pyramid levels) when creating the fused image. This property is especially important when two features overlap, and it is the main reason for multiresolution methods to outperform arithmetic methods. The second reason is the scale separation (each scale is compared and fused separately), which limits the damage of a wrong feature selection. However, a main drawback that may occur in multiresolution methods is described in Sec. 2.2.1. In this paper, we have used some known multiresolution methods for examining the proposed fusion enhancement techniques.
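To make the decomposition and reconstruction steps concrete, the following is a rough Python/OpenCV sketch of Laplacian pyramid fusion with the common "select-maximum" coefficient rule (the rule used in Sec. 5). It assumes two registered, equal-size 8-bit inputs, and it simply averages the base (lowest-frequency) images; that base-fusion step is exactly what the method of Sec. 4 replaces. Function names are ours.

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels):
    """Decompose an image into `levels` detail sub-bands plus a base image."""
    gauss = [img.astype(np.float64)]
    for _ in range(levels):
        gauss.append(cv2.pyrDown(gauss[-1]))
    details = []
    for i in range(levels):
        h, w = gauss[i].shape[:2]
        up = cv2.pyrUp(gauss[i + 1], dstsize=(w, h))
        details.append(gauss[i] - up)      # band-pass detail at scale i
    return details, gauss[-1]              # (detail sub-bands, base image)

def fuse_laplacian(img_a, img_b, levels=5):
    det_a, base_a = laplacian_pyramid(img_a, levels)
    det_b, base_b = laplacian_pyramid(img_b, levels)
    # "Select-maximum" rule: keep, per pixel, the detail coefficient
    # with the larger absolute value.
    fused_det = [np.where(np.abs(da) >= np.abs(db), da, db)
                 for da, db in zip(det_a, det_b)]
    fused = (base_a + base_b) / 2.0        # naive base-image fusion (see Sec. 4)
    # Reconstruct: repeatedly upsample and add the detail bands, finest last.
    for detail in reversed(fused_det):
        h, w = detail.shape[:2]
        fused = cv2.pyrUp(fused, dstsize=(w, h)) + detail
    return np.clip(fused, 0, 255).astype(np.uint8)
```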

Multiresolution transform domain techniques present features of the images at different scales. The first multiscale image fusion was presented in 1985 by Burt and Adelson, who used Laplacian pyramids22 to merge images by their gradient magnitudes. Toet and Valeton, in 1989, chose contrast features to merge images of different modalities by using the ratio of low-pass pyramids (ROLP).23 Nonlinear filters, such as morphological operators, were used by Pei and Chen in 1995 in order to decompose images.23,24 Wavelet decomposition approaches were first introduced by Li et al.,25 and they added versatility to the multiresolution approaches. Despite the success of classical wavelet methods, some limitations reduce their effectiveness in certain situations. For example, wavelets rely on a dictionary of roughly isotropic elements, and their basis functions are oriented only in a small number of directions due to the standard tensor product construction in two dimensions. Those limitations have been tackled, for instance, by using the undecimated wavelet transform with spectral factorization and nonorthogonal filter banks, as suggested by Ellmauthaler et al.26

Fig. 1 Destructive superposition example. (a) and (b) Input images (visual and thermal, respectively). (c) Arithmetically fused image.

2.2.1 Reconstruction artifacts in multiresolution fusion methods

Objects in an image frequently contain smooth and somewhat uniform surfaces near edges. In part of the cases, the same surfaces in the two images have opposite brightness levels (bright versus dark), as produced by the different modalities. During the multiresolution fusion reconstruction, such locations are often constructed from coefficients of both images due to the influence of the nearby edge, which causes local intensity variations at the different scales. As a result, discontinuities may occur in the fused sub-band signal,27 which will create brightness irregularities in the reconstruction. Even if only one coefficient is replaced by another coefficient (during the fusion process), it may cause a reconstruction artifact in the fused image.28 Figure 2 presents a demonstration of the fusion brightness irregularities phenomenon. The two objects in the input images [Figs. 2(a) and 2(b)] have a reverse-polarity brightness value (black in one image is white in the other). It can be seen in this simple case of reverse-polarity images that brightness irregularities occur near the step-edge lines, where a wide range of spatial frequencies exists. The reason relates to the use of low-pass filtering during the process of building the pyramid pixels of the step area. During the fusion process, the algorithm needs to decide, for each pixel at each scale, from which input image to take its information. Usually the decision is based on the larger absolute value. The reverse polarity around the edge creates gradients with opposite signs in different levels of the pyramids. Therefore, the corresponding pixels in the fused area around the edge may also have opposite signs when the selected pixels in that area are not taken only from the sub-bands of just one of the input images. In Fig. 2(e), the regular "max" rule is used for the base images as well. In addition to the brightness irregularities near the edges, the contrast between the two parts of the images is minimal, so the existence of two objects is not clear. In Figs. 2(f) and 2(g), the base images of Figs. 2(a) and 2(b) were used, respectively. It can be seen that in both cases, the existence of two objects is clear, but there are also brightness irregularities near the edges. These irregularities disappear in Fig. 2(h), where a reverse polarity is performed on one of the images.
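A tiny one-dimensional illustration of this sign conflict (not from the paper; a 5-tap binomial filter stands in for the pyramid's low-pass stage):

```python
import numpy as np

# Two 1-D "images" containing a step edge with opposite polarity.
a = np.r_[np.zeros(16), 255.0 * np.ones(16)]
b = 255.0 - a

def detail_band(signal):
    """Detail band: the signal minus a low-pass version of itself."""
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    return signal - np.convolve(signal, kernel, mode="same")

da, db = detail_band(a), detail_band(b)
# Around the step, da and db are exact negatives of each other, so a
# max-absolute-value selection can mix coefficients of conflicting sign
# and leave brightness irregularities after reconstruction.
print(np.round(da[13:19], 1))
print(np.round(db[13:19], 1))
```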

Fig. 2 Reverse-polarity images and fusion artifacts. (a) and (b) Artificial input images. (c) and (d) Laplace pyramids of input images (a) and (b), respectively. (e) Laplace fusion using the "max" fusion selection rule for the highest-scale image as well (the regular procedure). (f) Laplace fusion using image (a) as the base image for reconstruction. (g) Laplace fusion using image (b) as the base image for reconstruction. (h) Laplace fusion of image (a) and a reversed polarity of image (b).

3 Determination of Image Polarity

We showed in Sec. 2 the significant drawbacks that may occur in both mentioned fusion approaches. Arithmetic fusion methods may cause loss of contrast, and multiresolution methods may suffer from brightness irregularity reconstruction artifacts. Both types of drawbacks are especially noticeable when the fused features are in a reverse polarity relationship. In the arithmetic fusion case, pixels will cancel each other, while in the multiresolution fusion case, larger brightness irregularities will be caused by coefficients that stem from areas in the original images that have reverse polarity. Hence, we postulate that in both cases, fusion results can be improved by making the images from the different modalities look more similar. In order to do that correctly, we must use a global transformation on the image; otherwise it will change the image information. The idea of our fusion modification technique is to decide, in the preprocessing stage, whether changing the polarity of one input image will improve the fusion result.

3.1 Increasing Positive Correlation

To make this decision, a two-dimensional space is constructed (Fig. 3), in which each point represents the values of the same pixel location in the two corresponding images (the horizontal and vertical coordinates in this space are the gray-level values in the two images). For example, a point at the middle of the space means a pixel with the same gray level, 128, in the two images. This two-dimensional space enables us to better understand the gray-level relationships between the two images. Practically, when looking at a pair of images, we count the number of pixel pairs that are in a positive correlation and how many have a negative correlation. If more pairs of pixels satisfy a negative correlation relationship, then a change of polarity for one of the sensors can improve fusion quality. The change of polarity should be performed on the image in which it will not harm the conventional perception of the image information; hence, polarity changes should be performed on images that were not recorded in the visual band, but in other bands, such as the thermal.
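One possible way to build this correspondence space is as a joint gray-level histogram; the sketch below assumes 8-bit registered inputs, and the function name is ours.

```python
import numpy as np

def correspondence_space(img_a, img_b, bins=256):
    """Two-dimensional gray-level correspondence space (Fig. 3): entry (i, j)
    counts the pixels whose gray level falls in bin i for image A and bin j
    for image B."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                bins=bins, range=[[0, 256], [0, 256]])
    return hist
```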

3.1.1 Mask design considerations

As stated, the mask's role is to determine whether changing the polarity of one of the images will increase the positive correlation between the input images. The mask explains graphically the kind of relationship we aim to recognize by dividing the two-dimensional space into three categories.

1. High positive correlation areas, marked as blue regions (locations where the pixel values in both images are >80% or in both images are <20% of the intensity range).

2. High negative correlation areas, marked as red regions (locations where pixel values are >80% of the intensity range in one image and <20% of the range in the other image).

3. Low correlation areas that do not give us significant information on whether changing the polarity would improve the fusion results.

The reason that we take into account only the highest and lowest 20% of the gray levels is that the pixels that would be most affected by the change-of-polarity operation are those with significantly dark or bright values, while mid-level gray pixels would be less affected. For example, in the most extreme cases, if we change the polarity of a pixel with a gray level of 128, the result is 127, while inverting 0 gives 255. The 20% value was chosen empirically. Note that these extreme gray levels control only the decision about the polarity change, but the change itself is applied to all the pixels in the image.
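A possible implementation of this decision rule, following the description above (the 20% thresholds and the counting of pixel pairs in the high-positive and high-negative correlation regions), is sketched below. It assumes 8-bit gray-level inputs; the function and variable names are ours.

```python
import numpy as np

def should_invert(visual, nonvisual, frac=0.2):
    """Decide whether inverting the non-visual image would increase the
    positive correlation between the inputs, per the mask of Fig. 3."""
    lo, hi = 255 * frac, 255 * (1 - frac)
    a = visual.astype(np.float64)
    b = nonvisual.astype(np.float64)
    # Pixel pairs in the high-positive-correlation corners (both dark or
    # both bright) versus the high-negative-correlation corners.
    positive = np.sum(((a > hi) & (b > hi)) | ((a < lo) & (b < lo)))
    negative = np.sum(((a > hi) & (b < lo)) | ((a < lo) & (b > hi)))
    return negative > positive

# Usage: the whole non-visual image is inverted only when the decision holds.
# if should_invert(visual_img, thermal_img):
#     thermal_img = 255 - thermal_img
```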

Figures 4 and 5 present an example of our polarity determination method. Figures 4(a) and 4(b) show, respectively, visual and thermal images of a runway (obtained from Ref. 28). Figure 4(c) shows the gray-level correspondence space for these two images. It can easily be seen that in this case, there are more points that indicate negative correlation (the points at the upper-left square) than points that indicate positive correlation (the points at the upper-right square). Figure 5 shows the result after we change the polarity of the thermal image. Figure 5(b) shows the changed-polarity thermal image, compared to the same visual image in Fig. 5(a). In this case, it is clearly seen in Fig. 5(c) that more points indicate positive correlation (the points at the upper-right square) than points that indicate negative correlation (the points at the upper-left square). This means that a change of polarity would give better fusion results for these images.

[Fig. 3 graphic: gray-level axes from 0 to 255 for image A (horizontal) and image B (vertical); the legend regions mark where a change of polarity will improve fusion, will harm fusion, or affects fusion less.]

Fig. 3 A two-dimensional gray-level correspondence space between image A (horizontal axis) and image B (vertical axis) with a polarity decision mask, as defined in the legend.


4 Choosing Base Images in Multiresolution Fusion Methods

Multiresolution fusion methods transform the images into a set of image representations at different scales. Such representations enhance features such as object boundaries, which are usually less dependent on the imaging modality. Therefore, such representations have an advantage in multimodal image registration and fusion. As a result, the fusion is an easier task, mainly because the combining stage needs to decide between the same kinds of visual information. This representation takes place at different scales for all the pyramid stages except for the highest pyramid stage with the lowest-frequency information (termed here the base image), which depends heavily on the illumination and physical imaging properties of the sensors. Hence, fusion between the two lowest-frequency image bands may eventually generate a confusing fused image for the human observer. Accordingly, we concluded that fusing the base images with the same rules as at the higher-frequency bands of the pyramids may harm the visual output in part of the cases. Figures 2(e), 2(f), and 2(g) demonstrate the visual effect of using different base images on the fusion results. We can see that in such a case, when the same rule of fusion is applied to the base images, a loss of contrast occurs [Fig. 2(e)]. On the other hand, when a single base image was chosen, a high contrast was preserved.

Our experiments indicate that in part of the cases, a better action is to choose the low-frequency band of only one of the pyramids. A good choice of the base image can be based on subjective preferences; however, if an automatic process is desired, we propose to select the base image that belongs to the input image that contributed more information in the fusion process (i.e., from which more pixels were selected to be included in the fused image). This input image usually has more informative spatial features (at the different scales), and therefore, it is likely the more visually dominant image.

In order to calculate which image contributes more visual information, we weight each pixel according to the scale level it has been taken from. A pixel from a high scale influences more pixels in the fused image and, therefore, will have a bigger weight. For example, a pixel from the second scale of a fusion pyramid affects four pixels in the fused image. Therefore, it will have a weight four times higher than that of a pixel from the first scale.

We term the total contribution of an input image to the fused image as TCIF (excluding the contribution of the highest-scale, i.e., lowest-frequency, band). TCIF is obtained by summing all the contributions of all the pixels in the different scales, normalized by the weighted total sum. Hence, the TCIF of image A is defined as

\mathrm{TCIF}_A = \frac{\sum_{i=1}^{L-1} \sum_{x} \sum_{y} D_i^A(x, y) \cdot w_i}{\sum_{i=1}^{L-1} \sum_{x} \sum_{y} w_i}, \qquad (2)

where L is the number of scales, x and y are the coordinates of sub-band i, and D_i^A(x, y) is a decision matrix indicating for each pixel in sub-band i whether it affects the fused image or not. For example, when the decision rule is the maximum between the absolute values, the decision matrix is obtained by

D_i^A(x, y) = \begin{cases} 1, & \text{if } |A_i(x, y)| > |B_i(x, y)| \\ 0, & \text{otherwise} \end{cases}, \qquad (3)

where A_i and B_i are the multiresolution decomposition sub-bands i of the input images A and B. Each pixel is weighted by w_i according to its scale level i:

w_i = 4^{(i-1)}. \qquad (4)

Fig. 4 A case of negative correlation between input images. (a) Visual image. (b) Infrared image (images taken from Ref. 28). (c) The two-dimensional (2-D) polarity decision space (described in Fig. 3) showing the negative correlation.

Fig. 5 Positive correlation after changing the polarity of image (b). (a) Visual image. (b) Infrared image with reversed polarity. (c) The 2-D polarity decision space (described in Fig. 3) showing the positive correlation.

Note that since the TCIF measure in Eq. (2) is normalized, TCIF_B = 1 − TCIF_A. Eventually, the base image is taken from the input image whose TCIF parameter is bigger. A schematic diagram of the TCIF process for selecting the base image is shown in Fig. 6. This formulation is suitable for multiresolution decomposition methods such as Laplace pyramids, ROLP, and morphological pyramids. However, for the wavelet fusion case, some adjustments should be made, such as performing the sum over scales in Eq. (2) up to L rather than L − 1, while still excluding the lowest-frequency band ("low-low") from the calculation.
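The following sketch computes TCIF according to Eqs. (2) to (4) for the max-absolute-value fusion rule, assuming the detail sub-bands are supplied as lists ordered from the finest scale (i = 1) to the coarsest, with the base band excluded as stated above; names are ours.

```python
import numpy as np

def tcif(detail_a, detail_b):
    """Total contribution of input image A to the fused image, Eq. (2).
    detail_a[k] and detail_b[k] are the detail sub-bands of scale k + 1,
    ordered fine to coarse; the base (lowest-frequency) band is excluded."""
    num, den = 0.0, 0.0
    for i, (a_i, b_i) in enumerate(zip(detail_a, detail_b), start=1):
        w_i = 4.0 ** (i - 1)                 # Eq. (4): weight grows with scale
        d_a = np.abs(a_i) > np.abs(b_i)      # Eq. (3): decision matrix D_i^A
        num += w_i * d_a.sum()
        den += w_i * d_a.size
    return num / den

# Since TCIF_B = 1 - TCIF_A, the base image is taken from input A when
# tcif(detail_a, detail_b) > 0.5 and from input B otherwise.
```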

Figure 7 presents an example of the effect of using different base images on the appearance of the fused image in a multiresolution fusion method (Laplace pyramid). Figures 7(a) and 7(b) show the two input images obtained, respectively, from a thermal imaging system and from a laser-gated imaging (LGI) system (images obtained from Elbit Systems Ltd). An LGI system illuminates the scene with laser pulses (in the near-IR band) and records the reflections from the scene.29 Figures 7(c), 7(d), and 7(e) are the Laplace fusion results of the images in Figs. 7(a) and 7(b), using different types of base images. In Fig. 7(c), it can be seen that the illumination of that image was taken from Fig. 7(b). In Fig. 7(d), the dominant illumination is taken from Fig. 7(a), and in Fig. 7(e), the fusion rule was applied to the base images, generating an image whose luminance is constructed from the maximal pixel values. It can be seen that using different base images in the fusion process has a vital influence on the fused image luminance. In this example, the bigger TCIF varies according to the fusion method. For the Laplace fusion method, the LGI image TCIF is 0.72, while for the wavelet fusion method, the thermal TCIF is 0.53. For the undecimated wavelet method, the LGI image TCIF is 0.72, while for the wavelet fusion method, the thermal TCIF is 0.53. In each case, the TCIF for the other input image is the complement to one.

Fig. 6 A schematic diagram of the total contribution of an input image to the fused image (TCIF) process for selecting the base image. The two input images are represented by A and B. The TCIF formulation is according to Eq. (2).

Fig. 7 Impact of the base image selection on the resulting fusion: Laplace pyramid fusions of laser-gated imaging (LGI) and thermal images (obtained from Elbit Systems Ltd) for different cases of base images. (a) Input thermal image. (b) Input LGI image. (c) Fusion where (b) is the base image. (d) Fusion where (a) is the base image. (e) Fusion where a superposition of (a) and (b) is the base image.

5 Experimental Results and Analysis

In this section, the proposed modifications in the fusion process are demonstrated and evaluated. The performance of the proposed technique is tested and compared using the following fusion methods: an arithmetic average fusion, the Laplace pyramid fusion with five decomposition levels,22 a wavelet fusion25 with Haar wavelet functions and three decomposition levels, and the undecimated wavelet method26 using Haar wavelet functions and three decomposition levels. The fusion rule in the multiresolution methods was the "select-maximum." Note that other wavelet functions have been examined for the wavelet-based methods, and the performances of the proposed method were quite similar.

Fig. 8 Laplace pyramid fusions of the visual and thermal images for different cases of polarity and base images. (a) Input visible image.28 (b) Input thermal image.28 (c) Reverse polarity of input (b) [neg(b)]. (d) Fusion of (a) and (b), (a) is the base image. (e) Fusion of (a) and (b), (b) is the base image. (f) Fusion of (a) and (b), max[(a),(b)] is the base image. (g) Fusion of (a) and (c), (a) is the base image. (h) Fusion of (a) and (c), (c) is the base image. (i) Fusion of (a) and (c), max[(a),(c)] is the base image.

Fig. 9 Laplace pyramid fusions of visual and thermal images for different cases of polarity and base images. (a) Input visible image.30 (b) Input thermal image.30 (c) Reverse polarity of input (b). (d) Fusion of (a) and (b), (a) is the base image. (e) Fusion of (a) and (b), (b) is the base image. (f) Fusion of (a) and (b), max[(a),(b)] is the base image. (g) Fusion of (a) and (c), (a) is the base image. (h) Fusion of (a) and (c), (c) is the base image. (i) Fusion of (a) and (c), max[(a),(c)] is the base image.

In order to include an objective evaluation of the results of our method, the resulting fused images are evaluated by known quantitative quality measures. Quantitative evaluation of the quality of a fused image is a difficult task, which might be application dependent. We chose two methods that measure fused image quality for this aim:19 Xydeas and Petrovic's measure,17 and the measure based on Wang and Bovik's index.18 Both methods measure the amount of information transferred from the input images to the fused image. Since a ground-truth ideal fused image does not exist, these methods can only estimate the fusion quality based on the input images, which are considered a reduced reference. A quality measure value closer to 1 indicates a better fused image, whereas a value closer to 0 indicates a poorly fused image. The measure of Xydeas and Petrovic, which is based on an edge preservation property, considers added intensity variations at previously smooth regions as false (i.e., features that reduce the quality index) because such features do not match similar features in the input images.
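For reference, the following is a sketch of the underlying universal image quality index of Wang and Bovik,18 computed here globally over whole images for brevity; the published index (and its use for fusion evaluation) is normally computed over local sliding windows and averaged, and the fusion measure combines the values obtained between each input image and the fused image.

```python
import numpy as np

def uiqi(x, y):
    """Universal image quality index (global version). It jointly penalizes
    loss of correlation, luminance distortion, and contrast distortion, and
    equals 1 only when x and y are identical."""
    x = x.astype(np.float64).ravel()
    y = y.astype(np.float64).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return 4.0 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))
```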

Figures 8, 9, and 10 show examples of various fusion cases obtained from different combinations of thermal image polarity and base images, with regard to the decision of the proposed method. In the case of visual/thermal fusion, we examined only the option of reversing the polarity of the thermal image because reversing the visual image may cause non-natural visual effects. For example, reversing the polarity of the visible image would make shadows look bright in the fused image [shadows exist, for example, in Fig. 9(a)]. Figure 8 presents fusion results of the visual and thermal images of the runway for different cases of polarity and base images. It is quite clear that a reverse polarity relationship exists between the runway marks of the two modalities (bright in the visible and dark in the thermal). Hence, reconstruction artifacts are most likely to take place when employing the fusion methods. In the example of Fig. 8, the proposed polarity determination algorithm (Sec. 3) chose to change the polarity of the thermal image. It can be seen visually that the fusion results support this decision (fewer reconstruction artifacts in the lower row compared with the middle row). Quantitative measures of the contrast in fused images can be seen in Ref. 5. Since fusion methods do not necessarily produce similar results, the base image selection algorithm (Sec. 4) also does not produce identical decisions, and depends on the results of each method. The Laplace fusion produced a TCIF of 0.71 for the visual image and 0.29 for the thermal. The wavelet fusion produced a TCIF of 0.61 for the thermal image and 0.39 for the visual, and the undecimated wavelet produced a TCIF of 0.53 for the thermal image and 0.47 for the visual. Therefore, for the Laplace fusion, the chosen base image would be from the visual image, while for the wavelet and undecimated wavelet fusion cases, the chosen base image would be the negative thermal image.

Fig. 10 Laplace pyramid fusions of visual and thermal images for different cases of polarity and base images. (a) Input visible image.28 (b) Input thermal image.28 (c) Reverse polarity of input (b). (d) Fusion of (a) and (b), (a) is the base image. (e) Fusion of (a) and (b), (b) is the base image. (f) Fusion of (a) and (b), max[(a),(b)] is the base image. (g) Fusion of (a) and (c), (a) is the base image. (h) Fusion of (a) and (c), (c) is the base image. (i) Fusion of (a) and (c), max[(a),(c)] is the base image.

Table 1 presents estimated fusion quality indexes (of Xydeas and Petrovic17 and Wang and Bovik18) for the combinations of input images presented in Figs. 8(a) to 8(c), for the Laplace, wavelet, undecimated wavelet, and arithmetic fusion methods, and for different combinations of polarity and base images. The decision of the polarity determination algorithm is supported by the quantitative quality assessment results in Table 1, in which the fusions shown in the lower row of Fig. 8 obtained higher quality scores. In this case, it cannot be stated that one base image is clearly better to choose, because the influence of the base image is visually not significant (unlike the significant influence of the polarity decision). Also, the different indexes (shown in Table 1) give different results.

Figure 9 presents fusion results of visual and thermal images (taken from Ref. 30) for different cases of polarity and base images. In this case, it is harder to decide what the better polarity state for the thermal image is. From the fusion results it can be seen that changing the polarity would generate reconstruction artifacts on the roof of the left building in the scene. For the Laplace fusion, the method produced a TCIF of 0.79 for the visual image and 0.21 for the thermal. For the wavelet fusion, it produced a TCIF of 0.62 for the visual image and 0.38 for the thermal. The undecimated wavelet produced a TCIF of 0.57 for the visual image and 0.43 for the thermal image. Hence, all the fusion methods chose the visual image as the base image.

Table 2 shows the estimated fusion quality indexes for the input images in Figs. 9(a) to 9(c), for the Laplace, wavelet, undecimated wavelet, and arithmetic fusion methods, for different combinations of polarity and base images. The decision of the polarity determination algorithm is also supported by the quantitative quality assessment results in Table 2, in which the fusions shown in the middle row of Fig. 9 obtained higher quality scores. In this case as well, it cannot be stated that one base image is clearly better to choose, because the influence of the base image is visually not significant (unlike the significant influence of the polarity decision). Also, the different indexes (shown in Table 2) give different results.

Figure 10 presents fusion results of visual and thermal images (taken from Ref. 28) for different cases of polarity and base images. In this case, the thermal image is more informative, and the polarity state algorithm decides not to change the polarity of the thermal image. For the Laplace fusion, the base selection method produced a TCIF of 0.47 for the visual image and 0.53 for the thermal. For the wavelet fusion, it produced a TCIF of 0.48 for the visual image and 0.52 for the thermal. The undecimated wavelet transform produced a TCIF of 0.42 for the visual image and 0.58 for the thermal. Therefore, the thermal base image was chosen by all tested fusion methods. Table 3 presents the estimated fusion quality indexes for the input images in Figs. 10(a) to 10(c), which mostly support the algorithm decision.

It should be noted that since the numbers in the tables are reduced-reference quality measures, the results obtained with the selections of the proposed method are not supposed to always be the best in the table (i.e., the highest numbers in their columns). Therefore, the quality measures employed in the tables do not always agree with each other. However, the tables do show that, in general, there is an agreement of both fusion quality measures with the selections of the proposed method. Our results are always among the higher numbers of both measures (i.e., in about two-thirds of the cases, the selections coincide with the highest number in the column, and in 100% of them, they coincide with the highest half).

Table 1 Estimated objective fusion quality indexes [suggested by Xydeas and Petrovic (Ref. 17), denoted XP, and Wang and Bovik (Ref. 18), denoted WB] for combinations of the input images presented in Figs. 8(a) to 8(c), for the Laplace, wavelet, undecimated wavelet, and arithmetic fusion methods, and for different combinations of polarity and base images. The symbol neg(b) indicates the polarity-reversed input image (b). A quality measure value closer to 1 indicates a better fused image according to the measure.

Input images   | Base image      | Laplace XP | Laplace WB | Wavelet XP | Wavelet WB | Undecimated XP | Undecimated WB | Arithmetic XP | Arithmetic WB
(a) and (b)    | (a)             | 0.543 | 0.415 | 0.474 | 0.465 | 0.548 | 0.373 | 0.340 | 0.507
(a) and (b)    | (b)             | 0.540 | 0.481 | 0.447 | 0.476 | 0.510 | 0.444 |       |
(a) and (b)    | max[(a),(b)]    | 0.546 | 0.408 | 0.475 | 0.469 | 0.559 | 0.451 |       |
(a) and neg(b) | (a)             | 0.583 | 0.584 | 0.506 | 0.622 | 0.561 | 0.515 | 0.440 | 0.683
(a) and neg(b) | neg(b)          | 0.602 | 0.484 | 0.540 | 0.676 | 0.608 | 0.512 |       |
(a) and neg(b) | max[(a),neg(b)] | 0.594 | 0.427 | 0.522 | 0.662 | 0.608 | 0.622 |       |

In the third column of results in Table 3, we can see that according to the Xydeas and Petrovic measure, there is no significant difference among four out of the six combinations examined. On the other hand, according to the Wang and Bovik measure for the same case, there is a significant difference between the upper and lower parts of the table (with/without reverse polarity). This means that in this case, the Wang and Bovik measure agrees with the selection of the proposed method. This is supported by the visual results, as it can be seen that the result in Fig. 10(e) (the selection of the proposed method) looks more informative than the result in Fig. 10(h). It can also be seen in the table that in the examination of different base images, there are some similar pairs of results, i.e., cases where the selection of a certain base image is very similar to the use of the fusion rule also for the base images. This occurs when the fusion rule, max[(a),(b)], produces a base image similar to one of the original base images because it is brighter than the other base image at almost all pixels.

Table 2 Estimated objective fusion quality indexes [suggested by Xydeas and Petrovic (Ref. 17), denoted XP, and Wang and Bovik (Ref. 18), denoted WB] for combinations of the input images presented in Figs. 9(a) to 9(c), for the Laplace, wavelet, undecimated wavelet, and arithmetic fusion methods, and for different combinations of polarity and base images. The symbol neg(b) indicates the polarity-reversed input image (b). A quality measure value closer to 1 indicates a better fused image according to the measure.

Input images   | Base image      | Laplace XP | Laplace WB | Wavelet XP | Wavelet WB | Undecimated XP | Undecimated WB | Arithmetic XP | Arithmetic WB
(a) and (b)    | (a)             | 0.568 | 0.493 | 0.498 | 0.596 | 0.573 | 0.463 | 0.474 | 0.743
(a) and (b)    | (b)             | 0.544 | 0.555 | 0.446 | 0.573 | 0.515 | 0.536 |       |
(a) and (b)    | max[(a),(b)]    | 0.569 | 0.493 | 0.507 | 0.606 | 0.586 | 0.542 |       |
(a) and neg(b) | (a)             | 0.493 | 0.416 | 0.411 | 0.414 | 0.495 | 0.387 | 0.348 | 0.528
(a) and neg(b) | neg(b)          | 0.471 | 0.241 | 0.407 | 0.408 | 0.484 | 0.205 |       |
(a) and neg(b) | max[(a),neg(b)] | 0.470 | 0.240 | 0.406 | 0.409 | 0.534 | 0.344 |       |

Table 3 Estimated objective fusion quality indexes [suggested by Xydeas and Petrovic (Ref. 17), denoted XP, and Wang and Bovik (Ref. 18), denoted WB] for combinations of the input images presented in Figs. 10(a) to 10(c), for the Laplace, wavelet, undecimated wavelet, and arithmetic fusion methods, and for different combinations of polarity and base images. The symbol neg(b) indicates the polarity-reversed input image (b). A quality measure value closer to 1 indicates a better fused image according to the measure.

Input images   | Base image      | Laplace XP | Laplace WB | Wavelet XP | Wavelet WB | Undecimated XP | Undecimated WB | Arithmetic XP | Arithmetic WB
(a) and (b)    | (a)             | 0.609 | 0.490  | 0.456  | 0.433 | 0.382 | 0.476 | 0.398 | 0.7
(a) and (b)    | (b)             | 0.670 | 0.6233 | 0.6108 | 0.616 | 0.517 | 0.595 |       |
(a) and (b)    | max[(a),(b)]    | 0.670 | 0.6235 | 0.6102 | 0.616 | 0.475 | 0.571 |       |
(a) and neg(b) | (a)             | 0.579 | 0.304  | 0.420  | 0.266 | 0.363 | 0.367 | 0.356 | 0.584
(a) and neg(b) | neg(b)          | 0.675 | 0.568  | 0.6102 | 0.567 | 0.515 | 0.508 |       |
(a) and neg(b) | max[(a),neg(b)] | 0.675 | 0.568  | 0.6106 | 0.567 | 0.467 | 0.492 |       |


6 Conclusions

In this paper, we presented two techniques to enhance image fusion results in cases where the input images (obtained from different modalities) have certain properties that may cause artifacts, such as brightness irregularities and destructive superposition. Such artifacts may occur when the input images have a reverse polarity (bright areas in one image correspond to dark ones in the other), or when, in a multiresolution representation, the lowest-frequency band (base image) of one of the input images is less visually informative. The proposed techniques aim to rationally select the polarity of one of the images (the nonvisual image) and the base image in multiresolution fusion. Several multiresolution fusion methods have been tested with our polarity and base image decision algorithms. The experimental results show clearly that the new polarity decision approach helps to improve fusion quality not only by subjective evaluation but also according to existing objective fused image quality measures. On the other hand, the base image selection method, in our data set of real images, did not generate significant changes in the sense of objective measures, but it did affect the illumination (lowest-frequency content) of the fused image. This property may have importance in subjective viewing preferences. Therefore, the base image may also be selected subjectively. For example, when one of the images is from the visual range, the observer may prefer the visual base image as it represents the regular viewing experience. Such a visually subjective effect cannot be quantified by the proposed method or by the fusion quality evaluation methods. Note that since a polarity change makes the input images more similar, a subsequent selection of a base image has less impact on the resulting fused image. The fusion enhancement techniques proposed here are easy to implement and have low computational costs.

The improvements constitute a general tool and, thus, may be applicable for various modality inputs and for different fusion methods. We also examined the proposed method with two medical images (versions of magnetic resonance image types), and the results showed an improved visual appearance of the fused image (mainly due to the change of polarity), resembling the results shown here for other modalities. Future directions can include implementation and adaptation of the proposed method to different types of modalities that are very common in medical imaging. For example, the mask (Fig. 3) might be modified in certain cases, or the selection of the base image may rely on different considerations according to some a priori knowledge. Additionally, the effects of preprocessing (such as image enhancement or restoration) on the operation of the proposed method can be examined.

References

1. A. Toet and J. Walraven, "New false colour mapping for image fusion," Opt. Eng. 35(3), 650–658 (1996).
2. G. K. Matsopoulos, S. Marshall, and J. Brunt, "Multiresolution morphological fusion of MR and CT images of the human brain," IEE Proc. Vis. Image Signal Process. 141(3), 137–142 (1994).
3. M. Aguilar and A. L. Garrett, "Biologically based sensor fusion for medical imaging," Proc. SPIE 4385, 149–158 (2001).
4. Y. Zheng, A. S. Elmaghraby, and H. Frigui, "Three-band MRI image fusion utilizing the wavelet-based method optimized with two quantitative fusion metrics," Proc. SPIE 6144, 61440R (2006).
5. F. Sadjadi, "Comparative image fusion analysis," in Proc. of the 2005 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, pp. 8–15, IEEE Computer Society, San Diego, CA (2005).
6. C. Osichinaka, J. Huamin, and C. Paul, "Investigation of image fusion methods for helicopter day, night and all weather operation," in IEEE Intelligent Vehicles Symp., pp. 1167–1172, IEEE, Xi'an, China (2009).
7. D. Ryan and R. Tinkler, "Night pilotage assessment with applications to image fusion," Proc. SPIE 2465, 50–67 (1995).
8. W. Dou et al., "A general framework for component substitution image fusion: an implementation using the fast image fusion method," Comput. Geosci. 33(2), 219–228 (2007).
9. S. A. F. Rodriguez et al., "An embedded multi-modal system for object localization and tracking," IEEE Intell. Transport. Syst. Mag. 4(4), 42–53 (2012).
10. K. Yamamoto and K. Yamada, "Image processing and fusion to detect navigation obstacles," Proc. SPIE 3374, 337–346 (1998).
11. Y. Ben-Shoshan and Y. Yitzhaky, "Enhancement of image fusion methods," Proc. SPIE 8856, 885604 (2013).
12. V. Tsagaris and V. Anastassopoulos, "Fusion of visible and infrared imagery for night color vision," Displays 26(4–5), 191–196 (2005).
13. H.-Y. Chen, C.-L. Huang, and C. M. Fu, "Hybrid-boost learning for multi-pose face detection and facial expression recognition," Pattern Recognit. 41(3), 1173–1185 (2008).
14. A. Toet et al., "Fusion of visible and thermal imagery improves situational awareness," Displays 18(2), 85–95 (1997).
15. Z. Wang et al., "Image quality assessment: from error visibility to structural similarity," IEEE Trans. Image Process. 13(4), 600–612 (2004).
16. M. Bagher and A. Aghagolzadeh, "A non-reference image fusion metric based on mutual information of image features," Comput. Electr. Eng. 37(5), 744–756 (2011).
17. C. Xydeas and V. Petrovic, "Objective pixel-level image fusion performance measure," IEE Electr. Lett. 36(4), 308–309 (2000).
18. Z. Wang and A. C. Bovik, "A universal image quality index," IEEE Signal Process. Lett. 9(3), 81–84 (2002).
19. G. Corsini et al., "Enhancement of sight effectiveness by dual infrared system: evaluation of image fusion strategies," in Intelligent Computation Technology and Automation (ICTA), Thessaloniki, Greece, pp. 376–381 (2005).
20. C. Pohl and J. L. Van Genderen, "Multisensor image fusion in remote sensing: concepts, methods and applications," Int. J. Remote Sens. 19(5), 823–854 (1998).
21. E. Lallier and M. Farooq, "A real time pixel-level based image fusion via adaptive weight averaging," in Proc. of the Third Int. Conf. on Information Fusion, Paris, France, Vol. 2, pp. WEC3/3–WEC313 (2000).
22. P. Burt and E. Adelson, "The Laplacian pyramid as a compact image code," IEEE Trans. Commun. 31(4), 532–540 (1983).
23. A. Toet, "Hierarchical image fusion," Mach. Vis. Appl. 3(1), 1–11 (1990).
24. S. C. Pei and F. C. Chen, "Hierarchical image representation by mathematical morphology subband decomposition," Pattern Recognit. Lett. 16(2), 183–192 (1995).
25. H. Li, B. S. Manjunath, and S. K. Mitra, "Multisensor image fusion using the wavelet transform," Graph. Models Image Process. 57(3), 235–245 (1995).
26. A. Ellmauthaler, C. L. Pagliari, and E. A. B. da Silva, "Multiscale image fusion using the undecimated wavelet transform with spectral factorization and nonorthogonal filter banks," IEEE Trans. Image Process. 22(3), 1005–1017 (2013).
27. L. J. Chipman, T. M. Orr, and L. N. Graham, "Wavelets and image fusion," Proc. SPIE 2569, 208–219 (1995).
28. M. Irani and P. Anandan, "Robust multi-sensor image alignment," in IEEE Int. Conf. on Computer Vision, pp. 959–966, IEEE Computer Society, Bombay, India (1998).
29. R. G. Driggers et al., "Impact of speckle on laser range-gated shortwave infrared imaging system target identification performance," Opt. Eng. 42(3), 738–746 (2003).
30. J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Comput. Vis. Image Underst. 106(2–3), 162–182 (2007).

Yotam Ben-Shoshan received his BSc in electrical and computer engineering in 2009 and his MSc in electro-optics engineering in 2013, both from Ben-Gurion University of the Negev, Israel. His research is mainly in the field of image processing.

Yitzhak Yitzhaky received his BS, MS, and PhD degrees in electrical and computer engineering from Ben-Gurion University, Israel, in 1993, 1995, and 2000, respectively. From 2000 to 2002, he was a postdoctoral research fellow at the Schepens Eye Research Institute, Harvard Medical School, Boston, Massachusetts. He is currently with the Electro-Optics Unit at Ben-Gurion University. His research interests are mainly in the fields of image processing, vision, and computer vision.
