Principal Component Analysis Based Interconversion Between Infrared and Near-Infrared Spectra for the Study of Thermal-Induced Weak Interaction Changes of Poly(N-Isopropylacrylamide)

LIPING ZHANG, ISAO NODA, and YUQING WU*

State Key Lab for Supramolecular Structure and Material, Jilin University, No. 2699, Qianjin Street, Changchun, 130012 P. R. China (L.Z., Y.W.); The Procter & Gamble Company, 8611 Beckett Road, West Chester, Ohio 45069 (I.N.); and Jilin Business and Technology College, No. 4728, Xi’an Road, Changchun, 130061, P. R. China (L.Z.)

The use of a novel spectral interconversion scheme, principal component analysis (PCA) based spectral prediction, to probe weak molecular interactions of a polymer film is reported. A PCA model is built on a joint data matrix obtained by concatenating two related spectral data matrices (such as infrared (IR) and near-infrared (NIR) spectra) along the variable direction; the loading matrix of the model is then split into two parts to predict the desired spectra. For a better PCA-based prediction, it is suggested that the samples whose spectra are to be predicted should be as similar as possible to those used in the model. Based on the PCA model, the thermal-induced changes in the weak interactions of poly(N-isopropylacrylamide) (PNiPA) film are revealed by the interconversion between selected spectral ranges measured between 40 and 220 °C. The thermal-induced weak interaction changes of PNiPA, expressed as either band shifts or intensity changes in a specific region, have been probed properly. Meanwhile, the robustness of the spectral prediction is compared in detail with that achieved by a partial least squares (PLS2) model, illustrating its advantages in predicting more subtle structural changes such as those of C–H groups.

Index Headings: Multivariate estimation; Principal component analysis; PCA; Mid-infrared spectra; Near-infrared spectra; Prediction of weak interactions; PNiPA film.

INTRODUCTION

Multivariate methods are often necessary for the analysis, calibration, and band assignments of near-infrared (NIR) spectra. Several approaches, including partial least squares (PLS),¹,² principal component regression (PCR),³ and principal component analysis (PCA),⁴⁻⁸ have been tried with varying degrees of success. The benefit of multivariate analysis is the replacement of laborious and costly measurements with much less expensive and more straightforward calibrated instrumental measurements. By using multivariate methods, it is also possible to predict the absorbance of a certain spectral range (rather than conventional metrics such as concentrations) from another spectral range. For example, we previously reported an application of PLS2 regression to such interconversion of spectral data.⁹ The thermal-induced changes in the weak interactions of poly(N-isopropylacrylamide) (PNiPA) film were accurately predicted by the interconversion between mid-infrared (IR) and NIR spectra measured at temperatures between 40 and 220 °C. It was demonstrated that not only the prediction of NIR spectra from IR spectra but also the much more practically important prediction of well-resolved IR spectra from easier-to-measure NIR spectra of PNiPA film can be achieved with the proposed scheme. In this report, we describe the application of a PCA-based model to the interconversion of spectral data. In order to show the robustness of the new method for spectral prediction, the same spectral dataset used in the previous study, i.e., the temperature-dependent IR and NIR spectra of PNiPA film,⁹ was employed again, and the predicted results were compared with those achieved previously with PLS2 regression.

When we initially considered the interconversion of NIR and IR spectra, we defined the problem as a form of Procrustes analysis (matrix-to-matrix regression). The most efficient way to carry out this operation seems to be the use of the PLS2 algorithm, where one set of spectra (e.g., IR) is used as the X variables and the other set of spectra (in our case NIR) is used as the Y variables in a manner similar to concentrations. The underlying principle of the PLS algorithm (both PLS1 and PLS2) is that the Y variables share the same scores with the pertinent components of the X variables. This point is most obvious in the PLS2 algorithm, where the extractions of the scores are carried out iteratively from both X and Y variables.

This concept of common scores can also be explored in a different way using the PCA platform instead of PLS regression. In the case of the NIR–IR interconversion, we start with the data matrices $A_{\text{NIR}}$ and $A_{\text{IR}}$ for sets of NIR and IR spectra. We can combine (concatenate) the two matrices to produce a joint matrix $A_{\text{NIR-IR}}$ (obtained by combining NIR and IR absorbance values into the same row). We then apply a factor analysis such as PCA to this new joint matrix to obtain the following equations:¹⁰

$$A_{\text{NIR-IR}} = \hat{A}_{\text{NIR-IR}} + E_{\text{NIR-IR}} \tag{1}$$

$$\hat{A}_{\text{NIR-IR}} = T_{\text{NIR-IR}} P_{\text{NIR-IR}}^{\mathrm{T}} \tag{2}$$

where $\hat{A}_{\text{NIR-IR}}$ is the portion of the combined data modeled by PCA, $E_{\text{NIR-IR}}$ is the portion of the data not modeled by PCA, and $T_{\text{NIR-IR}}$ and $P_{\text{NIR-IR}}$, respectively, are the PCA scores and loadings of $A_{\text{NIR-IR}}$. Both $T_{\text{NIR-IR}}$ and $P_{\text{NIR-IR}}$ comprise a set of orthogonal vectors.
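As a minimal illustration of Eqs. 1 and 2, the joint decomposition can be sketched in Python with NumPy. This is only a sketch under stated assumptions: the function name `joint_pca`, the synthetic rank-2 data, and the use of a singular value decomposition to obtain the PCA scores and loadings are illustrative choices that do not come from the original work, and no centering or scaling is applied here.

```python
import numpy as np

def joint_pca(a_nir, a_ir, n_components):
    """PCA of the joint matrix [A_NIR | A_IR] via SVD (hypothetical helper)."""
    # Each row holds the NIR and IR absorbances of one sample.
    a_joint = np.hstack([a_nir, a_ir])
    u, s, vt = np.linalg.svd(a_joint, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # T in Eq. 2
    loadings = vt[:n_components].T                     # P in Eq. 2
    residual = a_joint - scores @ loadings.T           # E in Eq. 1
    return scores, loadings, residual

# Synthetic, noise-free stand-in data (assumption): 5 samples, 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(5, 2))
a_nir = latent @ rng.normal(size=(2, 30))   # 30 hypothetical NIR variables
a_ir = latent @ rng.normal(size=(2, 40))    # 40 hypothetical IR variables

T, P, E = joint_pca(a_nir, a_ir, n_components=2)
P_nir, P_ir = P[:30], P[30:]                # split loadings by spectral range
```

In this sketch the columns of `P` are orthonormal and the columns of `T` are mutually orthogonal, whereas the split blocks `P_nir` and `P_ir` taken on their own are not orthonormal.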

We then split each column of the $P_{\text{NIR-IR}}$ loadings into two portions, $P_{\text{NIR}}$ and $P_{\text{IR}}$, based on the selected spectral ranges for NIR and IR spectra. It is important to remember that the vectors in $P_{\text{NIR}}$ and $P_{\text{IR}}$ are no longer orthogonal to each other. They are different from the loadings obtained individually by applying the PCA to $A_{\text{NIR}}$ and $A_{\text{IR}}$. Likewise, $T_{\text{NIR-IR}}$ is different from the scores obtained individually from the PCA of $A_{\text{NIR}}$ and $A_{\text{IR}}$, since it is obtained under the condition of maximizing the variance of the entire combined data $A_{\text{NIR-IR}}$.

Received 20 January 2009; accepted 30 March 2009.
* Author to whom correspondence should be sent. E-mail: [email protected].

694 Volume 63, Number 6, 2009 APPLIED SPECTROSCOPY 0003-7028/09/6306-0694$2.00/0 © 2009 Society for Applied Spectroscopy

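One can quickly see that it should be possible to reconstruct the original spectral data sets $A_{\text{NIR}}$ and $A_{\text{IR}}$ from the split loadings $P_{\text{NIR}}$ and $P_{\text{IR}}$ and the common scores $T_{\text{NIR-IR}}$ by the following relationships:

$$A_{\text{NIR}} = T_{\text{NIR-IR}} P_{\text{NIR}}^{\mathrm{T}} \tag{3}$$

$$A_{\text{IR}} = T_{\text{NIR-IR}} P_{\text{IR}}^{\mathrm{T}} \tag{4}$$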

In addition, one can further conclude that

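$$A_{\text{NIR}} P_{\text{NIR}} \left(P_{\text{NIR}}^{\mathrm{T}} P_{\text{NIR}}\right)^{-1} = T_{\text{NIR-IR}} = A_{\text{IR}} P_{\text{IR}} \left(P_{\text{IR}}^{\mathrm{T}} P_{\text{IR}}\right)^{-1} \tag{5}$$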
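The step leading to Eq. 5 can be made explicit. The short derivation below is a sketch that assumes the matrix $P_{\text{NIR}}^{\mathrm{T}} P_{\text{NIR}}$ is invertible, i.e., that the retained split loadings are linearly independent; this condition is not stated in the text. Starting from Eq. 3 and multiplying from the right by $P_{\text{NIR}}$:

$$
\begin{aligned}
A_{\text{NIR}} P_{\text{NIR}} &= T_{\text{NIR-IR}} \left(P_{\text{NIR}}^{\mathrm{T}} P_{\text{NIR}}\right) \\
A_{\text{NIR}} P_{\text{NIR}} \left(P_{\text{NIR}}^{\mathrm{T}} P_{\text{NIR}}\right)^{-1} &= T_{\text{NIR-IR}}
\end{aligned}
$$

The same argument applied to Eq. 4 with $P_{\text{IR}}$ gives the right-hand side of Eq. 5. Because the split loadings are not orthonormal, the factor $\left(P^{\mathrm{T}} P\right)^{-1}$ does not reduce to the identity matrix and must be kept.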

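Finally, the interconversion validation can be achieved by the following equations:

$$A_{\text{NIR}} = A_{\text{IR}} P_{\text{IR}} \left(P_{\text{IR}}^{\mathrm{T}} P_{\text{IR}}\right)^{-1} P_{\text{NIR}}^{\mathrm{T}} + E_{\text{NIR}} \tag{6}$$

$$A_{\text{IR}} = A_{\text{NIR}} P_{\text{NIR}} \left(P_{\text{NIR}}^{\mathrm{T}} P_{\text{NIR}}\right)^{-1} P_{\text{IR}}^{\mathrm{T}} + E_{\text{IR}} \tag{7}$$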

Once we establish the above relationship, it is possible to predict IR or NIR spectra from each other based on the PCA modeling. The advantage of this proposed PCA-based technique compared to that based on PLS2 regression will be discussed.
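The interconversion of Eqs. 6 and 7 (without the residual terms) might be sketched as follows. The helper name `interconvert` and the synthetic data are assumptions made for illustration; a least-squares solve is used in place of the explicit inverse, which gives the same scores when the split loadings have full column rank.

```python
import numpy as np

def interconvert(a_observed, p_observed, p_unknown):
    """Predict one spectral block from the other (Eqs. 6 and 7, residual omitted)."""
    # Solving P_obs x = A_obs^T in the least-squares sense yields
    # x^T = A_obs P_obs (P_obs^T P_obs)^-1, i.e., the common scores of Eq. 5.
    scores_t, *_ = np.linalg.lstsq(p_observed, a_observed.T, rcond=None)
    return scores_t.T @ p_unknown.T

# Hypothetical noise-free data: 5 samples, 30 NIR and 40 IR variables, 2 factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(5, 2))
a_nir = latent @ rng.normal(size=(2, 30))
a_ir = latent @ rng.normal(size=(2, 40))

# Joint PCA (Eqs. 1 and 2) by SVD, keeping two components.
_, _, vt = np.linalg.svd(np.hstack([a_nir, a_ir]), full_matrices=False)
loadings = vt[:2].T
p_nir, p_ir = loadings[:30], loadings[30:]

nir_from_ir = interconvert(a_ir, p_ir, p_nir)   # Eq. 6
ir_from_nir = interconvert(a_nir, p_nir, p_ir)  # Eq. 7
```

For these idealized data both predictions reproduce the original blocks to numerical precision; with measured spectra the differences correspond to the residuals in Eqs. 6 and 7.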

To carry out the PCA-based spectral interconversion technique, we need to apply the bilinear decomposition (i.e., PCA) only once to obtain the joint scores and loadings, $T_{\text{NIR-IR}}$ and $P_{\text{NIR-IR}}$. For any given single spectrum $a_{\text{observed}}$ spanning a portion of the spectral region covered by the joint data matrix $A_{\text{NIR-IR}}$, one can estimate the rest of the spectrum, $a_{\text{unknown}}$. By simply splitting the joint loadings $P_{\text{NIR-IR}}$ into $P_{\text{observed}}$ and $P_{\text{unknown}}$ according to the spectral region, the unknown spectrum can be estimated by the following equation:

$$a_{\text{unknown}} = a_{\text{observed}} P_{\text{observed}} \left(P_{\text{observed}}^{\mathrm{T}} P_{\text{observed}}\right)^{-1} P_{\text{unknown}}^{\mathrm{T}} \tag{8}$$

It is important to point out that the observed spectrum does not have to be strictly limited to a fixed range in an NIR or IR spectrum. It can be a mixture of IR or NIR spectra, or even an arbitrary spectrum with a portion missing or too noisy to be useful. The technique thus can be used as a convenient way to interpolate and repair damaged data by replacing noisy segments, contaminations, and outliers.
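The use of Eq. 8 to repair a damaged segment of a single spectrum can be sketched as follows. The function `estimate_missing`, the boolean mask marking the usable variables, and the synthetic data are hypothetical; the sketch assumes that the new spectrum is described by the same factors as the spectra used to build the model.

```python
import numpy as np

def estimate_missing(spectrum, observed, loadings):
    """Fill the unobserved part of one spectrum by Eq. 8 (hypothetical helper)."""
    p_obs = loadings[observed]      # P_observed: rows of the usable variables
    p_unk = loadings[~observed]     # P_unknown: rows of the variables to estimate
    # Least-squares scores: a_obs P_obs (P_obs^T P_obs)^-1, written as a column vector.
    scores, *_ = np.linalg.lstsq(p_obs, spectrum[observed], rcond=None)
    repaired = spectrum.astype(float)   # astype returns a copy; input is untouched
    repaired[~observed] = p_unk @ scores
    return repaired

# Hypothetical model: 6 training spectra with 50 variables and 2 factors.
rng = np.random.default_rng(2)
basis = rng.normal(size=(2, 50))
training = rng.normal(size=(6, 2)) @ basis
_, _, vt = np.linalg.svd(training, full_matrices=False)
loadings = vt[:2].T

# A new spectrum from the same factors, with variables 20-29 corrupted.
clean = np.array([0.7, -1.3]) @ basis
damaged = clean.copy()
damaged[20:30] = 99.0
observed = np.ones(50, dtype=bool)
observed[20:30] = False

repaired = estimate_missing(damaged, observed, loadings)
```

The mask does not need to be contiguous, which reflects the remark that the observed part can be any usable portion of the joint spectral range.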

EXPERIMENTAL

Sample Preparation. Poly(N-isopropylacrylamide) (PNiPA) was obtained by the free radical polymerization of monomers in tert-amyl alcohol.¹¹ PNiPA was dissolved in chloroform to form a 20.0 wt% solution before being placed onto a piece of a microscope slide/KBr window and then dried under vacuum at room temperature for 24 h. The resulting film was used in the subsequent NIR and IR interconversion experiments.

Infrared and Near-Infrared Spectral Measurements. The NIR and IR spectra were recorded by using a Nicolet Nexus 470 FT-NIR/IR spectrometer with a liquid nitrogen cooled MCT detector or DTGS detector. The co-addition of 128 and 64 scans was performed for both the NIR and IR spectra with a spectral resolution of 4 cm⁻¹. Temperature-dependent spectra were collected between 40 and 220 °C with an increment of 20 ± 0.1 °C.¹¹

Data Analysis and Principal Component Analysis Modeling. In order to correct the baseline shift of the NIR spectra, full multiplicative scatter correction (MSC) was employed directly on the spectral data. In addition, the data were pretreated by mean-centering and auto-scaling (the latter is done to eliminate the biased effect of signal size between the IR and NIR regions) before PCA. The PCA modeling of multiple IR and NIR absorbance spectra was performed with Unscrambler 7.01, using cross-validation for model calibration and prediction. Additional spectra, which were not used in building the PCA model, were employed for further prediction.
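The pretreatment steps named above could be sketched as follows. This is not the implementation of the software used in the study: the functions `msc` and `autoscale` are hypothetical, MSC is taken in its common textbook form (each spectrum is regressed on the mean spectrum and corrected by the fitted offset and slope), and the sample standard deviation is assumed for auto-scaling.

```python
import numpy as np

def msc(spectra):
    """Scatter correction against the mean spectrum (textbook form, assumed)."""
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row = intercept + slope * reference, then remove both effects.
        slope, intercept = np.polyfit(reference, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

def autoscale(block):
    """Mean-center each variable and scale it to unit standard deviation."""
    mean = block.mean(axis=0)
    std = block.std(axis=0, ddof=1)   # sample standard deviation (assumption)
    return (block - mean) / std, mean, std

# Hypothetical spectra that differ only by an offset and a multiplicative factor.
base = np.linspace(0.2, 1.0, 25) ** 2
factors = np.array([1.0, 1.2, 0.8])[:, None]
offsets = np.array([0.0, 0.05, -0.03])[:, None]
raw = factors * base + offsets
corrected = msc(raw)

# Auto-scaling demonstrated on an unrelated random block (5 samples, 8 variables).
rng = np.random.default_rng(3)
block = rng.normal(size=(5, 8))
scaled, mean, std = autoscale(block)
```

Keeping `mean` and `std` from the calibration samples allows the same transformation to be applied to spectra that were not used in building the model.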

RESULTS AND DISCUSSION

Temperature-Dependent Infrared and Near-Infrared Spectra of the PNiPA Film. The original IR and NIR spectra of the PNiPA film with increasing temperature were shown in Figs. 1, 3, and 4 in the previous paper.⁹ A decrease in the intensity of a strong band at approximately 1650 cm⁻¹ (assigned to the intramolecular hydrogen-bonded amide group of PNiPA)¹²⁻¹⁵ accompanied a gradual increase of free NH groups at about 3434 cm⁻¹ in the IR range, which indicated that the intramolecular hydrogen bonds of the amide groups were weakened by heating. Similar changes with temperature could also be observed for the bands at approximately 6500 cm⁻¹ assigned to the first overtone of hydrogen-bonded NH groups, bands at 4500–5000 cm⁻¹ related to the combination modes of amide groups (both decreased with temperature), and for the band at approximately 6736 cm⁻¹ due to the first overtone of free NH groups (increased with temperature) in NIR spectra. It was substantiated that temperature-dependent changes in the band shifts and intensities of PNiPA certainly occurred both in the IR and NIR regions, especially for the NH groups (see Figs. 2a and 2b in Ref. 9).

Building and Validation of Principal Component Analysis Model. As discussed above, to perform the spectral prediction based on PCA, it is first necessary to concatenate two spectral data matrices, such as one from IR spectra and another from NIR spectra, along the variable (row) direction. PCA is then applied to the new combined matrix to obtain the score and loading matrices $T_{\text{NIR-IR}}$ and $P_{\text{NIR-IR}}$. The joint loading matrix $P_{\text{NIR-IR}}$ is then split into two parts, $P_{\text{NIR}}$ and $P_{\text{IR}}$. Finally, the predicted IR or NIR spectra are calculated according to Eq. 8. To compare the spectral prediction efficiency of the PCA method with that of PLS2, we select the same spectral data in the present study as those used for PLS2 estimation in our previous work.⁹ That is, the first model (named Model I) for interconversion is built on the two spectral ranges 3030–3460 cm⁻¹ (224 variables) and 6450–6825 cm⁻¹ (195 variables), with samples composed of spectra measured at 40, 80, 120, 160, and 200 °C, respectively. We chose these two spectral regions in both papers because of the significant changes of the ν(NH) band there.

FIG. 1. Validation result of Model I in (a) the NIR and (b) IR range; $A_i$ and $A_i^*$ denote the predicted and measured absorbance values; distinct symbols denote spectra measured at 40, 80, 120, 160, and 200 °C.

Validation is conducted first to confirm the predictive ability of the model built by PCA and to determine how well the model will perform on the data. When the PCA model is applied to predict the same samples used to construct the model itself, the discrepancy between the estimated spectra and the actual data can be used as a measure of the robustness of the model. Figure 1 shows the validation result of Model I. It is obvious that the relative deviations for both IR and NIR spectra are rather low. Of note is that the deviations for the IR spectra (with a maximum of approximately 0.25% and RMSE of 3.01 × 10⁻⁹) are somewhat larger than those for the NIR (with a maximum of only approximately 0.06% and RMSE of 1.50 × 10⁻¹⁰). In other words, the reconstructed NIR spectra based on Model I are much closer to the original ones than the reconstructed IR spectra. The reason for such results is unclear, but we speculate that the model may be predominantly influenced by the smaller intensity variations in the NIR region. It is clear from Fig. 4 in Ref. 9 that the bands of free NH (6736 cm⁻¹) are much stronger than those of the associated NH (3450 cm⁻¹), and the thermal-induced spectral changes are mainly dominated by the increase of free NH bands. The good linearity of the intensity variation results in less deviation of the predicted NIR spectra. Moreover, it is noted that when the model is applied to genuine prediction (i.e., samples to be predicted are not included during the construction of the model), the discrepancy between the estimated and measured spectra is also much larger for the IR region than for the NIR region.
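The two figures of merit quoted here can be written compactly. The formulas are not given explicitly in the text, so the definitions below are assumptions: the relative deviation is taken as $(A_i - A_i^*)/A_i^*$ expressed in percent, with $A_i$ the predicted and $A_i^*$ the measured absorbance, and the RMSE as the root of the mean squared difference over all variables. The function names and the numbers are purely illustrative and are not data from the study.

```python
import numpy as np

def relative_deviation_percent(predicted, measured):
    """Pointwise relative deviation, (A_i - A_i*) / A_i* in percent (assumed form)."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return 100.0 * (predicted - measured) / measured

def rmse(predicted, measured):
    """Root mean squared error between predicted and measured absorbances."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))

# Illustrative values only.
measured = np.array([1.0, 2.0, 4.0])
predicted = np.array([1.1, 1.9, 4.0])
rel = relative_deviation_percent(predicted, measured)
err = rmse(predicted, measured)
```

Note that a relative deviation defined this way becomes large wherever the measured absorbance is close to zero, so it is best read together with the RMSE.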

Interconversion of Near-Infrared and Infrared Spectra by Using the Principal Component Analysis Model. N–H Groups. Model I is applied to perform spectral prediction between the IR (3030–3460 cm⁻¹) and NIR (6450–6825 cm⁻¹) regions concerning N–H groups. Samples to be predicted are those at 60, 100, 140, 180, and 220 °C that are not included during the construction of the model. The predicted spectra are calculated according to Eq. 8. It should be mentioned that $A_{\text{observed}}$ denotes data in the range of 6450–6825 cm⁻¹, while $A_{\text{unknown}}$ denotes data in the range of 3030–3460 cm⁻¹ when predicting spectra in this IR range, and vice versa.

FIG. 2. (a) The predicted NIR spectra at 60, 100, 140, 180, and 220 °C based on Model I. Predicted results are shown as dashed lines (for 60, 100, 140, and 180 °C) and dotted lines (for 220 °C), and measured spectra are shown as solid lines; (b) the relative deviations, with $A_i$ and $A_i^*$ denoting the predicted and measured absorbance values, respectively; distinct symbols indicate samples at 60, 100, 140, 180, and 220 °C.

FIG. 3. (a) The predicted NIR spectra at 180 °C based on Models I and II; predicted results are shown as dotted lines (Model I) or dashed lines (Model II), and measured spectra are shown as solid lines; (b) the relative deviations, with $A_i$ and $A_i^*$ denoting the predicted and measured absorbance values, respectively; distinct symbols indicate Models I and II.

FIG. 4. (a) The predicted IR spectra at 60, 100, 140, 180, and 220 °C based on Model I. Predicted spectra are shown as dashed lines and measured ones as solid lines; (b) the relative deviations from PCA prediction, with $A_i$ and $A_i^*$ denoting the predicted and measured absorbance values, respectively; (c) the relative deviations from PLS2 prediction; distinct symbols indicate samples at 60, 100, 140, 180, and 220 °C.

Plotting the predicted data versus the measured reference values is a useful way to illustrate the validity of PCA prediction.⁹ Figure 2a shows the PCA-predicted NIR spectra and the corresponding experimental spectra measured at 60, 100, 140, 180, and 220 °C, respectively. It is clear that the two corresponding spectra at each temperature are very close to each other, with an absorption band at approximately 6730 cm⁻¹. Figure 2b shows the relative deviation of the prediction from the measured absorbance. It is of note that the higher relative deviation values are mainly found in samples at 180 and 220 °C, which indicates that Model I is more accurate for predicting samples at low temperatures. The apparent effect of temperature may originate from the larger changes in the overtone of free NH (6736 cm⁻¹) as compared to that of the associated NH (6500 cm⁻¹), and from the contrasting trend of the fundamental NH band (3450 cm⁻¹), as free NH groups can be seen better in the NIR spectra than in the IR range.¹⁶ Such a deviation of the spectral changes from linearity results in samples of low temperatures (up to 140 °C) and high temperatures (above 140 °C) being distributed in distinct areas in the PCA score plots, and therefore it is not proper to include all five samples in one PCA model. In addition, the model is dominated by samples of low temperatures (which make up three-fifths of the total) and thus reveals their information more strongly.

In order to further investigate the influence of model composition on its predicting ability and verify the above conclusion (especially the second one), we built another model (named Model II) spanning the same spectral range as Model I. It is composed of four samples measured at relatively high temperatures of 140, 160, 200, and 220 °C, which are all distributed in the same area in the PCA score plots. This model is built specifically to predict NIR spectra at 180 °C.

Figure 3a shows the spectra predicted by Models I and II, respectively, and Fig. 3b shows the relative deviations of the prediction. It is clear that the spectrum predicted by Model II is closer to the measured one, and consequently the relative deviations are much smaller. The result indicates that Model II is more robust than Model I when predicting samples of high temperature such as 180 °C. Not surprisingly, it is preferable to select more samples of high temperatures to build a model when predicting higher temperature samples, and vice versa.

Similar to the case for predicting NIR spectra, Model I is also applied to predict IR spectra at 60, 100, 140, 180, and 220 °C according to Eq. 8. Figure 4a shows the predicted results. As in the NIR region, it is apparent that the predicted IR spectra are also in close agreement with the measured ones, except for the sample at 220 °C. Figures 4b and 4c compare the relative deviations of the predicted results based on PCA and PLS2, respectively. The maximum relative deviation value is approximately 14.7% and the root mean squared error (RMSE) is approximately 1.9 × 10⁻³ for the former case, while the corresponding values for the latter case are approximately 3.9% and 1.5 × 10⁻⁴, respectively. For the current spectral range, the prediction accuracy of the PCA model is not as good as that achieved by PLS2. However, the relative deviation values of samples at low temperatures are not so large (with the largest relative deviation being 7.8%), indicating that such results are still acceptable.

By comparing the RMSE value of the predicted IR spectra (1.9 × 10⁻³ as mentioned above) with that of the NIR spectra (which was approximately 4.9 × 10⁻⁵), it is obvious that spectral prediction based on Model I is more accurate for the NIR range than for the IR range. This conclusion is in accordance with the validation results of the model as discussed above.

C–H Groups. In order to further investigate the predicting ability of the PCA method and to compare it with that of the PLS2 model, several other models concerning C–H groups, named Models III (built by joining spectra at 1300–1480 cm⁻¹ and 2800–3018 cm⁻¹), IV (built by joining spectra at 1300–1480 cm⁻¹ and 5600–6030 cm⁻¹), and V (built by joining spectra at 1300–1480 cm⁻¹ and 3020–3600 cm⁻¹), are also constructed, following those used in the PLS2 prediction.

Figure 5 shows the validation results of Model III with predicted spectra in the range of 1300–1480 cm⁻¹ (Fig. 5a) and 2800–3020 cm⁻¹ (Fig. 5c) and the corresponding relative deviations (Figs. 5b and 5d, respectively). It is clear that the predicted spectra are very close to the measured ones in the two IR ranges. By comparing Figs. 5b and 5d, it is found that the relative deviations for the two IR regions are basically equivalent, especially around the main peaks. However, the corresponding RMSE value of 5.04 × 10⁻⁹ for the range of 1300–1480 cm⁻¹ is somewhat smaller than that (1.25 × 10⁻⁸) for the range of 2800–3020 cm⁻¹, indicating that the model should be relatively more efficient when predicting spectra in the range of 1300–1480 cm⁻¹.

FIG. 5. The validation results of Model III with spectra at 40, 80, 120, 160, and 200 °C in (a) the range of 1300–1480 cm⁻¹ and (c) the range of 2800–3020 cm⁻¹; predicted spectra are shown as dashed lines and measured ones as solid lines; (b) and (d) are the relative deviations in the two IR ranges, with $A_i$ and $A_i^*$ denoting the predicted and measured absorbance values, respectively; distinct symbols indicate samples at 40, 80, 120, 160, and 200 °C.

Predicting Ability of the Principal Component Analysis Model in Comparison with that of Partial Least Squares-2. To further estimate the efficiency of the spectral interconversion method based on PCA, further comparison is made with the method based on PLS2. First, the comparison is focused on the prediction of spectra concerning the N–H groups. As demonstrated above in Figs. 4b and 4c, the relative deviations of prediction by the PCA method (with an RMSE of approximately 1.9 × 10⁻³) are higher than those of PLS2 regression (with an RMSE of approximately 1.5 × 10⁻⁴) for the IR region (3030–3460 cm⁻¹). The same conclusion is reached for the NIR region (6450–6825 cm⁻¹), where the RMSE values are approximately 4.9 × 10⁻⁵ for the PCA model and approximately 5.8 × 10⁻⁷ for the PLS2 model. This suggests that in this respect the PCA model is not as efficient as the PLS2 model when performing the interconversion prediction of NIR and IR spectra concerning N–H groups. This conclusion is also confirmed by the thermal-induced changes of band intensity and shifts concerning N–H groups in both the IR and NIR regions (as shown in Fig. 6), where the trend predicted by PLS2 is closer to the measured one than that achieved by PCA.

Second, the comparison of the predicting ability of the two methods is extended to other infrared ranges (e.g., the C–H vibration regions mentioned above). Figures 7a, 7b, and 7c show the predicted IR spectra in the range of 1300–1480 cm⁻¹ for samples at 60, 100, 140, 180, and 220 °C based on Model III, Model IV, and Model V, respectively. It is apparent that the spectra predicted by these different models are all similar to each other and also very close to the measured ones. The relative deviations of the three PCA models (shown in Fig. 7d) are very close to those achieved by using PLS2-based models (shown in Fig. 7e), indicating that their predicting ability is comparable for the estimation concerning C–H groups. The conclusion is also supported by the RMSE values of PCA prediction: 7.5 × 10⁻⁶ for Model III, 1.2 × 10⁻⁵ for Model IV, and 5.6 × 10⁻⁵ for Model V. When comparing the RMSE values of the PCA results with the corresponding ones of PLS2 (1.2 × 10⁻⁵, 4.0 × 10⁻⁵, and 1.4 × 10⁻⁵, respectively), it can be concluded that the two methods are equally robust on the whole when predicting spectral regions concerning C–H groups. Moreover, it is of note that the RMSE values of Models III and IV (built between C–H vibration regions) are even somewhat lower for the PCA model than those for the PLS2 model, while the opposite holds for Model V (built between C–H and N–H vibration regions). This suggests that the PCA method is slightly more accurate than PLS2 regression when predicting more subtle thermal-induced vibration changes of C–H groups. The good prediction of the behavior of C–H groups in contrast to the N–H groups is probably related to the weaker changes of the C–H groups as compared to those of N–H groups, while PLS2 is more efficient when predicting relatively significant changes concerning N–H groups in PNiPA film. The latter suggestion is also supported by Figs. 4b and 4c.

FIG. 6. Temperature-dependent (a) band intensity changes and shifts of hydrogen-bonded NH (ν(NH)b) at 3309 cm⁻¹; (b) band intensity changes of the first overtone of hydrogen-bonded NH (2ν(NH)b) at 6500 cm⁻¹ and free NH (2ν(NH)f) at 6736 cm⁻¹ in the NIR spectral region; symbols with dotted lines represent the spectra predicted by PCA, symbols with dashed lines represent the spectra predicted by PLS2, and symbols with solid lines represent the corresponding measured ones.

FIG. 7. The predicted IR spectra at 60, 100, 140, 180, and 220 °C based on (a) Model III, (b) Model IV, and (c) Model V, respectively; the relative deviations from (d) PCA and (e) PLS2 prediction, with $A_i$ and $A_i^*$ denoting the predicted and measured absorbance values, respectively; predicted spectra are shown as dashed lines and measured ones as solid lines; distinct symbols indicate samples at 60, 100, 140, 180, and 220 °C.

In conclusion, subtle differences exist between the two techniques of spectral interconversion based on PLS2 regression and PCA modeling. The cause may be that PLS2 regression modeling separates the data into two camps (predicting versus predicted), while PCA utilizes the entire data set (from IR and NIR) to build a global model. Spectral prediction is then carried out to maximize the consistency with this model. Thus, the philosophical positioning might be slightly different between the PCA and PLS2 approaches.

CONCLUSION

The present study demonstrates a novel application of PCA modeling for the spectral interconversion between arbitrarily selected spectra. This scheme is used to probe the thermal-induced weak interaction changes in PNiPA film. Meanwhile, the method is also compared with that based on PLS2 regression. Several conclusions can be drawn: (1) the new predicting method based on PCA can also be applied to estimate the thermal-induced weak-interaction changes of PNiPA film by the interconversion between IR and NIR spectra; (2) the method is good at predicting the behavior of C–H groups in contrast to N–H groups, which is probably related to the weaker changes of the C–H groups as compared to those of N–H groups; (3) for a better PCA-based prediction, it is suggested that the samples whose spectra are to be predicted should be as similar as possible to those used in the model; and (4) the PCA model is comparable to the PLS2 model with respect to the prediction of more subtle structural changes such as those of C–H groups, while the latter is preferred for estimating spectra with relatively significant changes such as those of N–H groups, at least for the present case of PNiPA film. Further effort to illustrate the potential of spectral estimation based on PCA modeling will be made by using spectral datasets concerning more complicated polymers or proteins in solution.

ACKNOWLEDGMENTS

The present study is supported by the Project of NSFC (No. 20473028, 20773051), the Major State Basic Research Development Program (2007CB808006), the Programs for New Century Excellent Talents in University (NCET), and the 111 project (B06009), which are gratefully acknowledged.

1. G. M. Escandar, P. C. Damiani, H. C. Goicoechea, and A. C. Olivieri, Microchem. J. 82, 29 (2006).
2. S. Sasic and Y. Ozaki, Anal. Chem. 73, 64 (2001).
3. A. Urbas, M. W. Manning, A. Daugherty, L. A. Cassis, and R. A. Lodder, Anal. Chem. 75, 3650 (2003).
4. N. Navas, J. Romero-Pastor, E. Manzano, and C. Cardell, Anal. Chim. Acta 630, 141 (2008).
5. E. A. M. Brouwer, E. S. Kooij, H. Wormeester, M. A. Hempenius, and B. Poelsema, J. Phys. Chem. B 108, 7748 (2004).
6. T. Hasegawa, Anal. Chem. 71, 3085 (1999).
7. K. M. Dokken and L. C. Davis, J. Agric. Food Chem. 55, 10517 (2007).
8. Y. M. Jung, Bull. Korean Chem. Soc. 24, 1345 (2003).
9. L. P. Zhang, Y. Q. Wu, and I. Noda, Appl. Spectrosc. 63, 112 (2009).
10. E. R. Malinowski, Factor Analysis in Chemistry (John Wiley and Sons, New York, 1991), 2nd ed.
11. B. Sun, Y. Lin, and P. Wu, Appl. Spectrosc. 61, 765 (2007).
12. P. Wu and H. W. Siesler, J. Near Infrared Spectrosc. 7, 65 (1999).
13. Y. Maeda, T. Higuchi, and I. Ikeda, Langmuir 16, 7503 (2000).
14. J. Wang, M. G. Sowa, M. K. Ahmed, and H. H. Mantsch, J. Phys. Chem. 98, 4748 (1994).
15. Y. Q. Wu, F. Meersman, and Y. Ozaki, Macromolecules 39, 1182 (2006).
16. T. Di Paolo, C. Bourderon, and C. Sandorfy, Can. J. Chem. 50, 3161 (1972).
