This article was downloaded by: [University of York]
On: 21 November 2014, At: 01:25
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Cybernetics and Systems: An International Journal
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ucbs20

ON SOME COMPUTATIONAL METHODS IN FACTOR ANALYSIS
FILOMENA DE SANTIS, University of Salerno
Published online: 27 Mar 2007.

To cite this article: FILOMENA DE SANTIS (1984) ON SOME COMPUTATIONAL METHODS IN FACTOR ANALYSIS, Cybernetics and Systems: An International Journal, 15:3-4, 247-257, DOI: 10.1080/01969728408927747

To link to this article: http://dx.doi.org/10.1080/01969728408927747
Cybernetics and Systems: An International Journal, 15:247-257, 1984
ON SOME COMPUTATIONAL METHODS IN FACTOR ANALYSIS
FILOMENA DE SANTIS
University of Salerno
The standard method for the estimate of factor scores uses least square approximation techniques. Such a method, while ensuring the minimum square error on the considered approximation, leads to a factor covariance matrix which is clearly not the same as the reduced one. Continuing previous studies in which the computation of factor scores was considered with regard to the best reproduction of correlations among variables, this paper analyzes a procedure for computing factor scores in such a way that the reduced covariance matrix is exactly preserved, as well as the linearity of the approximation function.
INTRODUCTION
Although many research studies involve multivariate observations, sometimes little is known about the interrelations between variables, between cases, or between variables and cases, so that it is often difficult to establish if and how data are structured.

When the objective of multivariate data analysis is to find a relationship that just describes the interrelations between variables, it has often been experienced that factor analysis is well suited to the scope (a rather thorough presentation of related techniques may be found in [7]).
The numerical implementation of a factor analysis algorithm consists of four main steps: first, the correlation matrix is computed; second, the factor loadings are estimated; third, the factors are rotated; and fourth, the factor scores are computed.
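The four steps can be sketched numerically. The following is a minimal NumPy illustration, not the paper's own code: the function name, the principal-factor style of loading extraction, and the regression-method scores are our assumptions, and the rotation step is left as the identity.

```python
import numpy as np

def factor_analysis_sketch(X, m):
    """Illustrative four-step pipeline: correlations, loadings,
    (trivial) rotation, and least-squares factor scores."""
    # Step 1: correlation matrix of the n observed variables.
    R = np.corrcoef(X, rowvar=False)
    # Step 2: loadings from the m leading eigenpairs of R
    # (principal-factor style: A = Q * sqrt(eigenvalues)).
    w, Q = np.linalg.eigh(R)                  # ascending order
    w, Q = w[::-1][:m], Q[:, ::-1][:, :m]
    A = Q * np.sqrt(w)
    # Step 3: rotation -- identity here; a varimax step would replace it.
    A_rot = A
    # Step 4: factor scores by least squares (regression method),
    # F = Z R^{-1} A, with Z the standardized data.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    F = Z @ np.linalg.solve(R, A_rot)
    return R, A_rot, F

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
R, A, F = factor_analysis_sketch(X, m=2)
```

The sketch only fixes the shape of the computation; each step admits the many alternative methods discussed in [7].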
This paper deals with the fourth step: the factor solution, indeed, is not uniquely determined; hence the necessity of an estimate for it. Of course,
Copyright © 1984 by Hemisphere Publishing Corporation
the best approximation is in the least square sense, although it does not exactly reproduce the correlations to be approximated. However, it is possible to provide a different estimate (see [4]) that, while still remaining a linear combination of the original variables, also reproduces correlations exactly.

The study of such an estimate is carried out along lines similar to those outlined in [7]. Here the aim is to investigate the behavior of the solution and to determine lower and upper bounds to the error produced by it as compared to the minimum error associated with the least square approximation. It will be shown that the error thus introduced is at most twice the error in the least square case, a result that seems to have passed unnoticed; moreover, such an upper bound corresponds to situations in which factor analysis is not actually meaningful or applicable at all.
APPROXIMATION PROCEDURES
In the sequel we shall make reference to the notation of Della Riccia [2] and Harman [6], and set:

X = AF + DU    (1)

and

R = AA' + D^2    (2)

In terms of the new variables X' = (X'_1, X'_2, ..., X'_n), linearly dependent on F = [F_1, F_2, ..., F_m], one has:

X' = AF    (3)

so that Eq. (2) becomes:

R' = AA'    (4)

where R', the covariance matrix of the variables X', has clearly rank m < n and differs from R only in the diagonal terms, namely in the communalities.
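A small numerical sketch makes the relation between R and R' concrete; the loading values below are hypothetical, chosen by us purely for illustration. The two matrices share every off-diagonal entry, R' carries the communalities on its diagonal, and its rank equals the number of factors m:

```python
import numpy as np

# Hypothetical loadings for n = 4 variables on m = 2 factors
# (illustrative values of our own choosing).
A = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])
n, m = A.shape
R_prime = A @ A.T                       # reduced matrix, Eq. (4): rank m
communalities = np.diag(R_prime)
D2 = np.diag(1.0 - communalities)       # unique variances D^2
R = R_prime + D2                        # full correlation matrix, Eq. (2)

assert np.linalg.matrix_rank(R_prime) == m
off = ~np.eye(n, dtype=bool)            # R and R' agree off the diagonal
assert np.allclose(R[off], R_prime[off])
assert np.allclose(np.diag(R), 1.0)
```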
Let now X' in Eq. (3) be a function to be approximated in terms of X by means of the linear combination

X' ≅ LX    (5)
where L is an unknown matrix to be determined in such a way that the square error e_L^2 is minimum. Then it can be proved that

e_L^2 = ||X' - LX||^2

is minimized by:

L = R'R^{-1}

so that

X' ≅ R'R^{-1}X

and

e_L^2 = ||X' - R'R^{-1}X||^2 = tr(R' - R'R^{-1}R')    (6)

Correlations induced by Eq. (5) are:

R_XL = R'R^{-1}R'    (6bis)

which are clearly not the same as R'. Factors are finally computed from Eq. (3) as:

F = Λ^{-1}A'X'

Let us now consider the linear estimate:

X' ≅ R'^{1/2}VR^{-1/2}X    (7)

where V is an unknown matrix [4]. Correlations induced by Eq. (7) are:

R_XV = R'^{1/2}VV'R'^{1/2}

which are equal to R' if and only if V is a unitary matrix.
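The contrast between the two induced correlation matrices can be checked numerically. The sketch below assumes the estimate of Eq. (7) has the form R'^{1/2}VR^{-1/2}X, which is our reading of the construction in [4], and uses hypothetical loadings of our own choosing:

```python
import numpy as np

def psd_sqrt(S):
    # symmetric square root of a positive semidefinite matrix
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T

# Hypothetical loadings; R1 plays the role of R', R of the full matrix.
A = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]])
R1 = A @ A.T
R = R1 + np.diag(1.0 - np.diag(R1))

# Least-squares estimate X' ~ LX with L = R'R^{-1}: induced correlations
R_XL = R1 @ np.linalg.solve(R, R1)          # R'R^{-1}R'
assert not np.allclose(R_XL, R1)            # R' is not reproduced

# Estimate X' ~ R'^{1/2} V R^{-1/2} X: R' is reproduced for ANY orthogonal V.
V = np.linalg.qr(np.random.default_rng(1).normal(size=(4, 4)))[0]
B = psd_sqrt(R1) @ V @ np.linalg.inv(psd_sqrt(R))
R_XV = B @ R @ B.T                          # = R'^{1/2} V V' R'^{1/2} = R'
assert np.allclose(R_XV, R1)
```

The second assertion holds identically in V, which is why the choice of the unitary factor can be devoted entirely to minimizing the approximation error.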
Factors are computed from Eq. (3) as:

F = Λ^{-1}A'X'

where A = QΛ^{1/2} since, as stated by the factor analysis theory, Q and Λ are the column matrix of the R' eigenvectors normalized to unity and the corresponding eigenvalues, respectively.
The approximation square error introduced by using Eq. (7) is:

e_V^2 = ||X' - R'^{1/2}VR^{-1/2}X||^2 = 2tr(R') - 2tr(VR^{-1/2}R'^{3/2})    (8)

The unknown unitary matrix V is determined by minimizing e_V^2. It can be shown by standard techniques that the condition for minimum of e_V is:

MV = R'^{3/2}R^{-1/2}    (9)

where M is the symmetric n × n Lagrange multiplier matrix introduced to solve the Eq. (8) minimization problem constrained to VV' = I.

Equation (9) corresponds to the polar decomposition of a linear operator in a unitary space: every real matrix B can indeed always be decomposed into the product of two factors, a real positive semidefinite symmetric matrix M and a real unitary matrix V. The factor M is always uniquely determined from Eq. (9) by the right multiplication of MV by its transpose, whereas the unitary factor V is uniquely determined from Eq. (9) if and only if the matrix B is nonsingular. However, in the case of singularity the solution, although not unique, is given by:

Vz_k = t_k,    k = 1, ..., n

where t_k is a complete orthonormal system of eigenvectors of the right norm of the matrix B and z_k is a complete orthonormal system of eigenvectors of M [5]. From the above mentioned results it follows:

M = (R'^{3/2}R^{-1}R'^{3/2})^{1/2}    (10)

and from Eq. (8)

e_V^2 = 2tr(R' - (R'^{3/2}R^{-1}R'^{3/2})^{1/2})
It should be pointed out that the equality R_XV = R', determining the relations in Eqs. (7) and (10), is equivalent to condition (26) in [7].
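The polar factorization invoked above is readily computed from a singular value decomposition: if B = U S W', then V = U W' is orthogonal and M = U S U' is symmetric positive semidefinite with MV = B. A minimal sketch (the helper below is our own, not the paper's routine):

```python
import numpy as np

def polar_left(B):
    """Left polar decomposition B = M @ V, with M symmetric PSD and V
    orthogonal, obtained from the SVD B = U S W' (a standard construction)."""
    U, s, Wt = np.linalg.svd(B)
    V = U @ Wt                  # orthogonal factor (unique iff B nonsingular)
    M = U @ np.diag(s) @ U.T    # PSD factor, always unique: M^2 = B B'
    return M, V

B = np.random.default_rng(2).normal(size=(4, 4))
M, V = polar_left(B)
assert np.allclose(M @ V, B)
assert np.allclose(V @ V.T, np.eye(4))
assert np.allclose(M, M.T)
assert np.all(np.linalg.eigvalsh(M) >= -1e-10)
```

Right-multiplying MV by its transpose gives M^2 = BB', which is exactly the uniqueness argument for M used in the text.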
COMPARISON OF THE ESTIMATES
The estimates for X' considered in the previous sections provide different kinds of advantages.

Essentially, the least square estimate makes sure that the error introduced in the final representation of the original data is minimum; instead, the estimate by means of the unitary matrix V makes sure that R_XV = R', thus leading to a better reproduction of the original correlations according to the main claim of factor analysis. In fact, recalling Eqs. (4) and (6bis), it follows:

R_XL = R'R^{-1}R' ≠ R

with an approximation error on R given by

e^2 = tr(R - R'R^{-1}R')^2

whereas no further error is introduced when use of the estimate by means of V is made, as suggested by

e_V^2 = tr(R - R')^2 - 2tr((R - R')D^2) + tr(D^4) = 0

since, by Eqs. (2) and (4), R - R' = D^2.
Moreover, when a vector representation is adopted for the input data set (namely, when a geometric representation of the input data does not take place by N points in the n-space of the variables, but by n points in the N-space of the individuals), the correlation coefficients between two variables, measured as deviates from the respective means, are the cosines of the angles between their vectors in the N-space. Thus, reproducing exactly R' implies that the vector representation of the m factor points in the N-space of the individuals will better retain the inherent structure of data by virtue of the improved reproduction of the original correlation matrix.
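The cosine interpretation is easy to verify on a toy pair of variables measured as deviates from their means (the values are ours, purely illustrative):

```python
import numpy as np

# Two variables observed on N = 5 individuals, as deviates from their means.
x = np.array([2.0, -1.0, 0.5, -2.0, 0.5])
y = np.array([1.5, -0.5, 1.0, -2.5, 0.5])
x, y = x - x.mean(), y - y.mean()

r = np.corrcoef(x, y)[0, 1]
# cosine of the angle between the two vectors in the N-space of individuals
cos_angle = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
assert np.isclose(r, cos_angle)
```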
Finally, we remark that the computational complexity of both algorithms for the estimate of F is the same apart from constant terms. The most time-consuming operations are, in fact, in both cases the eigenvalue-eigenvector computation and the matrix multiplication. The first operation is usually accomplished by triangular factorization with pivoting: in the case of symmetric n × n matrices with half-bandwidth k, such an operation requires a computational complexity which is in general O[(1/2)k(k + 1)n], and O(n^2) for correlation matrices. The estimate of F by means of the unitary matrix V requires a triple resort to the eigenvalue-eigenvector computation subroutine, whereas the least square estimate only requires a single use of it, so that the ensuing complexity polynomials will only differ by a constant term. Unfortunately, however, the matrix multiplication cannot be done more efficiently than O(n^3): again, the number of such operations is larger in the case of the estimate by V, yet giving rise to a difference in a constant term.
PERFORMANCE GUARANTEES FOR APPROXIMATION ERRORS
The above considerations indicate the necessity to evaluate the performance of the estimate of F by means of V as compared to the performance of the least square estimate.

The least square estimate leads to the best representation of X' in terms of the observable X, characterized by the minimum error, whereas the estimate by V can be considered as a compromise between the "optimality" of the searched solution and the presence of constraints on the problem. Thus, it is worth considering how close the compromise solution is to the optimal one.

In view of such a task, let us informally define as a solution value for an estimate algorithm a function that assigns to each instance of the problem a positive rational number that decreases as the solution improves. Then, taking as solution values the mean square errors, it follows:

W = e_V^2 / e_L^2 = 2tr(R' - (R'^{3/2}R^{-1}R'^{3/2})^{1/2}) / tr(R' - R'R^{-1}R')    (11)
The aim is to find an upper bound to this ratio. Experimental results (Tables 1-3) have suggested that W ≤ 2, the equality holding if factor analysis is not actually meaningful or applicable at all.

Let us now try to provide an explanation of such results. First of all, let us consider that W ≤ 2 will hold if and only if
TABLE 1 Experimental Values of trR' and e_V^2/e_L^2 with n = 10 [n = max(trR') = 10]

TABLE 2 Experimental Values of trR' and e_V^2/e_L^2 with n = 50 [n = max(trR') = 50]

[Each table lists trR' against the measured ratio e_V^2/e_L^2; the individual numerical entries are not legible in this reproduction.]
TABLE 3 Experimental Values of trR' and e_V^2/e_L^2 with n = 100 [n = max(trR') = 100]

trR'          e_V^2/e_L^2
0.10000D+00   0.19412D+01
0.30000D+00   0.18962D+01
0.50000D+00   0.18756D+01
0.70000D+00   0.18511D+01
0.90000D+00   0.18343D+01
0.10000D+01   0.18129D+01
0.30000D+01   0.17234D+01
0.50000D+01   0.16501D+01
0.70000D+01   0.16003D+01
0.90000D+01   0.15692D+01
0.10000D+02   0.15200D+01
0.20000D+02   0.13919D+01
0.30000D+02   0.13022D+01
0.40000D+02   0.12471D+01
0.50000D+02   0.11765D+01
0.60000D+02   0.11274D+01
0.70000D+02   0.10988D+01
0.80000D+02   0.10600D+01
0.90000D+02   0.10274D+01
0.10000D+03   0.10001D+01
tr(R'R^{-1}R') ≤ tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2}    (12)
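The bound W ≤ 2 can also be probed numerically. The sketch below uses our reconstruction of the error expressions in Eqs. (6) and (11), and draws random hypothetical loadings (rescaled so every communality stays below one); the least-squares error never exceeds the constrained one, and the ratio stayed below 2 in the draws tried here:

```python
import numpy as np

def psd_sqrt(S):
    # symmetric square root of a positive semidefinite matrix
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T

rng = np.random.default_rng(3)
n, m = 8, 3
W_max = 0.0
for _ in range(20):
    # random hypothetical loadings, rescaled so all communalities stay < 1
    A = rng.normal(scale=0.5, size=(n, m))
    A /= 1.05 * np.sqrt(np.maximum((A ** 2).sum(axis=1), 1.0))[:, None]
    R1 = A @ A.T                                   # R'
    R = R1 + np.diag(1.0 - np.diag(R1))            # R
    # e_L^2 = tr(R' - R'R^{-1}R'), Eq. (6)
    eL2 = np.trace(R1 - R1 @ np.linalg.solve(R, R1))
    # e_V^2 = 2 tr(R' - (R'^{3/2} R^{-1} R'^{3/2})^{1/2}), Eq. (11) numerator
    R32 = np.linalg.matrix_power(psd_sqrt(R1), 3)  # R'^{3/2}
    M = psd_sqrt(R32 @ np.linalg.inv(R) @ R32)
    eV2 = 2.0 * (np.trace(R1) - np.trace(M))
    assert eL2 <= eV2 + 1e-9     # least squares is the unconstrained optimum
    W_max = max(W_max, eV2 / eL2)

assert 1.0 - 1e-6 <= W_max <= 2.0 + 1e-6
```

The first assertion is guaranteed by the optimality of the least square estimate; the final one is the empirical observation that motivates this section.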
Thus the problem is to prove that Eq. (12) is always satisfied when positive semidefinite matrices are taken into account. Since there is no simple relation connecting the traces of two matrices and the trace of their product matrix, it is necessary to compute traces on both sides of Eq. (12). Let us first compute tr(R'R^{-1}R'). We have:

tr(R'R^{-1}R') = tr(R^{-1}R'^2) = Σ_{i} r_{ii}^{-1} r'_{ii}^{(2)} + Σ_{i≠j} r_{ij}^{-1} r'_{ji}^{(2)} ≤ c_1 (trR')^2 + l_1(r_{ij}^{-1}, r'_{ij}^{(2)} | i ≠ j)

where r_{ij}^{-1} and r'_{ij}^{(2)} indicate, for i, j = 1, ..., n, the elements of R^{-1} and R'^2, respectively, c_1 = max{r_{ii}^{-1} | i = 1, ..., n}, and l_1(r_{ij}^{-1}, r'_{ij}^{(2)} | i ≠ j) is a linear combination of the off-diagonal terms of R^{-1}R'^2.
By using a similar technique it is possible to show that:

tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2} ≤ c_2 (trR')^{3/2} + l_2

Thus, both traces can be expressed as polynomials in trR' with constants c_1, c_2, l_1, l_2, which can be determined on the basis of the experimental values.

The mean square method has shown that the best approximation to the experimentally constructed graphs is given by:

tr(R'R^{-1}R') ≅ (trR')^2 / n    (13)

tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2} ≅ (trR')^{3/2} / n^{1/2}    (14)

From these equations it is easy to check that Eq. (12) necessarily always holds. In fact, let us suppose, to the contrary:

tr(R'R^{-1}R') > tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2}    (15)

Then Eqs. (13), (14), and (15) would imply:

(trR')^2 / n > (trR')^{3/2} / n^{1/2}

that is,

(trR')^{1/2} > n^{1/2}
which is clearly absurd since n is the maximum of trR'. It is worth noticing that the equality W = 2 holds when either

trR' ≅ 0    (16)

or

trR' = n    (17)

Now, sufficient conditions for Eqs. (16) and (17) to be true are, respectively:

λ_i ≅ 0,    i = 1, ..., n    (18)

and

h_i^2 = 1,    i = 1, ..., n    (19)

where the λ_i are the eigenvalues of R' and the h_i^2 are the communalities on its diagonal.
The equality of Eq. (18) implies (see [1]) that the eigenvalues of R' are not large enough to give rise to featuring factors: both errors are very small, but
FIGURE 1 Theoretical and experimental graphs of traces and errors concerning the two considered approximations.
unfortunately in a situation in which factor analysis is not actually meaningful. The equality of Eq. (19) implies R' = R, that is, a situation in which factor analysis is not applicable at all and, therefore, both errors vanish.

In Fig. 1 graphs of trR', tr(R'R^{-1}R'), tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2}, e_L^2, and e_V^2 are indicated as a synthesis of the experimental results of Tables 1-3 and the theoretical considerations of Eqs. (13) and (14). Continued lines represent theoretical curves. The full dots denote the experimental data of tr(R'R^{-1}R'), while the crosses indicate the experimental data of the other considered trace. Experimental errors are also represented; the empty circles refer to e_L^2 whereas the squares refer to e_V^2. The agreement of theoretical conclusions and experimental data suggests that the relations in Eqs. (13) and (14) can be considered, with good approximation, the expressions of tr(R'R^{-1}R') and tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2} as functions of trR' when positive semidefinite matrices are taken into account.
REFERENCES
1. Bellman, R. 1974. Introduction to Matrix Analysis. New Delhi: Tata McGraw-Hill.
2. Della Riccia, G. 1980. Feature Processing by Optimal Factor Analysis Techniques in Statistical Pattern Recognition. IEEE Trans. Patt. Anal. Mach. Intell.
3. Della Riccia, G., de Santis, F., and Sessa, M. 1978. Optimal Factor Analysis and Pattern Recognition. Proc. 1978 Int. Conf. Cybern. Soc., Tokyo-Kyoto, Japan, 1567-1583.
4. Della Riccia, G., and de Santis, F. 1982. The Approximation of Factor Scores in a Factor Analysis Algorithm. Proc. 1982 Europ. Meet. Cybern. Syst. Res., Vienna, Austria.
5. Gantmacher, F. R. 1960. The Theory of Matrices. New York: Chelsea.
6. Harman, H. H. 1967. Modern Factor Analysis. Chicago: University of Chicago Press.
7. McDonald, R. P., and Burr, E. J. 1967. A Comparison of Four Methods of Constructing Factor Scores. Psychometrika, vol. 32, no. 4, 381-401.
Received August 1983
Request reprints from Filomena de Santis, Istituto di Scienze dell'Informazione, Facoltà di Scienze, Università di Salerno, 84100 Salerno, Italy.