This article was downloaded by: [University of York]
On: 21 November 2014, At: 01:25
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Cybernetics and Systems: An International Journal
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ucbs20

ON SOME COMPUTATIONAL METHODS IN FACTOR ANALYSIS
FILOMENA DE SANTIS, University of Salerno
Published online: 27 Mar 2007.

To cite this article: FILOMENA DE SANTIS (1984) ON SOME COMPUTATIONAL METHODS IN FACTOR ANALYSIS, Cybernetics and Systems: An International Journal, 15:3-4, 247-257, DOI: 10.1080/01969728408927747

To link to this article: http://dx.doi.org/10.1080/01969728408927747
Cybernetics and Systems: An International Journal, 15:247-257, 1984
ON SOME COMPUTATIONAL METHODS IN FACTOR ANALYSIS
FILOMENA DE SANTIS
University of Salerno
The standard method for the estimate of factor scores uses least square approximation techniques. Such a method, while ensuring the minimum square error on the considered approximation, leads to a factor covariance matrix which is clearly not the same as the reduced one. Continuing previous studies in which the computation of factor scores was considered with regard to the best reproduction of correlations among variables, this paper analyzes a procedure for computing factor scores in such a way that the reduced covariance matrix is exactly preserved, as well as the linearity of the approximation function.
INTRODUCTION
Although many research studies involve multivariate observations, sometimes little is known about the interrelations between variables, between cases, or between variables and cases, so that it is often difficult to establish if and how data are structured.

When the objective of multivariate data analysis is to find a relationship that just describes the interrelations between variables, it has often been experienced that factor analysis is well suited to the scope (a rather thorough presentation of related techniques may be found in [7]).
The numerical implementation of a factor analysis algorithm consists of four main steps: first, the correlation matrix is computed; second, the factor loadings are estimated; third, the factors are rotated; and fourth, the factor scores are computed.
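The four steps can be sketched numerically. The following is a minimal NumPy illustration, not the paper's own code: the function name, the principal-factor style of loading extraction, and the regression-method scores are our assumptions, and the rotation step is left as the identity.

```python
import numpy as np

def factor_analysis_sketch(X, m):
    """Illustrative four-step pipeline: correlations, loadings,
    (trivial) rotation, and least-squares factor scores."""
    # Step 1: correlation matrix of the n observed variables.
    R = np.corrcoef(X, rowvar=False)
    # Step 2: loadings from the m leading eigenpairs of R
    # (principal-factor style: A = Q * sqrt(eigenvalues)).
    w, Q = np.linalg.eigh(R)                  # ascending order
    w, Q = w[::-1][:m], Q[:, ::-1][:, :m]
    A = Q * np.sqrt(w)
    # Step 3: rotation -- identity here; a varimax step would replace it.
    A_rot = A
    # Step 4: factor scores by least squares (regression method),
    # F = Z R^{-1} A, with Z the standardized data.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    F = Z @ np.linalg.solve(R, A_rot)
    return R, A_rot, F

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
R, A, F = factor_analysis_sketch(X, m=2)
```

The sketch only fixes the shape of the computation; each step admits the many alternative methods discussed in [7].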
This paper deals with the fourth step: the factor solution, indeed, is not uniquely determined; hence the necessity of an estimate for it. Of course,
Copyright © 1984 by Hemisphere Publishing Corporation
the best approximation is in the least square sense, although it does not exactly reproduce the correlations to be approximated. However, it is possible to provide a different estimate (see [4]) that, while still remaining a linear combination of the original variables, also reproduces correlations exactly.

The study of such an estimate is carried out along lines similar to those outlined in [7]. Here the aim is to investigate the behavior of the solution and to determine lower and upper bounds to the error produced by it as compared to the minimum error associated with the least square approximation. It will be shown that the error thus introduced is at most twice the error in the least square case, a result that seems to have passed unnoticed; moreover, such an upper bound corresponds to situations in which factor analysis is not actually meaningful or applicable at all.
APPROXIMATION PROCEDURES
In the sequel we shall make reference to the notation of Della Riccia [2] and Harman [6], and set:

X = AF + DU    (1)

and

R = AA' + D^2    (2)

In terms of the new variables X' = (X'_1, X'_2, ..., X'_n), linearly dependent on F = [F_1, F_2, ..., F_m], one has:

X' = AF    (3)

so that Eq. (2) becomes:

R' = AA'    (4)

where R', the covariance matrix of the variables X', has clearly rank m < n and differs from R only in the diagonal terms, namely in the communalities.
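A small numerical sketch makes the relation between R and R' concrete; the loading values below are hypothetical, chosen by us purely for illustration. The two matrices share every off-diagonal entry, R' carries the communalities on its diagonal, and its rank equals the number of factors m:

```python
import numpy as np

# Hypothetical loadings for n = 4 variables on m = 2 factors
# (illustrative values of our own choosing).
A = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])
n, m = A.shape
R_prime = A @ A.T                       # reduced matrix, Eq. (4): rank m
communalities = np.diag(R_prime)
D2 = np.diag(1.0 - communalities)       # unique variances D^2
R = R_prime + D2                        # full correlation matrix, Eq. (2)

assert np.linalg.matrix_rank(R_prime) == m
off = ~np.eye(n, dtype=bool)            # R and R' agree off the diagonal
assert np.allclose(R[off], R_prime[off])
assert np.allclose(np.diag(R), 1.0)
```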
Let now X' in Eq. (3) be a function to be approximated in terms of X by means of the linear combination

X' ≅ LX    (5)
where L is an unknown matrix to be determined in such a way that the square error e_L^2 is minimum. Then it can be proved that

e_L^2 = ||X' - LX||^2

is minimized by:

L = R'R^{-1}

so that

X' ≅ R'R^{-1}X

and

e_L^2 = ||X' - R'R^{-1}X||^2 = tr(R' - R'R^{-1}R')    (6)

Correlations induced by Eq. (5) are:

R_XL = R'R^{-1}R'    (6bis)

which are clearly not the same as R'. Factors are finally computed from Eq. (3) as:

F = Λ^{-1}A'X'

Let us now consider the linear estimate:

X' ≅ R'^{1/2}VR^{-1/2}X    (7)

where V is an unknown matrix [4]. Correlations induced by Eq. (7) are:

R_XV = R'^{1/2}VV'R'^{1/2}

which are equal to R' if and only if V is a unitary matrix.
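The contrast between the two induced correlation matrices can be checked numerically. The sketch below assumes the estimate of Eq. (7) has the form R'^{1/2}VR^{-1/2}X, which is our reading of the construction in [4], and uses hypothetical loadings of our own choosing:

```python
import numpy as np

def psd_sqrt(S):
    # symmetric square root of a positive semidefinite matrix
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T

# Hypothetical loadings; R1 plays the role of R', R of the full matrix.
A = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]])
R1 = A @ A.T
R = R1 + np.diag(1.0 - np.diag(R1))

# Least-squares estimate X' ~ LX with L = R'R^{-1}: induced correlations
R_XL = R1 @ np.linalg.solve(R, R1)          # R'R^{-1}R'
assert not np.allclose(R_XL, R1)            # R' is not reproduced

# Estimate X' ~ R'^{1/2} V R^{-1/2} X: R' is reproduced for ANY orthogonal V.
V = np.linalg.qr(np.random.default_rng(1).normal(size=(4, 4)))[0]
B = psd_sqrt(R1) @ V @ np.linalg.inv(psd_sqrt(R))
R_XV = B @ R @ B.T                          # = R'^{1/2} V V' R'^{1/2} = R'
assert np.allclose(R_XV, R1)
```

The second assertion holds identically in V, which is why the choice of the unitary factor can be devoted entirely to minimizing the approximation error.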
Factors are computed from Eq. (3) as:

F = Λ^{-1}A'X'

where A = QΛ^{1/2} since, as stated by the factor analysis theory, Q and Λ are the column matrix of the R' eigenvectors normalized to unity and the corresponding eigenvalues, respectively.
The approximation square error introduced by using Eq. (7) is:

e_V^2 = ||X' - R'^{1/2}VR^{-1/2}X||^2 = 2tr(R') - 2tr(VR^{-1/2}R'^{3/2})    (8)

The unknown unitary matrix V is determined by minimizing e_V^2. It can be shown by standard techniques that the condition for minimum of e_V is:

MV = R'^{3/2}R^{-1/2}    (9)

where M is the symmetric n × n Lagrange multiplier matrix introduced to solve the Eq. (8) minimization problem constrained to VV' = I.

Equation (9) corresponds to the polar decomposition of a linear operator in a unitary space: every real matrix B can indeed always be decomposed into the product of two factors, a real positive semidefinite symmetric matrix M and a real unitary matrix V. The factor M is always uniquely determined from Eq. (9) by the right multiplication of MV by its transpose, whereas the unitary factor V is uniquely determined from Eq. (9) if and only if the matrix B is nonsingular. However, in the case of singularity the solution, although not unique, is given by:

Vz_k = t_k,    k = 1, ..., n

where t_k is a complete orthonormal system of eigenvectors of the right norm of the matrix B and z_k is a complete orthonormal system of eigenvectors of M [5]. From the above mentioned results it follows:

M = (R'^{3/2}R^{-1}R'^{3/2})^{1/2}    (10)

and from Eq. (8)

e_V^2 = 2tr(R' - (R'^{3/2}R^{-1}R'^{3/2})^{1/2})
It should be pointed out that the equality R_XV = R', determining the relations in Eqs. (7) and (10), is equivalent to condition (26) in [7].
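The polar factorization invoked above is readily computed from a singular value decomposition: if B = U S W', then V = U W' is orthogonal and M = U S U' is symmetric positive semidefinite with MV = B. A minimal sketch (the helper below is our own, not the paper's routine):

```python
import numpy as np

def polar_left(B):
    """Left polar decomposition B = M @ V, with M symmetric PSD and V
    orthogonal, obtained from the SVD B = U S W' (a standard construction)."""
    U, s, Wt = np.linalg.svd(B)
    V = U @ Wt                  # orthogonal factor (unique iff B nonsingular)
    M = U @ np.diag(s) @ U.T    # PSD factor, always unique: M^2 = B B'
    return M, V

B = np.random.default_rng(2).normal(size=(4, 4))
M, V = polar_left(B)
assert np.allclose(M @ V, B)
assert np.allclose(V @ V.T, np.eye(4))
assert np.allclose(M, M.T)
assert np.all(np.linalg.eigvalsh(M) >= -1e-10)
```

Right-multiplying MV by its transpose gives M^2 = BB', which is exactly the uniqueness argument for M used in the text.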
COMPARISON OF THE ESTIMATES
The estimates for X' considered in the previous sections provide different kinds of advantages.

Essentially, the least square estimate makes sure that the error introduced in the final representation of the original data is minimum; instead, the estimate by means of the unitary matrix V makes sure that R_XV = R', thus leading to a better reproduction of the original correlations according to the main claim of factor analysis. In fact, recalling Eqs. (4) and (6bis), it follows:

R_XL = R'R^{-1}R' ≠ R

with an approximation error on R given by

e^2 = tr(R - R'R^{-1}R')^2

whereas no further error is introduced when use of the estimate by means of V is made, as suggested by

e_V^2 = tr(R - R')^2 - 2tr((R - R')D^2) + tr(D^4) = 0

since, by Eqs. (2) and (4), R - R' = D^2.
Moreover, when a vector representation is adopted for the input data set (namely, when a geometric representation of the input data does not take place by N points in the n-space of the variables, but by n points in the N-space of the individuals), the correlation coefficients between two variables, measured as deviates from the respective means, are the cosines of the angles between their vectors in the N-space. Thus, reproducing exactly R' implies that the vector representation of the m factor points in the N-space of the individuals will better retain the inherent structure of data by virtue of the improved reproduction of the original correlation matrix.
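The cosine interpretation is easy to verify on a toy pair of variables measured as deviates from their means (the values are ours, purely illustrative):

```python
import numpy as np

# Two variables observed on N = 5 individuals, as deviates from their means.
x = np.array([2.0, -1.0, 0.5, -2.0, 0.5])
y = np.array([1.5, -0.5, 1.0, -2.5, 0.5])
x, y = x - x.mean(), y - y.mean()

r = np.corrcoef(x, y)[0, 1]
# cosine of the angle between the two vectors in the N-space of individuals
cos_angle = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
assert np.isclose(r, cos_angle)
```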
Finally, we remark that the computational complexity of both algorithms for the estimate of F is the same apart from constant terms. The most time-consuming operations are, in fact, in both cases the eigenvalue-eigenvector computation and the matrix multiplication. The first operation is usually accomplished by triangular factorization with pivoting: in the case of symmetric n × n matrices with half-bandwidth k, such an operation requires a computational complexity which is in general O[(1/2)k(k + 1)n], and O(n^2) for correlation matrices. The estimate of F by means of the unitary matrix V requires a triple resort to the eigenvalue-eigenvector computation subroutine, whereas the least square estimate only requires a single use of it, so that the ensuing complexity polynomials will only differ by a constant term. Unfortunately, however, the matrix multiplication cannot be done more efficiently than O(n^3): again, the number of such operations is larger in the case of the estimate by V, yet giving rise to a difference in a constant term.
PERFORMANCE GUARANTEES FOR APPROXIMATION ERRORS
The above considerations indicate the necessity to evaluate the performance of the estimate of F by means of V as compared to the performance of the least square estimate.

The least square estimate leads to the best representation of X' in terms of the observable X, characterized by the minimum error, whereas the estimate by V can be considered as a compromise between the "optimality" of the searched solution and the presence of constraints on the problem. Thus, it is worth considering how close the compromise solution is to the optimal one.

In view of such a task, let us informally define as a solution value for an estimate algorithm a function that assigns to each instance of the problem a positive rational number that decreases as the solution improves. Then, taking as solution values the mean square errors, it follows:

W = e_V^2 / e_L^2 = 2tr(R' - (R'^{3/2}R^{-1}R'^{3/2})^{1/2}) / tr(R' - R'R^{-1}R')    (11)
The aim is to find an upper bound to this ratio. Experimental results (Tables 1-3) have suggested that W ≤ 2, the equality holding if factor analysis is not actually meaningful or applicable at all.

Let us now try to provide an explanation of such results. First of all, let us consider that W ≤ 2 will hold if and only if
TABLE 1 Experimental Values of trR' and e_V^2/e_L^2 with n = 10 [n = max(trR') = 10]

TABLE 2 Experimental Values of trR' and e_V^2/e_L^2 with n = 50 [n = max(trR') = 50]

[Each table lists trR' against the measured ratio e_V^2/e_L^2; the individual numerical entries are not legible in this reproduction.]
TABLE 3 Experimental Values of trR' and e_V^2/e_L^2 with n = 100 [n = max(trR') = 100]

trR'          e_V^2/e_L^2
0.10000D+00   0.19412D+01
0.30000D+00   0.18962D+01
0.50000D+00   0.18756D+01
0.70000D+00   0.18511D+01
0.90000D+00   0.18343D+01
0.10000D+01   0.18129D+01
0.30000D+01   0.17234D+01
0.50000D+01   0.16501D+01
0.70000D+01   0.16003D+01
0.90000D+01   0.15692D+01
0.10000D+02   0.15200D+01
0.20000D+02   0.13919D+01
0.30000D+02   0.13022D+01
0.40000D+02   0.12471D+01
0.50000D+02   0.11765D+01
0.60000D+02   0.11274D+01
0.70000D+02   0.10988D+01
0.80000D+02   0.10600D+01
0.90000D+02   0.10274D+01
0.10000D+03   0.10001D+01
tr(R'R^{-1}R') ≤ tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2}    (12)
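The bound W ≤ 2 can also be probed numerically. The sketch below uses our reconstruction of the error expressions in Eqs. (6) and (11), and draws random hypothetical loadings (rescaled so every communality stays below one); the least-squares error never exceeds the constrained one, and the ratio stayed below 2 in the draws tried here:

```python
import numpy as np

def psd_sqrt(S):
    # symmetric square root of a positive semidefinite matrix
    w, Q = np.linalg.eigh(S)
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.T

rng = np.random.default_rng(3)
n, m = 8, 3
W_max = 0.0
for _ in range(20):
    # random hypothetical loadings, rescaled so all communalities stay < 1
    A = rng.normal(scale=0.5, size=(n, m))
    A /= 1.05 * np.sqrt(np.maximum((A ** 2).sum(axis=1), 1.0))[:, None]
    R1 = A @ A.T                                   # R'
    R = R1 + np.diag(1.0 - np.diag(R1))            # R
    # e_L^2 = tr(R' - R'R^{-1}R'), Eq. (6)
    eL2 = np.trace(R1 - R1 @ np.linalg.solve(R, R1))
    # e_V^2 = 2 tr(R' - (R'^{3/2} R^{-1} R'^{3/2})^{1/2}), Eq. (11) numerator
    R32 = np.linalg.matrix_power(psd_sqrt(R1), 3)  # R'^{3/2}
    M = psd_sqrt(R32 @ np.linalg.inv(R) @ R32)
    eV2 = 2.0 * (np.trace(R1) - np.trace(M))
    assert eL2 <= eV2 + 1e-9     # least squares is the unconstrained optimum
    W_max = max(W_max, eV2 / eL2)

assert 1.0 - 1e-6 <= W_max <= 2.0 + 1e-6
```

The first assertion is guaranteed by the optimality of the least square estimate; the final one is the empirical observation that motivates this section.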
Thus the problem is to prove that Eq. (12) is always satisfied when positive semidefinite matrices are taken into account. Since there is no simple relation connecting the traces of two matrices and the trace of their product matrix, it is necessary to compute traces on both sides of Eq. (12). Let us first compute tr(R'R^{-1}R'). We have:

tr(R'R^{-1}R') = tr(R^{-1}R'^2) = Σ_{i} r_{ii}^{-1} r'_{ii}^{(2)} + Σ_{i≠j} r_{ij}^{-1} r'_{ji}^{(2)} ≤ c_1 (trR')^2 + l_1(r_{ij}^{-1}, r'_{ij}^{(2)} | i ≠ j)

where r_{ij}^{-1} and r'_{ij}^{(2)} indicate, for i, j = 1, ..., n, the elements of R^{-1} and R'^2, respectively, c_1 = max{r_{ii}^{-1} | i = 1, ..., n}, and l_1(r_{ij}^{-1}, r'_{ij}^{(2)} | i ≠ j) is a linear combination of the off-diagonal terms of R^{-1}R'^2.
By using a similar technique it is possible to show that:

tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2} ≤ c_2 (trR')^{3/2} + l_2

Thus, both traces can be expressed as polynomials in trR' with constants c_1, c_2, l_1, l_2, which can be determined on the basis of the experimental values.

The mean square method has shown that the best approximation to the experimentally constructed graphs is given by:

tr(R'R^{-1}R') ≅ (trR')^2 / n    (13)

tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2} ≅ (trR')^{3/2} / n^{1/2}    (14)

From these equations it is easy to check that Eq. (12) necessarily always holds. In fact, let us suppose, to the contrary:

tr(R'R^{-1}R') > tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2}    (15)

Then Eqs. (13), (14), and (15) would imply:

(trR')^2 / n > (trR')^{3/2} / n^{1/2}

that is,

(trR')^{1/2} > n^{1/2}
which is clearly absurd since n is the maximum of trR'. It is worth noticing that the equality W = 2 holds when either

trR' ≅ 0    (16)

or

trR' = n    (17)

Now, sufficient conditions for Eqs. (16) and (17) to be true are, respectively:

λ_i ≅ 0,    i = 1, ..., n    (18)

and

h_i^2 = 1,    i = 1, ..., n    (19)

where the λ_i are the eigenvalues of R' and the h_i^2 are the communalities on its diagonal.
The equality of Eq. (18) implies (see [1]) that the eigenvalues of R' are not large enough to give rise to featuring factors: both errors are very small, but
FIGURE 1 Theoretical and experimental graphs of traces and errors concerning the two considered approximations.
unfortunately in a situation in which factor analysis is not actually meaningful. The equality of Eq. (19) implies R' = R, that is, a situation in which factor analysis is not applicable at all and, therefore, both errors vanish.

In Fig. 1 graphs of trR', tr(R'R^{-1}R'), tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2}, e_L^2, and e_V^2 are indicated as a synthesis of the experimental results of Tables 1-3 and the theoretical considerations of Eqs. (13) and (14). Continued lines represent theoretical curves. The full dots denote the experimental data of tr(R'R^{-1}R'), while the crosses indicate the experimental data of the other considered trace. Experimental errors are also represented; the empty circles refer to e_L^2 whereas the squares refer to e_V^2. The agreement of theoretical conclusions and experimental data suggests that the relations in Eqs. (13) and (14) can be considered, with good approximation, the expressions of tr(R'R^{-1}R') and tr(R'^{3/2}R^{-1}R'^{3/2})^{1/2} as functions of trR' when positive semidefinite matrices are taken into account.
REFERENCES
1. Bellman, R. 1974. Introduction to Matrix Analysis. New Delhi: Tata McGraw-Hill.
2. Della Riccia, G. 1980. Feature Processing by Optimal Factor Analysis Techniques in Statistical Pattern Recognition. IEEE Trans. Patt. Anal. Mach. Intell.
3. Della Riccia, G., de Santis, F., and Sessa, M. 1978. Optimal Factor Analysis and Pattern Recognition. Proc. 1978 Int. Conf. Cybern. Soc., Tokyo-Kyoto, Japan, 1567-1583.
4. Della Riccia, G., and de Santis, F. 1982. The Approximation of Factor Scores in a Factor Analysis Algorithm. Proc. 1982 Europ. Meet. Cybern. Syst. Res., Vienna, Austria.
5. Gantmacher, F. R. 1960. The Theory of Matrices. New York: Chelsea.
6. Harman, H. H. 1967. Modern Factor Analysis. Chicago: University of Chicago Press.
7. McDonald, R. P., and Burr, E. J. 1967. A Comparison of Four Methods of Constructing Factor Scores. Psychometrika, vol. 32, no. 4, 381-401.
Received August 1983
Request reprints from Filomena de Santis, Istituto di Scienze dell'Informazione, Facoltà di Scienze, Università di Salerno, 84100 Salerno, Italy.