Confirmatory Composite Analysis
Florian Schuberth1 Jorg Henseler1
Theo K. Dijkstra2
1University of Twente2Unversity of Groningen
March 16, 2018
SEM Meeting,Amsterdam
Overview
1 Motivation
2 Confirmatory Composite AnalysisModel SpecificationModel IdentificationModel EstimationModel Assessment
3 Monte Carlo simulation
2/16
Latent variables
Type of theoretical construct
Criterion: Latent variable
Dominant statistical model: Common factor model
x2x1 x3
η
λ2λ1 λ3
ε1 ε3ε2
Fundamental scientific question: Does the latent variable exist?Scientific paradigm: PositivismExamples: Abilities, attitudes,
traits
3/16
Artifacts
Many disciplines deal with an interplay of behavioral (latentvariable) and design constructs (artifacts) such as
Discipline Latent variable Artifact
Marketing: Consumer brand attitude Advertising mixCriminology: Intention to commit a crime Prevention strategy
Education: Pupil’s knowledge base Teaching programPsychotherapy: Mental illness Psychiatric treatment
→ How to model these artifacts?
4/16
Two kinds of constructs
Type of theoretical construct
Criterion: Latent variable Artifact
Dominant statistical model: Common factor model Composite model
x2x1 x3
η
λ2λ1 λ3
ε1 ε3ε2
x2x1 x3
c
w2w1 w3
Fundamental scientific question: Does the latent variable exist? Is the artifact useful?Scientific paradigm: Positivism PragmatismExamples: Abilities, attitudes,
traitsIndices, therapies,intervention programs
5/16
Confirmatory Composite Analysis
The confirmatory composite analysis (CCA) consists of 4 steps:
1 Specification of the composite model
2 Identification of the composite model
3 Estimation of the composite model
4 Assessment of the composite model
6/16
Specification of the composite model
y c
x1 x2
z
w2w1
σyc σcz
σ12
σyz
Minimal composite model 7/16
Is this a statistical model?
Consider the model-implied indicator population covariance matrix:
Σ =
y x1 x2 z
σyy
λ1σyc σ11
λ2σyc σ12 σ22
σyz λ1σcz λ2σcz σzz
,
where λ1 = cov(x1, c) and λ2 = cov(x2, c).This matrix has rank-one constraints, which can be exploited instatistical testing.→ Indeed, it is a statistical model
8/16
Identification of the composite model
Identification of composite models is straightforward:1
I Normalization of the weights, e.g., w ′jΣjjw j = 1
I Each composite must be connected to at least one compositeor variable not forming the composite→ All model parameters can be uniquely retrieved from the
population indicator covariance matrix
1We ignore trivial regularity assumptions such as weight vectors consistingof zeros only; and similarly, we ignore cases where intra-block covariancematrices are singular.
9/16
Estimation of the composite model
For determining the weights, several methods have been proposed:I Sum scoresI Expert weightingI Approaches to generalized canonical correlation analysis
(GCCA) such as MAXVAR[Kettenring, 1971]
I Regularized general canonical correlation analysis (RGCCA)[Tenenhaus & Tenenhaus, 2011]
I Partial least squares path modeling (PLS-PM)[Wold, 1975]
I Generalized structured component analysis (GSCA)[Hwang & Takane, 2004]
10/16
Assessment of the composite model
The overall model fit can be assessed in two non-exclusive ways:
I Measures of fit (heuristic rules)
I Test for overall model fit
11/16
Assessment of the composite model
To test the overall model fit, a bootstrap-based test can be used(H0 : Σ = Σ(θ)) [Beran & Srivastava, 1985, Bollen & Stine, 1992]in combination with various discrepancy measures such as
I Standardized root mean squared residual (SRMR)
I Geodesic distance (dG )
I Euclidean distance (dL)
12/16
Is the test for overall model fit capable to detect misspecificationsin the composite model such as
I Wrongly assigned indicators
I Correlations between indicators of different blocks that cannotbe fully explained by the composites→ Monte Carlo simulation, where we use MAXVAR to determine
the weights
13/16
Monte Carlo simulation
Experimental condition Population model Specified model
4) No misspecificationx11 x12 x13
x21 x22 x23
x31 x32 x33
c1
c2
c3
w12 = .4
w11 = .6 w13 = .2
w22 = .5
w21 = .3 w23 = .6
w32 = .5
w31 = .4 w33 = .5
ρ13 = .5
ρ12 = .3 ρ23
= .4
.5.5 .5
.0.2 .4
.4.25 .16
x11 x12 x13
x21 x22 x23
x31 x32 x33
c1
c2
c3
w12
w11 w13
w22
w21 w23
w32
w31 w33
ρ13
ρ12 ρ23
5) Unexplained correlationx11 x12 x13
x21 x22 x23
x31 x32 x33
c1
c2
c3
w12 = .4
w11 = .6 w13 = .2
w22 = .5
w21 = .3 w23 = .6
w32 = .5
w31 = .4 w33 = .5
ρ13 = .5
ρ12 = .3 ρ23
= .4
.5.5 .5
.0.2 .4
.4.25 .16
.166
14/16
Rejection rates
●● ●
●● ●
●● ● ●
● ●● ● ●
●
●
●
●
●
●
●
●
●
●● ● ● ● ●
●● ●
●● ●
●● ● ●
● ●● ● ●
●
●
●
●
●
●
●
●
●
●● ● ● ● ●
●
● ● ● ● ●● ●
● ● ● ●● ●
●
●
●
●
●
●
●
●
●
●● ● ● ● ● ●
dL SRMR dG
Population model 4
Population model 5
50 350 650 950 1250 50 350 650 950 1250 50 350 650 950 1250
0.00
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.10
0.11
0.12
0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
1.00
Sample size
Rej
ectio
n ra
te
Significance level: ●10% 5% 1% 15/16
References
Beran, R. & Srivastava, M.S. (1985)
Bootstrap tests and confidence regions for functions of a covariancematrix
The Annals of Statistics 13(1) 95 – 115.
Bollen, K. A. & Stine, R. A. (1992)
Bootstrapping goodness-of-fit measures in structural equation models
Sociological Methods & Research 21(2) 205 – 229.
Hwang, H. & Takane, Y. (2004)
Generalized structured component analysis
Psychometrika 69(1) 81 – 99.
16/16
References
Kettenring, J.R. (1971)
Canonical analysis of several sets of variables
Biometrika, 58(3), 433 – 451.
Pearson, K. (1901)
On lines and planes of closest fit to systems of points in space
Philosophical Magazine Series 6 2(11) 559 – 572.
Tenenhaus, A. & Tenenhaus, M. (2011)
Regularized generalized canonical correlation analysis
Psychometrika 76(2) 257 – 284.
16/16
References
Wold, A.O.H. (1975)
Path models with latent variables: The NIPALS approach. In H. Blalock,A. Aganbegian, F. Borodkin, R. Boudon, & V. Capecchi (Eds.)
Quantitative Sociology, 307 - 357, New York Academic Press.
16/16