Some Multivariate Techniques: Principal components analysis (PCA), Factor analysis (FA), Structural equation models (SEM). Applications: Personality


  • Some Multivariate techniques

    Principal components analysis (PCA)
    Factor analysis (FA)
    Structural equation models (SEM)
    Applications: Personality

    Dorret I. Boomsma, Danielle Dick, Marleen de Moor, Mike Neale, Conor Dolan

    Boulder, March 2006

    Presentation in dorret\2006

  • Multivariate statistical methods; for example:

    - Multiple regression
    - Fixed effects (M)ANOVA
    - Random effects (M)ANOVA
    - Factor analysis / PCA
    - Time series (ARMA)
    - Path / LISREL models

  • Multiple regression

    x predictors (independent), e residuals, y dependent; both x and y are observed.

    [Path diagram: predictors x, dependent variables y, residuals e]

  • Factor analysis: measured and unmeasured (latent) variables. Measured variables can be indicators of unobserved traits.

  • Path model / SEM model

    Latent traits can influence other latent traits

  • Measurement and causal models in non-experimental research

    Principal component analysis (PCA)
    Exploratory factor analysis (EFA)
    Confirmatory factor analysis (CFA)
    Structural equation models (SEM)
    Path analysis

    These techniques are used to analyze multivariate data that have been collected in non-experimental designs and often involve latent constructs that are not directly observed. These latent constructs underlie the observed variables and account for inter-correlations between variables.

  • Models in non-experimental research

    All models specify a covariance matrix $\Sigma$ and a means vector $\mu$:

    $\Sigma = \Lambda \Psi \Lambda^t + \Theta$

    total covariance matrix [$\Sigma$] = factor variance [$\Lambda \Psi \Lambda^t$] + residual variance [$\Theta$]

    The means vector $\mu$ can be modeled as a function of other (measured) traits, e.g. sex, age, cohort, SES.
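The decomposition of the total covariance matrix into factor variance plus residual variance can be illustrated with a small numerical sketch. It assumes the conventional notation (loadings Lambda, factor covariance Psi, residual covariance Theta) and a hypothetical one-factor model with three standardized indicators; the loadings and residual variances are made up for illustration and are not estimates from the personality data.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def implied_covariance(lam, psi, theta):
    """Return Sigma = Lambda * Psi * Lambda' + Theta."""
    common = matmul(matmul(lam, psi), transpose(lam))
    n = len(theta)
    return [[common[i][j] + theta[i][j] for j in range(n)] for i in range(n)]


# Hypothetical values, chosen so that every observed variance equals 1.
lam = [[0.8], [0.7], [0.6]]      # factor loadings (3 x 1)
psi = [[1.0]]                    # factor variance fixed at 1
theta = [[0.36, 0.0, 0.0],       # residual variances on the diagonal
         [0.0, 0.51, 0.0],
         [0.0, 0.0, 0.64]]

sigma = implied_covariance(lam, psi, theta)
for row in sigma:
    print([round(v, 2) for v in row])
```

In this sketch the off-diagonal elements of Sigma come entirely from the common factor (for example 0.8 * 0.7 = 0.56), which is the sense in which a latent construct accounts for the inter-correlations between observed variables.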

  • Outline

    Cholesky decomposition
    PCA (eigenvalues)
    Factor models (1, ..., 4 factors)
    Application to personality data
    Scripts for Mx, [Mplus, Lisrel]

  • Application: personality

    Personality (Gray 1999): a person's general style of interacting with the world, especially with other people: whether one is withdrawn or outgoing, excitable or placid, conscientious or careless, kind or stern.

    Is there one underlying factor? Two, three, more?

  • Personality: Big 3, Big 5, Big 9?

    Big 3         Big 5          Big 9             MPQ scales
    Extraversion  Extraversion   Affiliation       Social Closeness
                                 Potency           Social Potency
                                 Achievement       Achievement
    Psychoticism  Conscientious  Dependability     Control
                  Agreeableness  Agreeableness     Aggression
    Neuroticism   Neuroticism    Adjustment        Stress Reaction
                  Openness       Intellectance     Absorption
                                 Individualism
                                 Locus of Control

  • Software scripts

    Mx: MxPersonality (also includes data)
    (Mplus): Mplus
    (Lisrel): Lisrel

    Copy from dorret\2006

    Data: Neuroticism, Somatic anxiety, Trait Anxiety, Beck Depression, Anxious/Depressed, Disinhibition, Boredom susceptibility, Thrill seeking, Experience seeking, Extraversion, Type-A behavior, Trait Anger, Test attitude (13 variables)

  • Cholesky decomposition for 13 personality traits

    Cholesky decomposition: $S = Q Q'$, where $Q$ is lower triangular.

    For example, if S is 3 x 3, then Q looks like:

    $$Q = \begin{pmatrix} f_{11} & 0 & 0 \\ f_{21} & f_{22} & 0 \\ f_{31} & f_{32} & f_{33} \end{pmatrix}$$

    I.e., # factors = # variables; this approach gives a transformation of S and is completely determinate.
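As a minimal sketch of the Cholesky decomposition described on this slide, the function below factors a symmetric positive definite matrix S into a lower-triangular Q with S = Q Q'. The 3 x 3 matrix is hypothetical, chosen so that Q has simple entries; it is not taken from the personality data, and the function name `cholesky` is illustrative.

```python
import math


def cholesky(s):
    """Return lower-triangular Q with Q * Q' = S (S symmetric positive definite)."""
    n = len(s)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Part of S[i][j] already explained by earlier columns of Q
            acc = sum(q[i][k] * q[j][k] for k in range(j))
            if i == j:
                q[i][j] = math.sqrt(s[i][i] - acc)
            else:
                q[i][j] = (s[i][j] - acc) / q[j][j]
    return q


# Hypothetical covariance matrix (an assumption for illustration).
S = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]

Q = cholesky(S)
for row in Q:
    print(row)

# Rebuild S from Q to confirm the transformation loses nothing.
rebuilt = [[sum(Q[i][k] * Q[j][k] for k in range(3)) for j in range(3)]
           for i in range(3)]
```

Q has 3 * 4 / 2 = 6 free elements, the same number as the distinct elements of S, which is why the decomposition reproduces S exactly and counts as a saturated model.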

  • Subjects: Birth cohorts (1909-1989)

    [Figure: histogram of counts (0 to 1000) by year of birth (1909 to 1988), split by sex (female, male)]

    Four data sets were created:

    1 Old male (N = 1035)
    2 Young male (N = 1071)
    3 Old female (N = 1426)
    4 Young female (N = 1070)

    What is the structure of personality? Is it the same in all datasets?

    Total sample: 46% male, 54% female

  • Application: Analysis of Personality in twins, spouses, sibs, parents from Adult Netherlands Twin Register: longitudinal participation

                      1x    2x    3x    4x   5x   6x  Total
    Twin            2835  2189  1471  1145  867  446   8953
    Sib             1069   844   611   323             2847
    Father           955   664   725   402             2739
    Mother          1071   696   797   468    1        3033
    Spouse of twin  1598   352                         1950
    Total           7528  4745  3604  5942  868  446  19529

    Data from multiple occasions were averaged for each subject.
    Around 1000 subjects were quasi-randomly selected for each sex-age group.

    Because it is March 8, we use data set 3 (personShort_sexcoh3.dat)

  • dorret\2006\Mxpersonality (docu.doc)

    Datafiles for Mx (and other programs; free format):
    personShort_sexcoh1.dat  old males      N=1035  (average yr birth 1943)
    personShort_sexcoh2.dat  young males    N=1071  (1971)
    personShort_sexcoh3.dat  old females    N=1426  (1945)
    personShort_sexcoh4.dat  young females  N=1070  (1973)

    Variables (53 traits), averaged over time (surveys 1-6):
    demographics: trappreg trappext sex1to6 gbdjr twzyg halfsib id_2twns drieli
    personality: neu ext nso tat tas es bs dis sbl jas angs boos bdi
    YASR: ysw ytrg ysom ydep ysoc ydnk yatt ydel yagg yoth yint yext ytot yocd
    other: cfq mem dist blu nam fob blfob scfob agfob hap sat self imp cont chck urg obs com

    Mx jobs:
    Cholesky 13vars.mx: Cholesky decomposition (saturated model)
    Eigen 13vars.mx: eigenvalue decomposition of computed correlation matrix (also saturated model)
    Fa 1 factors.mx: 1 factor model
    Fa 2 factors.mx: 2 factor model
    Fa 3 factors.mx: 3 factor model (constraint on loading)
    Fa 4 factors.mx: 1 general factor, plus 3 trait factors
    Fa 3 factors constraint dorret.mx: alternative constraint to identify the model

  • title cholesky for sex/age groups
    data ng=1 Ni=53                 !8 demographics, 13 scales, 14 yasr, 18 extra
    missing=-1.00                   !personality missing = -1.00
    rectangular file =personShort_sexcoh3.dat
    labels
    trappreg trappext sex1to6 gbdjr twzyg halfsib id_2twns drieli neu ext nso etc.

    Select NEU NSO ANX BDI YDEP TAS ES BS DIS EXT JAS ANGER TAT /
    begin matrices;
    A lower 13 13 free              !common factors
    M full 1 13 free                !means
    end matrices;

    covariance A*A'/
    means M /
    start 1.5 all etc.
    option nd=2
    end

  • NEU NSO ANX BDI YDEP TAS ES BS DIS EXT JAS ANGER TAT /

    MATRIX A: This is a LOWER TRIANGULAR matrix of order 13 by 13

     23.74
      3.55   4.42
      6.89   0.96   5.34
      1.70   0.72   0.80   2.36
      2.79   0.32   0.68  -0.08   2.87
     -0.30   0.03  -0.01   0.16   0.18   7.11
      0.28   0.13   0.17  -0.04   0.24   3.32   6.03
      1.29  -0.08   0.30  -0.15  -0.09   0.96   1.52   6.01
      0.83  -0.07   0.35  -0.30   0.15   1.97   0.91   1.16   5.23
     -4.06  -0.11  -1.41  -0.20  -0.90   2.04   1.07   3.14   0.94  14.06
      1.85  -0.02   0.70  -0.28   0.01   0.47   0.00   0.43  -0.08   1.11   3.98
      1.86  -0.09   0.80  -0.49  -0.18   0.13   0.04   0.21   0.18   0.51   0.97   3.36
     -1.82   0.16  -0.34   0.02  -1.26  -0.16  -0.46  -0.80  -0.53  -1.21  -1.20  -1.64   7.71

  • To interpret the solution, standardize the factor loadings with respect to both the latent and the observed variables. In most models, the latent variables have unit variance; standardize the loadings by the standard deviation of the observed variables (e.g. $\lambda_{21}$ is divided by the SD of P2).

    [Path diagram: latent factors F1 to F5 and observed variables P1 to P5]

  • Group 2 in Cholesky script

    Calculate Standardized Solution
    Calculation
    Matrices = Group 1
    I Iden 13 13
    End Matrices;

    Begin Algebra;
    S=(\sqrt(I.R))~;    ! diagonal matrix of inverse standard deviations (~ is the inverse)
    P=S*A;              ! standardized estimates for factor loadings
    End Algebra;

    End

    (R = A*A', i.e. R has variances on the diagonal)
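The standardization step can be checked by hand on the first three rows of the estimated matrix A from the Cholesky output (NEU, NSO, ANX). The sketch below is a Python re-expression of the same arithmetic, not the Mx job itself; the helper name `standardize_loadings` is illustrative. Because A is lower triangular, the first three rows are enough to obtain the variances of the first three variables.

```python
import math

# First three rows of the estimated lower-triangular matrix A
# (NEU, NSO, ANX), copied from the Mx output; other rows omitted.
A = [[23.74, 0.0, 0.0],
     [3.55, 4.42, 0.0],
     [6.89, 0.96, 5.34]]


def standardize_loadings(a):
    """Divide each row of A by the SD implied by the diagonal of A * A'."""
    out = []
    for row in a:
        sd = math.sqrt(sum(v * v for v in row))  # sqrt of the diagonal element of A * A'
        out.append([v / sd for v in row])
    return out


P = standardize_loadings(A)
for row in P:
    print([round(v, 2) for v in row])
# [1.0, 0.0, 0.0]
# [0.63, 0.78, 0.0]
# [0.79, 0.11, 0.61]
```

These values match the first three rows of the standardized loadings reported in the slides.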

  • Standardized solution: standardized loadings

    NEU NSO ANX BDI YDEP TAS ES BS DIS EXT JAS ANGER TAT /

      1.00
      0.63   0.78
      0.79   0.11   0.61
      0.55   0.23   0.26   0.76
      0.69   0.08   0.17  -0.02   0.70
     -0.04   0.00   0.00   0.02   0.03   0.99
      0.04   0.02   0.02  -0.01   0.04   0.48   0.87
      0.20  -0.01   0.05  -0.02  -0.01   0.15   0.24   0.94
      0.14  -0.01   0.06  -0.05   0.02   0.34   0.15   0.20   0.89
     -0.27  -0.01  -0.09  -0.01  -0.06   0.13   0.07   0.21   0.06   0.92
      0.40   0.00   0.15  -0.06   0.00   0.10   0.00   0.09  -0.02   0.24   0.86
      0.45  -0.02   0.19  -0.12  -0.04   0.03   0.01   0.05   0.04   0.12   0.24   0.82
     -0.22   0.02  -0.04   0.00  -0.15  -0.02  -0.05  -0.09  -0.06  -0.14  -0.14  -0.19   0.91

  • NEU NSO ANX BDI YDEP TAS ES BS DIS EXT JAS ANGER TAT /

    Your model has 104 estimated parameters:
    13 means
    13*14/2 = 91 factor loadings

    -2 times log-likelihood of data >>> 108482.118

  • Eigenvalues, eigenvectors & principal component analyses (PCA)

    1) data reduction technique
    2) form of factor analysis
    3) very useful transformation

  • Principal components analysis (PCA)

    PCA is used to reduce a large set of variables to a smaller number of uncorrelated components.

    It is an orthogonal transformation of a set of variables (x) into a set of uncorrelated variables (y), called principal components, that are linear functions of the x-variates.

    The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
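The claim about the first principal component can be illustrated with power iteration, a simple way to find the leading eigenvector of a correlation matrix. This is a sketch on a hypothetical 3 x 3 correlation matrix; the correlations are made up, and power iteration is used here only for illustration, not because the slides or Mx use it.

```python
import math


def first_component(r, iters=500):
    """Power iteration: leading eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(r)
    v = [1.0 / math.sqrt(n)] * n          # positive starting vector
    for _ in range(iters):
        w = [sum(r[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' R v = variance of the component y = v' x
    eigenvalue = sum(v[i] * sum(r[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v


# Hypothetical correlation matrix for three standardized variables.
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]

value, weights = first_component(R)
share = value / len(R)   # proportion of total variance (trace of R = 3)
print(round(value, 3), [round(w, 3) for w in weights], round(share, 3))
```

For standardized variables the total variance equals the number of variables, so the leading eigenvalue divided by that number gives the share of variability captured by the first component.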

  • Principal component analysis of 13 personality / psychopathology inventories: 3 eigenvalues > 1

    (Dutch adolescent and young adult twins, data 1991-1993; SPSS)

    [Figure: plot of the eigenvalues; y-axis "Eigenvalue", 0 to 4]

  • Principal components analysis (PCA)

    PCA gives a transformation of the correlation matrix R and is a completely determinate model.

    $R = P D P'$ (R is q x q), where
    P = q x q orthogonal matrix of eigenvectors
    D = diagonal matrix (containing eigenvalues)

    $y = P'x$ and the variance
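The eigenvalue decomposition of a correlation matrix can be sketched with the classical Jacobi rotation method, which needs only the standard library. The correlation matrix is hypothetical, and Jacobi rotations are a stand-in chosen for illustration; the Mx job in the slides uses its own eigenvalue routines.

```python
import math


def jacobi_eigen(a, sweeps=50, tol=1e-22):
    """Eigen decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, P) with the eigenvectors in the columns of P.
    """
    n = len(a)
    d = [row[:] for row in a]                        # working copy, driven to diagonal
    p = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(d[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for r in range(n - 1):
            for c in range(r + 1, n):
                if abs(d[r][c]) < 1e-15:
                    continue
                diff = d[c][c] - d[r][r]
                # Angle that zeroes d[r][c]: tan(2 * theta) = 2 * d[r][c] / diff
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * d[r][c] / diff)
                cs, sn = math.cos(theta), math.sin(theta)
                for k in range(n):                   # D <- D * J (columns r and c)
                    dkr, dkc = d[k][r], d[k][c]
                    d[k][r] = cs * dkr - sn * dkc
                    d[k][c] = sn * dkr + cs * dkc
                for k in range(n):                   # D <- J' * D (rows r and c)
                    drk, dck = d[r][k], d[c][k]
                    d[r][k] = cs * drk - sn * dck
                    d[c][k] = sn * drk + cs * dck
                for k in range(n):                   # accumulate P <- P * J
                    pkr, pkc = p[k][r], p[k][c]
                    p[k][r] = cs * pkr - sn * pkc
                    p[k][c] = sn * pkr + cs * pkc
    return [d[i][i] for i in range(n)], p


# Hypothetical correlation matrix (q = 3).
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]

values, P = jacobi_eigen(R)
q = len(R)

# Rebuild R = P D P' to confirm the transformation is exact (completely determinate).
rebuilt = [[sum(P[i][k] * values[k] * P[j][k] for k in range(q)) for j in range(q)]
           for i in range(q)]
print([round(v, 3) for v in sorted(values, reverse=True)])
```

Because R is a correlation matrix, the eigenvalues sum to q (the trace of R), so each eigenvalue divided by q is the proportion of variance carried by the corresponding component.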
