Multilevel Factor Analytic Models

Embed Size (px)

Citation preview

  • 8/13/2019 Multilevel Factor Analytic Models

    1/21

  • 8/13/2019 Multilevel Factor Analytic Models

    2/21

    238 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    are used as proxies of these constructs; for instance a single adverse event, say mortality, can

    be used to represent patient safety. However, univariate analyses fall short in addressing the

    multi-dimensional facets of most healthcare measures (Diya et al., 2011); hence multivariate

    responses can be used. The relationship between a multivariate response vector and covariates

    of interest can then be established by using a multivariate analysis. A special kind of multivariate

    analysis which is gaining ground in the field of health outcomes and nursing research is factoranalysis (FA) (Lake, 2002; Bruyneelet al., 2009; Gajewskiet al., 2010). FA is used to quantify

    the relationship between a multivariate response vector and a latent construct (or its domains),

    i.e. in an FA model the multivariate response vector is regressed on latent variables and covari-

    ates. Examples of the use of FA can be found in nursing research (Lake, 2002), bioinformatics

    (Hochreiteret al., 2006), psychology (Norris and Lecavalier, 2010) and educational research

    (Naicker, 2010).

    In this paper the healthcare measure of interest is patient safety. Patient safety is defined as

    the absence of the potential for, or occurrence of, health care associated injuries to patients that occursby avoiding medical errors as well as by taking actions to prevent errors from causing injury

    (Agency for Healthcare Research and Quality, 2004). Since patient safety is a latent construct,

    nurse-reported adverse events obtained from the Belgian chapter of the Europe Nurses Forecast-

    ing Survey will be used as its proxy. Adverse events are injuries caused by medical management

    rather than by the underlying disease conditions (Van den Heede, 2008). In the survey several

    adverse events are recorded from nurses (nested within nursing units within hospitals), i.e. not

    only do we have multivariate responses but also the data are multilevel in structure. Neglecting

    the multilevel structure of the data in the analysis can be problematic since the fundamental

    independence assumption underlying many commonly used statistical techniques, including FA

    techniques, is violated (Goldstein and McDonald, 1988). When the degree of dependence is sub-

    stantial, violation of the independence assumption biases parameter estimates (and standarderrors) and affects the power of statistical tests (Bliese and Hanges, 2004; Kenny and Judd,

    1986). Using wrongly single-level FA may lead to the atomistic fallacy, which is incorrectly

    assuming that the relationship between variables observed at the observational level also holds

    at the cluster level. Aggregating the data at the cluster level does lead to independent observa-

    tions but at the risk of committing the ecological fallacy, which is incorrectly assuming that the

    relationship between variables at the cluster level also holds at the observational level (Robinson,

    1950). So, to minimize the chances of these fallacies and to attain valid inferences for multilevel

    data, models which take the multilevel structure of the data into account should be considered

    (Longford and Muthen, 1992; Ansari and Jedidi, 2000; Goldstein and Browne, 2005; Grilli and

    Rampichini, 2007). Another consequence of neglecting the multilevel structure of the data inFA is that an invalid factor structure can be proposed and the interpretation of the inferences

    in FA is level dependent.

    It might be useful to check that there is a multilevel structure in the data before embarking

    on the more complex multilevel FA modelling. Univariate techniques such as the intraclass

    correlation coefficient ICC or the F-discrepancy measure that was proposed by Yan and

    Sedransky (2007), in a Bayesian context, can be used to verify the multilevel structure quickly.

    ICC is the ratio of the variation between the clusters (level 2 units) and the total variation of

    the observed or latent (for categorical observed responses) response. ICC-values range from

    0 to 1 with larger values indicating greater variability between clusters compared with within-

    cluster variability. As a rule of thumb, when a response has ICC>0:10 then we can argue that

    there is enough level 2 variability to warrant a multilevel analysis (Dyer et al., 2005; Gajewski

    et al., 2010). The F-discrepancy measure that was proposed by Yan and Sedransky (2007)

  • 8/13/2019 Multilevel Factor Analytic Models

    3/21

    Multilevel Factor Analytic Models 239

    uses analysis-of-variance (ANOVA) concepts in a Bayesian context to diagnose the need to

    account for a hierarchical structure. This tool may be more interesting in multilevel data sets

    like the Europe Nurse Forecasting Survey data with more than two levels of analysis. However,

    in the nurse survey data interest is on a multivariate response; hence the F-discrepancy mea-

    sure neglects the dependence between responses. We propose here to extend this measure to the

    multivariate setting by making use of multivariate analysis-of-variance (MANOVA) concepts.The objective of this paper is to fit a multilevel FA model to establish the relationship between

    patient safety and nurse-reported adverse events from the Belgian nurse survey data, and to

    develop Bayesian diagnostic techniques for detecting hierarchical structures in multivariate

    data sets.

    The remainder of this paper is organized as follows. Section 2 describes the Europe Nurse

    Forecasting Survey data set, with particular emphasis on the Belgian chapter. A brief review

    of FA is presented in Section 3 and in Section 4 model checking is presented. In Section 5 the

    application of the Bayesian diagnostic techniques on the Belgian nurse survey data is presented.

    The paper ends with a discussion in Section 6. Details pertaining to a simulation study to com-

    pare the MANOVA and the ANOVA discrepancy measures, and details of the modelling fittingfor FA models (MPLUS and JAGS codes) are provided in the on-line supplemental document

    and available from

    http://www.blackwellpublishing.com/rss

    2. The Europe Nurse Forecasting Survey data set

    The Europe Nurse Forecasting project aims at introducing innovative forecasting methods by

    addressing not only volumes but also quality of staff and its effect on patient care. The project

    involves researchers from 12 European countries, the USA, China, South Africa and Botswana.

    Data collection was only done in 12 European countries, namely Belgium, Finland, Germany,Greece, Ireland, Poland, Spain, Sweden, Switzerland, the Netherlands, the UK and Norway.

    Most of the data were collected in 2009 and 2010.

    In the project two multicountry surveys were conducted, namely the nurse survey and patient

    satisfaction survey. Hospital discharge data were also collected. The various data sets were

    targeted at investigating how nursing qualifications, demographics, workload, wellbeing and

    practice environment can affect productivity, patient safety and patient outcomes.

    This research focuses on the Belgian chapter of the nurse survey data. It contains information

    about the nature of a nurses job (e.g. nursephysician relationships), quality and safety issues,

    the nurses most recent shift and demographics. The nurse survey seeks to establish the perception

    of nurses with regard to their work conditions and their relationship with either patients or otherhealthcare professionals. In particular the focus is on relating nurse-reported adverse events to

    the characteristics of the nurse, nursing unit and hospital.

    The main question that we looked at here was How often would you say each of the fol-

    lowing incidents occurs involving you or your patients?. For this question six incidents were

    considered:

    (a) patients received the wrong medication, time or dose,

    (b) pressure ulcers after admission,

    (c) patient falls with injury,

    (d) urinary tract infections,(e) bloodstream infections and

    (f) pneumonia.

  • 8/13/2019 Multilevel Factor Analytic Models

    4/21

    240 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    The main question (for each incident type) had seven potential answers measured on a

    Likert scale, i.e. 0, never, 1, a few times a year or less, 2, once a month or less, 3, a few

    times a month, 4, once a week, 5, a few times a week, and 6, every day. In this paper new

    responses were generated by discretizing the Likert-scaled variables because of software con-

    siderations. (It is currently not feasible to implement the proposed latent variable or data

    augmentation approach for categorical responses with more than two categories in JAGS.)A nurse was judged to have indicated that his or her nursing unit had a substantial num-

    ber of events if he or she selected the values greater than 2. This led to binary responses,

    namely

    (a) wrong medication, time or dose (wrong),

    (b) pressure ulcers (pressure),

    (c) patient falls with injury (falls),

    (d) urinary tract infections (uti),

    (e) bloodstream infections (infections) and

    (f) pneumonia.

    The data that are considered here are multilevel in structure, i.e. nurses are nested within

    nursing units which are in turn nested within hospitals. The data consisted of 3151 nurses

    from 269 nursing units in 66 hospitals. 23 and 43 hospitals respectively were and were not

    technologically advanced and four and 62 hospitals respectively were and were not univer-

    sity hospitals. Of the 269 nursing units, 112 were surgical units, 125 were medical units and

    32 were mixed surgical and medical units. Of the nurses who responded to the nurse survey,

    1333 worked in surgical units, 1471 worked in the medical units and 347 worked in mixed

    surgicalmedical units. 2167 of the nurses reported Flemish as their language of communica-

    tion whereas 984 reported French. The covariates of interest in this paper were measured at

    the nurse level, nursing unit level and hospital level. At the hospital level the covariates werethe number of beds (beds), whether the hospital is technologically advanced (tech = 1; other-

    wise tech = 0) and whether the hospital was a university hospital (university = 1; otherwise

    university = 0). At the nursing unit level the covariate of interest was the unit specification (1,

    surgical, 2, medical, and 3, mixed). At the nurse level the covariates considered were gender

    (male = 1 if gender is male; 0 otherwise), age, time in years as a registered nurse (expn), time

    in years at the current hospital (exph), language (language = 1 if Flemish; 0 otherwise) and

    whether a nurse was satisfied with his or her career choice (satisfied = 1 if a nurse was satisfied;

    0 otherwise).

    In this paper we shall focus on the nurse and the nursing unit level. The hospital level (third

    level) is dropped because empirical evidence did not support its inclusion, i.e. there was lowvariability at the hospital level; see Section 5.2.2. Thus the role of hospitals is explored in a

    non-hierarchical way.

    3. Factor analysis models

    3.1. Single-level factor analysis

    We first consider a single-level FA model for continuous responses. Let Yj= .Yj1, . . . ,Yjp/be a

    p-dimensional continuous response vector forthejth individual (j=1 , . . . ,N),Fj= .Fj1, . . . ,Fjk/

    be a k-dimensional (kp) vector of latent constructs and be apkmatrix of factor loadings;

    then the single-level FA model can be written as

    Yj=j+Fj+j, .1/

  • 8/13/2019 Multilevel Factor Analytic Models

    5/21

    Multilevel Factor Analytic Models 241

    where j= .j1, . . . ,jp/ is the mean vector, the common factors are multivariate normally

    distributed with mean 0 and covariance matrix .FjNk.0,// and the unique factors are

    multivariate normally distributed with mean 0 and covariance matrix (jNp.0,/). Note

    that the mean of thesth (s=1 , . . . ,p) responseYjs can be modelled as a function of covariates,

    i.e. js=Xjs. Factor loadings (elements of) are parameters which relate the unobserved

    factors to the observed responses, i.e., the larger the value of a factor loading, the more thecorresponding response is said to load on the corresponding factor. Factor loadings are allowed

    to vary across survey items but not across individuals. Assuming orthogonality of all common

    and unique factors, the marginal covariance matrix is given by var.Yj/=U=+, i.e.

    the observed covariances matrix can be decomposed into a component that is attributable to

    the underlying factors and measurement error.

    Two special cases of FA are

    (a) ifFjis known then the FA model can be viewed as a multivariate regression model with

    the factor loadings playing the role of regression coefficients and

    (b) if the factor loadings are known then the FA model is equivalent to a two-level random-

    slope model.

    There are two modes of FA, namely exploratory FA (EFA) and confirmatory FA (CFA).

    EFA is used when there is no a priori information about the factor structure, i.e. choosing

    the number of factors, k, and establishing the variables that are related to a given factor

    domain. In that case all factor loadings are free parameters. In both the frequentist and

    the Bayesian contexts, EFA can be done by fitting several models with varying numbers of

    factors and using goodness-of-fit indices (discrepancy measures) to select the best model (see

    Section 4.2). In a Bayesian context, Lopes and West (2004) advocated the use of the reversible

    jump Markov chain Monte Carlo (MCMC) approach. The ultimate objective of EFA is to

    come up with the best factor structure for a given data set. In contrast, CFA is used tovalidate a priori hypothesized factor structures. In the literature, the most common CFA

    structure is the simple factor structure, i.e. the factor loadings are allowed to load on one

    and only one latent construct and all the other factor loadings (cross-loadings) are set equal to

    0.

    The above FA model caters for continuous responses assumed to be multivariate normally

    distributed. For categorical or binary responses, we could assume that they are discretized

    versions of latent Gaussian variables, i.e. consider the binary response vector given by Yj; then

    Yjs = I.Yjs>0/ where Y

    js, s= 1 , . . . ,p, is a latent Gaussian response. For identifiability, the

    variance of the error term ("js) for the sth response, ss, is set to 1 such that all the diago-

    nal elements of are 1. The matrix is called the polychoric or tetrachoric correlation matrixas it is the correlation matrix of the underlying latent Gaussian responses.

    Additional constraints are required for both categorical and continuous response FA models

    to be identifiable. For instance, for the single-level FA in equation (1) there arep.p+1/=2 unique

    elements in but as many aspk+k.k+1/=2+p.p+1/=2 elements in+, implying that

    there are more parameters to estimate than availableinformation. In this paper we shall constrain

    the measurement error for different responses to be independent (off-diagonal elements of

    equal to 0) and the common factors to be orthogonal. In CFA, an additional constraint that is

    employed is to adopt a simple factor structure, i.e. a variable loads on only one factor, whereas

    in an EFA this constraint is relaxed.

    FA models can be fitted by using maximum likelihood estimation or (robust) weighted least

    squares estimation; see Longford (1993). From equation (1), the total marginal likelihood of

    the data is

  • 8/13/2019 Multilevel Factor Analytic Models

    6/21

    242 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    L.,,,|Y/=

    .2/pN=2|U|

    N=2 exp

    1

    2

    Nj=1

    .Yj/1U .Yj/

    f.Fj/dFj, .2/

    where=X. For binary or categorical data the (latent) tetrachoric or polychoric correlation

    matrix can be used. Computational strategies for obtaining maximum likelihood estimates have

    been discussed in Mardia et al. (1979) and Lawley and Maxwell (1971). Another estimationtechnique which can be used is weighted least squares.

    Alternatively, one could estimate the parameters of FA models by using the Bayesian

    approach. Estimation is done via sampling using MCMC techniques. In the Bayesian approach

    the likelihood is combined with prior information to obtain the posterior distribution of the

    parameters of interest, i.e.

    p.,,,|Y/L.,,,|Y/p.,,,/, .3/

    where p.,,,/ is the prior distribution. For an FA on binary responses, we used in this

    paper the latent variable technique called the data augmentation approach (Albert and Chib,

    1993; Azevedo etal., 2011), where a latent Gaussianresponse vector is assumed that is discretized

    to yield the binary responses. The complete marginal data likelihood function for the binary

    single-level FA model is given by

    L.,,,|Y/

    Nj=1

    ps=1

    .Yjs;js; 1/.Y

    js|Yjs/f.Fj/f.Y

    /dFjdY, .4/

    where

    .Yjs|Yjs/= I.Yjs>0/I.Yjs=1/+ I.Yjs0/

    I.Yjs=0/

    is the conditional likelihood..Yjs;js; 1/ represents the value of the normal probability density,

    with meanjs and standard deviation SD= 1, at the point Yjs with

    js being the conditional

    mean, i.e. js = js + sFj, where s is the sth row of. Note that the distribution of the

    multivariate response vector can now be presented as a product of univariate distributions

    because responses are conditional independent (conditional on the common factor).

    The EFA models in this paper used the frequentist approach and were implemented by using

    MPLUS (Muthen and Muthen, 2011). A frequentist approach was necessary since methods

    for Bayesian covariance selection are scarce and computationally intense. CFA models can be

    easily fitted by using both the frequentist and the Bayesian approaches (using link functions) in

    MPLUS. However, these CFA models are restricted in term of flexibility. Hence, the CFA modelsin this paper were fitted by using the Bayesian approach, i.e. we used the data augmentation

    approach called the latent variable approach in JAGS (Plummer, 2003). In particular, JAGS

    was run from R (R Development Core Team, 2012) by using the R2jags (Su and Yajima, 2012)

    package. The R packageCODAwas used to assess MCMC convergence (Plummeret al., 2006).

    In this paper we restricted the convergence diagnosis to the trace plots and the Gelman, Brooks

    and Rubin diagnostic R. The quick mixing of MCMC chains and values ofR close to 1 (say

    R

  • 8/13/2019 Multilevel Factor Analytic Models

    7/21

    Multilevel Factor Analytic Models 243

    in a CFA the cross-loadings can be given priors concentrated around 0 instead of fixing them

    to 0.

    The data augmentation approach has interesting features because it replaces the binary

    response with a latent continuous response; therefore statistical techniques which are used for

    continuous responses can be applied to the latent continuous response. In addition, it allows

    quite general link functions by replacing the classical link functions (e.g. normal distributions(probit link)) by quite general forms of distributions (e.g. skewed normal distributions).

    3.2. Multilevel factor analysis

    When the data have a multilevel structure, multilevel FA approaches are needed. This implies

    that the multivariate continuous response vector for an individual has attributes emanating

    from the observational and the cluster levels. Let Yij= .Yij1, . . . ,Yijp/ be a p-dimensional

    continuous response vector for the jth individual (j= 1 , . . . ,ni) in cluster i (i= 1 , . . . ,m),

    F.1/ij = .F

    .1/ij1 , . . . ,F

    .1/ijk1

    /be ak1-dimensional (k1p) vector of latent constructs at level 1,F.2/i =

    .F.2/

    i1, . . . ,F

    .2/

    ik2/ be a k

    2-dimensional (k

    2p) vector of latent constructs at level 2,.1/ be a pk

    1matrix of factor loadings at level 1 and.2/ be apk2matrix of factor loadings at level 2; then

    the multilevel FA model is given by

    Yij=ij+.1/F.1/ij +

    .1/ij +

    .2/F.2/i +

    .2/i , .5/

    where ij is the mean vector. The vector of unique factors or measurement errors at the

    observational level is multivariate normally distributed with mean 0 and variance .1/ (.1/ij

    Np.0,.1//), the vector of unique factors or measurement errors at the cluster level is multi-

    variate normally distributed with mean 0 and variance .2/ (.2/i Np.0,

    .2//), the vector of

    common factors at the observational level is multivariate normally distributed with mean 0

    and variance .1/

    (F

    .1/

    ij Nk1.0,.1/

    /) and the vector of common factors at the cluster levelis multivariate normally distributed with mean 0 and variance .2/(F

    .2/i Nk1.0,

    .2//).

    Assuming orthogonality of all common and unique factors, the marginal covariance matrix

    consists of a within and between component, var.Yij/=M=W +B. The within-covariance

    matrix is given by W =.1/

    .1/

    .1/+.1/ and the between-covariance matrix is given by

    B =.2/

    .2/

    .2/+.2/. Note that both the within and the between covariances have a com-

    ponent coming from the common factors (factor scores) and unique factors (measurement

    errors or random effects). Multilevel FA can be estimated by using the frequentist approach

    but this requires multi-dimensional integration to obtain a marginal likelihood which will then

    be optimized to obtain parameters of interest. The Bayesian approach will bypass the multi-

    dimensionalintegration by usingsampling-based algorithms(Ansari and Jedidi, 2000; Goldsteinand Browne, 2005).

    4. Model checking

    After the model has been fitted it is necessary to monitor the quality of the statistical model. This

    implies that we are interested in detecting whether there are any systematic differences between

    the model and the observed data. In a frequentist context this entails checking whether certain

    functions of the data, test statistics, follow the distribution that is implied by the model. In a

    Bayesian context, posterior predictive checks (PPCs) are used. PPCs are obtained by generating

    replicated data sets from the posterior predictive distribution of the fitted model (Gelmanet al.,

    2004). These data sets are then compared with the observed data set with respect to the feature of

    interest by using discrepancy measures. We first discuss how to check for the multilevel structure

  • 8/13/2019 Multilevel Factor Analytic Models

    8/21

    244 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    in Section 4.1, followed by model assessment (e.g. normality of both the common and the unique

    factors, and the orthogonality of the common and unique factors) in Section 4.2.

    4.1. Detecting the multilevel structure

    The ICC-values can be used to assess informally whether there is substantial dependence to

    warrant the use of multilevel models. In contrast, Yan and Sedransky (2007) proposed a Bayesian

    approach for checking a multilevel structure in the data by making use of anF-statistic inspired

    by classical ANOVA concepts. When there are two or more responses the approach of Yan and

    Sedransky (2007) neglects to take the dependence between responses into account, thus leading

    to a reduction in power to detect hierarchical structures. We propose the use of diagnostic checks

    based on MANOVA.

    To test for the independence assumption (no hierarchical structure) the relative sizes of the

    between (or total) and within sums of squares and cross-products are considered. In a frequentist

    context, the following test statistics are considered:

    (a) the LawleyHotelling trace,LH = tr.BW1/;

    (b) Pillais trace,PT = tr.BT1/;

    (c) Wilks , WL =|W|=|T|.

    Here the total, between and within sums of squares and cross-products are given by T=

    mi=1

    nij=1 .Yij

    Y/.YijY/ ,B=mi=1ni.Yi

    Y/.YiY/ andW=mi=1

    nij=1.Yij

    Yi/.Yij

    Yi/ respectively.

    Assuming multivariate normality of the responses, all test statistics are asymptotically equiv-

    alent. When comparing two clusters all the test statistics reduce to a statistic which follows a

    Hotelling T2 sampling distribution. Wilks WL is the most frequently used MANOVA test

    statistic as it is related to the likelihood ratio criterion and its exact sampling distribution

    can be derived for some special cases; for example see chapter 6 of Johnson and Wichern(2002).

    In this paper we propose to use these test statistics as discrepancy measures in a Bayesian

    context. The posterior predictiveP- (PPP-) values for the three multivariate discrepancies are

    computed by taking the proportion of times that the predicted discrepancy exceeds the observed

    discrepancy. If only the data structure is misspecified and not other model assumptions then

    our approach does indeed diagnose the hierarchical structure. In cases where other model

    assumptions do not hold then the discrepancy measures proposed might indicate a statistically

    important difference even when the between-cluster variability is low. In such cases our

    approach acts as a global goodness-of-fit check. Caution must be exercised when using PPCs

    or PPP-values since the data are used twice, i.e. in fitting the model and in model checking,hence leading to some degree of bias (Bayarri and Castellanos, 2004). This can be avoided by

    using a learning and a validation data set, where the learning data set is used to estimate the

    model and a validation data set is used to assess the model fit. We employed this approach in

    this paper.

    The approach proposed can be applied only to continuous multivariate responses. It cannot

    be directly applied to categorical multivariate responses, i.e. when using link functions, but it

    can be applied to the latent continuous response vector when the data augmentation approach

    is used.

    4.2. Model assessment

    After having checked the multilevel structure in the data, we can either select the single-level

  • 8/13/2019 Multilevel Factor Analytic Models

    9/21

    Multilevel Factor Analytic Models 245

    or multilevel FA depending on the evidence that is presented. The next step entails model

    assessment, for instance checking the normality assumption of both the common and the unique

    factors, and validating the orthogonality assumption. In this paper we shall adhere to global

    model checks using PPCs in a Bayesian context and variants of the 2 likelihood ratio statistic

    in the frequentist context.

    Recall that the general goal of an FA model is to test the hypothesis that the observed covari-ance matrix is equal to the model-based covariance matrix, i.e.

    S=./, .6/

    where S represents the covariance matrix of the observed response vector (a latent continu-

    ous response vector for categorical data) and ./ represents the model-based covariance

    matrix.

    In a frequentist context, a multitude of goodness-of-fit indices have been proposed. In this

    paper we shall look at three indices which are functions of the 2 likelihood ratio statistic, i.e.

    the2 likelihood ratio statistic, the comparative fit index CFI (Bentler, 1990) and the Tucker

    Lewis index TLI (Tucker and Lewis, 1973) and two indices which are based on how well agiven model approximates the true model, i.e. the root mean square of approximation RMSEA

    (Browne and Cudeck, 1992; Steiger and Lind, 1980) and the standardized root-mean-square

    residual SRMSR. These indices check whether the model-based covariance matrix is close to

    the covariance matrix of the observed response vector. If the model fits the data well then

    (a) the2 likelihood ratio statistic should be small,

    (b) both CFI and TLI should have a value 1 and

    (c) RMSEA and SRMSR should be close to 0.

    As a rule of thumb, a model with CFI- and TLI-values of at least 0.90 are deemed acceptable

    and values of at least 0.95 are indicative of a good fit. Alternatively, we could use RMSEA andSRMSR for which values less than 0.08 indicate an acceptable model fit and values less than

    0.05 are indicative of a good fit (Hu and Bentler, 1999). Multiple fit indices are used because

    each fit index has its own limitations and there is no agreed-on method for evaluating whether

    the lack of fit of a model is substantively important. These indices are used for both single and

    multilevel FA models.

    In a Bayesian context, model assessment can focus on the likelihood and the priors. PPCs

    are used to check how well the model fits the data; however, the discrepancy measures are now

    different from those used in diagnosing the neglected hierarchical structure. From these PPCs,

    PPP-values can be computed. In this paper the2 discrepancy measure (Gelman et al., 1996)

    was used. The deviance information criterion DIC (Spiegelhalter et al., 2002) can be used tocompare models on the basis of different likelihood and prior specifications. The model having

    the smallest DIC-value is then a good candidate; see Spiegelhalter et al.(2002). However, this

    measure can be misleading as latent variables, e.g. factor scores and randomeffects, are treated as

    parameters. This has led to numerous discussions in the literature with no conclusive suggestions

    (Celeux et al., 2006); hence we shall use the mean-square prediction error MSPE instead. MSPE

    measures how well the model fits the data and is given by

    MSPE.Y/=E

    mi=1

    n1j=1

    .Yij Yij/2

    ,

    where Yij

    is the posterior predicted response for nurse jin clusteri. Note that all model assess-

    ments are carried out on a validation data set (independent of the learning data set that is used

    to fit the model).

  • 8/13/2019 Multilevel Factor Analytic Models

    10/21

    246 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    5. Application to Europe Nurse Forecasting Survey data

    In this section, we want to establish the relationship between the latent construct patient safety

    and six nurse-reported adverse events in the Belgian chapter of the Europe Nurse Forecasting

    Survey. Hereby we wish to control for nurse, nursing unit and hospital characteristics, i.e. nurses

    are seen as informants of not only their own patient safety record but also informants of patient

    safety culture in their nursing unit. The response of interest is a multivariate vector of nurse-

    reported adverse events, i.e.Yij=.Yij1, . . . ,Yij6/whereYij1 denotes wrong medication, or dosage

    or time,Yij2 denotes pressure ulcers,Yij3patient falls,Yij4 urinary tract infections,Yij5blood-

    stream infections andYij6 pneumonia.

    Since no a-priori-specified factor structure is proposed for the adverse events in this paper

    and patient safety, there is a need to explore the factor structure. So, to avoid double usage of

    the data, the data were equally divided into the learning and validation data sets by using simple

    random sampling (within a cluster). The factor structure was explored by using the learning

    data set and the validation data set was used in confirming the factor structure. The learning

    data set was then reused in the model assessment for the CFA model.

    EFA was implemented by using a frequentist approach by making use of MPLUS since

    the Bayesian approach is computationally intense. The Bayesian module of MPLUS is not

    sufficiently flexible to allow for the computations of the discrepancy measures that are proposed

    in this paper; hence CFA models were fitted in JAGS by usingR2jags.

    5.1. Descriptive statistics

    The descriptive statistics in this section pertain to the entire (learning plus validation) Belgian

    nurse survey data. The learning and validation data sets had similar characteristics (as they are

    random samples from the entire data set).

    Table 1 shows the percentage of nurse-reported adverse events, the number of missingresponses and ICC. From Table 1, there are a few missing values for all the responses. The most

    common nurse-reported adverse event is urinary tract infection, followed by wrong medication,

    or time or dose and the least reported is patient fall accidents. The ICC-values are all greater

    than 0.1, indicating the possible need for a multilevel FA model.

    The Spearman correlation coefficient matrix of the percentages in a nursing unit of nurse-

    reported adverse events shows that there were moderate-to-high correlations between the

    nurse-reported adverse events. The largest correlation was between bloodstream infections and

    pneumonia and the smallest correlation was between wrong medication, or time or dose and

    pneumonia:

    Table 1. Belgian nurse survey data:percentage of nurse-reportedadverseeventsand ICC

    Type of adverse event %adverse event Number missing ICC

    Wrong medication, dose or time 20.55 70 0.11Pressure ulcers 11.60 98 0.13Falls 9.08 89 0.19Urinary tract infection 21.54 110 0.21Infection 9.66 160 0.18Pneumonia 11.40 151 0.27

    Calculated for the non-missing responses.

  • 8/13/2019 Multilevel Factor Analytic Models

    11/21

    Multilevel Factor Analytic Models 247

    University hospitals were more likely to have frail patients compared with the non-university

    hospitals. Thus this might lead to a higher incidence of nurse-reported adverse events. University

    hospitals had higher numbers of nurse-reported adverse events than non-university hospitals

    except for wrong medication, or dose or time and patient fall accidents.

    5.2. Factor analysis

    In this section we present the results from FA. All EFA models (single level and multilevel) were

    implemented on the learning data set by using a frequentist approach in MPLUS. The use of

    a frequentist approach here was necessitated by the fact that methods for Bayesian covariance

    selection approaches are scarce in the literature, no software is available and these methods are

    computationally intense. Since we are assuming latent Gaussian responses and the responses are

    binary a probit link was assumed and parameter estimates were obtained by using the maximum

    likelihood approach.

    The Bayesian approach was used to fit all CFA models (single level and multilevel). In

    particular the latent variable approach was used, which enabled the easy computation ofMANOVA discrepancy measures which are instrumental in detecting hierarchical structures

    in the data.

    5.2.1. Single-level factor analysis

    5.2.1.1. Single-level exploratory factor analysis. Here we explored the factor structure, i.e.

    the number of factors and the outcomes on which the chosen factors load, by using EFA in

    MPLUS. It is customary in EFA that one starts by aiming for a simple factor structure and

    on doing a CFA modify the structure accordingly. Fig. 1(a) shows a scree plot of the single-

    level EFA. The scree plot indicates that two factors were adequate to account for much of thevariability in the six responses.

    The two-factor solution was further supported by the results shown in Table 2, which gives

    measures of fit for various factor structures. The measures of fit are all within acceptable limits;

    see Section 4.2. Models with more than three factors were not identifiable as the number of

    parameters to estimate exceeded the available information. We also looked at the factor loadings

    to assess their classical statistical significance. On the basis of these results and on Fig. 1(a)

    and Table 2 we concluded that a two-factor solution was a reasonably good single-level factor

    structure for the Belgian nurse survey data.

    Recall that, in Section 2, the responses were generated by discretizing the Likert-scaled vari-

    ables by using 2 as a cut-off point. We carried out a sensitivity analysis by choosing other cut-off

    points, i.e. 1, 3 and 4, and redoing the EFA. The proposed factor structures for all these cut-off

    points were similar to the factor structure that is presented here. We also explored the factor

  • 8/13/2019 Multilevel Factor Analytic Models

    12/21

    248 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    1 2 3 4 5 6

    0.0

    0.5

    1.0

    1.5

    2.0

    2.5

    3.0

    3.5

    Number of factors

    Variance

    (Eigenvalue)

    1 2 3 4 5 6

    0.0

    0.5

    1.0

    1.5

    2.0

    2.5

    3.0

    3.5

    Number of factors

    Variance

    (Eigenvalue)

    (a)

    (b)

    Fig. 1. Belgian nurse survey datascree plots for (a) the single-level and (b) the multilevel EFA: ,within; , between

  • 8/13/2019 Multilevel Factor Analytic Models

    13/21

    Multilevel Factor Analytic Models 249

    Table 2. Belgian nurse survey data: measures of fit for single-level EFA

    Number of 2 p-value CFI TLI RMSEA SRMRfactors

    1

  • 8/13/2019 Multilevel Factor Analytic Models

    14/21

    250 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    Table 3. Belgian nurse survey data:PPP-values for discrepancy measuresto diagnose hierarchical structures

    Discrepancy PPP-value

    LH 0.03PT 0.03WL 0.97

    F1 0.14F2 0.58F3 0.12F4 0.12F5 0.66F6 0.29

    include in the mean structure of the multilevel FA model. A covariate was selected for inclusion

    in the FA model if it was significant at the 5% level for at least two adverse events. From this

    exercise the following variables were considered:

    (a) age,

    (b) gender,

    (c) satisfied,

    (d) unit type (medical or mixed, with surgical being the default),

    (e) university hospital and

    (f) beds.

    For computational reasons the continuous covariates were centred and standardized (SD=1).

    5.2.2.1. Multilevel exploratory factor analysis. Using MPLUS a frequentist two-level EFA

    was fitted to the learning data set. Here, the observational level consists of nurses who are

    nested in the nursing unit (cluster level). Fig. 1(b) shows the scree plot for the multilevel factor

    analysis. The scree plot indicates that two factors are needed at each level. The within-factor

    structure remained the same as that suggested by the single-level EFA but, for the between-

    factor structure, the factor indicators wrong medication, or time or dose and pressure ulcers

    loaded on the first between-nursing unit factor whereas the remainder of the factor indicators

    loaded on the second between-nursing unit factor, as shown in Fig. 2.

    The two-factor solution was further supported by the measures of fit for the two-factor solu-tion at both levels. The 2 goodness-of-fitp-value was 0.5887, RMSEA (and the 95% confidence

    interval) is given by 0.000 (0.000; 0.026), both CFI and TFI had a value of 1 and the within

    and between SRMR are 0.018 and 0.057 respectively. Hence all the goodness-of-fit indices were

    within acceptable limits. These goodness-of-fit statistics were better than for any other factor

    solution.

    5.2.2.2. Multilevel confirmatory factor analysis. To validate the factor structure pro-

    posed (from multilevel EFA) we fitted a two-level (nurse and nursing unit levels) CFA model,

    controlling for nurse, nursing unit and hospital characteristics, by using the data augmentation

    approach in JAGS. We assumed similar priors to those for the single-level FA in Section 5.2.1.

    With regard to the variance components for the random effects we used parameter expansion,

  • 8/13/2019 Multilevel Factor Analytic Models

    15/21

    Multilevel Factor Analytic Models 251

    Fig. 2. Belgian nurse survey data: multilevel factor structure

    i.e. we had a gamma distribution with scale and shape parameters equal to 0.5 as the prior for

    the precision and a normal distribution with mean 0 and variance 106 for the parameter ex-

    pansion term. The variances of the random effects were obtained through dividing the squared

    parameter expansion term by the precision.

    We considered three chains, each having 300000 MCMC iterations (with a burn-in of 50000

    iterations). We thinned the chains by selecting one in 10 iterations. Hence, values in Table 4 are

    based on 25000 sampled values. (This is so for all the models in this section, including models

    used for sensitivity analysis.) The Gelman, Brooks and Rubins diagnostics for all parameters

    were less than 1.1 and the MCMC chains mixed well, hence indicating no convergence problems.

    Table 4 shows the posterior means (with SDs in parentheses) and the 95% credible intervals forthe factor loadings and the variance components of the second-level unique factors (random

    effects).

    Older nurses reported fewer adverse events than younger nurses, except for wrong medication,

    or time or dose. Male nurses reported more incidences of wrong medication, or time or dose

    than their female counterparts. Nurses who were satisfied with their career choice reported

    fewer patient fall accidents and incidences of wrong medication, or time or dose. Flemish-

    speaking nurses reported more incidences of urinary tract infections, bloodstream infections

    and pneumonia than French-speaking nurses. There were no statistically important differences

    between the reported adverse events in the surgical, medical and mixed units.

    The factor loadings that are presented here were standardized and can be interpreted as

    the correlation between the factor domain and the factor indicators (responses), i.e. the factor

    loadings when the total variability (per response) is fixed to 1. All the posterior means of

  • 8/13/2019 Multilevel Factor Analytic Models

    16/21

    252 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    Table 4. Belgian nurse survey data: results of the multilevel CFA model

    Parameter Posterior 95% crediblemean (SD) interval

    Within-level factor loadings.1/1,1

    0.407 (0.053) (0.300; 0.508)

    .1/1,2

    0.947 (0.025) (0.889; 0.986)

    .1/1,3 0.557 (0.054) (0.449; 0.659)

    .1/2,4

    0.828 (0.035) (0.756; 0.892)

    .1/2,5 0.853 (0.037) (0.777; 0.921)

    .1/2,6

    0.804 (0.040) (0.722; 0.876)

    Between-level factor loadings

    .2/1,1

    0.182 (0.098) (0.014; 0.375)

    .2/1,2 0.224 (0.100) (0.023; 0.403).2/2,3

    0.405 (0.102) (0.183; 0.574)

    .2/2,4

    0.226 (0.072) (0.082; 0.363)

    .2/2,5

    0.323 (0.067) (0.190; 0.450)

    .2/2,6

    0.381 (0.090) (0.201; 0.546)

    Residual variances

    21

    0.117 (0.063) (0.015; 0.256)

    22 1.117 (2.035) (0.012; 8.066)

    23 0.226 (0.191) (0.008; 0.715)

    24

    1.652 (1.059) (0.453; 4.531)

    25 0.199 (0.226) (0.008; 0.824)26 0.919 (0.724) (0.045; 2.786)

    Fit indices

    2 0.349 (0.477) (0.000; 1.000)MSPE 17206.897 (262.432) (17030.075; 17724.202)

    .v/k,s is thekth standardized factor loading for variable s at level v.

    2s is

    the variance component for variable s. s: 1 denotes the wrong medication,or time or dose; 2 denotes pressure ulcers; 3 denotes patient falls; 4 denotesurinary tract infection; 5 denotes bloodstream infection; 6 denotes pneumoniaresponse.

    the first-level factor loadings were greater than 0.4, implying that there was moderate-to-high

    correlation between the factor domains and the factor indicators. The nursing unit level factor

    loadings were between 0.18 and 0.41. Though these factor loadings were low to moderate they

    were still statistically important. The 2 goodness-of-fit Bayesianp-value was 0.50, suggesting

    that there was no major discrepancy between the data and the model. The model MSPE was

    17206.897.

    A sensitivity analysis was done by

    (a) altering the common factor distribution to have a Studentt-distribution with 4 degrees

    of freedom,

    (b) altering the common factor distribution to have at-distribution with unspecified degrees

    of freedom and

  • 8/13/2019 Multilevel Factor Analytic Models

    17/21

    Multilevel Factor Analytic Models 253

    (c) fixing the first factor loading instead of the factor variance to 1.

    The three models resulting from these alterations had larger MSPE-values than the initial

    multilevel FA model, i.e. the MSPE-values were 17209.677, 17207.970 and 17207.180. We also

    assessed whether there was substantial variability at the hospital level after accounting for the

    nursing unit level. All the MANOVA discrepancy measures had values that were close to 0.5,hence indicating that there was not that much dependence between nurses in a hospital after

    accounting for the nursing unit. This was used as evidence for not extending the FA model to

    three levels (hospital level, nursing unit level and nurse level). Hence the effect of hospital was

    explored in only a non-hierarchical way.

    The within-factor structure was the same as that attained under the single-level FA. The

    nurse-reported adverse events wrong medication, or time or dose, pressure ulcers and patient

    fall accidents loaded on one factor whereas the remaining nurse-reported adverse events loaded

    on another factor. This implied that nurses who reported that they encountered many incidences

    of wrong medication, or time or dose were also more likely to report high numbers of pressure

    ulcers and patient fall accidents (adhering to protocol and safety guidelines) whereas nurses

    who reported that they encountered many incidences of urinary tract infections were also more

    likely to have reported many incidences of bloodstream infections and pneumonia (hygiene or

    hand hygiene). These three adverse events can also be considered as indicators of adherence to

    safety guidelines, for instance, safe handling of perfusion lines (which if not done properly can

    be a source of infection). However, these three adverse events are not as directly related to the

    nursing care process as the adverse events which load on the first factor. We should note that the

    nurses here were informants of not only their own experience but also that of the nursing unit

    at large. The between-factor structure was reflective of what happens at the nursing unit level.

    Wrong medication, or time or dose and pressure ulcers loaded on the first nursing unit factor,

    and the remainder of the adverse events loaded on the second nursing unit factor. Clearly the

    first factor is an indicator of whether the nursing unit administers medication correctly, in time

    and is attentive to the physical need of immobile patients. The second factor is a mixture of,

    for example, whether the right equipment is accorded to the patient and the general hygiene

    condition of the nursing unit.

    6. Discussion

    In this paper we have illustrated how to address complex research questions by using FA. FA

    models are used to relate the latent constructs to the responses of interest, e.g. adverse events.

    These models are becoming popular in many fields, especially in health outcomes and nursingresearch where they are used to inform policy decisions. The data that are collected in health

    outcomes and nursing research are often huge and for this reason, coupled with limitations

    in available software, health outcome and nurse researchers tend to fit single-level FA models

    based on simplistic assumptions, e.g. the independence assumption. However, most data in this

    area of research are multilevel in structure; hence neglecting the data structure in the statistical

    analysis can lead to invalid inferences which might lead to wrong policy recommendations.

    Before embarking on complex multilevel FA modelling which is computationally intense there

    should be an assessment of whether there is substantial dependence between subjects in the

    same cluster. If the dependence between subjects in the same cluster is low the inference that is

    obtained from the single-level FA is more likely to be valid, thus leading to simple conclusions,

    i.e. we aim for parsimony, not only in the covariates that are included in these models but also in

    the level of analysis. However, if there is substantial between-cluster variability, inferences from

  • 8/13/2019 Multilevel Factor Analytic Models

    18/21

    254 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    the single-level FA will be invalid because the level of analysis has a bearing on the interpretation

    of the factor structure and the inferences coming from such a structure.

    We proposed the use of the MANOVA discrepancy measures to address formally whether

    there is substantial dependence (between-cluster variability) in our multivariate multilevel struc-

    tured data set. These measures are handy as they have more power than the ANOVA discrepancy

    measures and intraclass correlation coefficients to diagnose the hierarchical structures. It is cru-cial that the factor structure is correctly specified when diagnosing the hierarchical structure

    as misspecification of the factor structure implies that the discrepancy measure proposed will

    then become sensitive not only to the hierarchical structure but also to the misspecified fac-

    tor structure. Though it can be argued that all potential models, i.e. single-level and multilevel

    models, can be fitted and then compared, for large data sets that are encountered in health

    outcomes, nursing and epidemiological research (e.g. healthcare registers or healthcare admin-

    istrative databases) this entails a huge computational burden. Instead a single-level FA model

    can be fitted and from this model MANOVA discrepancies used to assess whether there is a need

    to do multilevel FA. The MANOVA discrepancy measures that are presented in this paper are

    not limited to FA models only but can also be used when other types of multivariate model areunder consideration. The MANOVA concepts can also be used in a frequentist context where

    MANOVA test statistics can be used. A potential area for further research is the development

    of diagnostic measures to check omitted non-hierarchical multilevel structures, for instance in

    multiple-membership and cross-classified models.

    One can easily implement simple FA models, using either the frequentist approach or the

    Bayesian approach, in MPLUS. However, with regard to the Bayesian approach, the MPLUS

    Bayesian module is quite limited, i.e. MPLUS cannot fit multilevel FA models with more than

    two levels and it is not feasible to implement the latent variable approach or to obtain dis-

    crepancy measures. In this paper the EFA employed the frequentist approach as the Bayesian

    approach (Lopes and West, 2004) is computationally intensive. From a practical point of viewthe EFA is easy to implement in a frequentist context compared with the Bayesian context. Even

    researchers with limited methodological comprehension can implement EFA in a frequentist

    context. The Bayesian approach was adopted in fitting CFA models because it offers flexibil-

    ity. It is fairly easy to deviate from common assumptions when using the Bayesian approach

    compared with the frequentist approach; for example cross-loadings can be given priors con-

    centrated around 0 instead of fixing them to 0. The Bayesian approach in this paper, the latent

    variable approach, offers additional advantages over using link functions (e.g. logit and probit

    links) in that distributional assumptions can be relaxed; for example a truncated skewed normal

    distribution can be used and various discrepancy measures for continuous responses can be

    used for discrete data.The PPP-values from the 2 discrepancy measures for both the single-level and the multilevel

    FA suggested that the two models were good fits for the data at hand. It is clear that this

    discrepancy measure failed to capture the misspecification of the data structure. Relying on

    global goodness-of-fit measures without looking at specific aspects of the model can lead to

    the wrong inference with regard to the compatibility of the model to the data. Thus, various

    measures should be considered. PPCs use the data twice, i.e. in fitting the model and in checking

    model adequacy, hence leading to some degree of bias. To eliminate this bias cross-validation

    PPCs (Marshall andSpiegelhalter, 2003)canbe used; however, they are computationally intense.

    When a model is correctly specified, the PPP-values should have a uniform distribution. This

    characteristic of the PPP-values has been discussed elsewhere (Hjortet al., 2006; Steinbakk and

    Storvik, 2009). Hjort et al. (2006) proposed recalibrating the PPCs, resulting in recalibrated

    PPP-values which have a uniform distribution under the correct model specification. However,

  • 8/13/2019 Multilevel Factor Analytic Models

    19/21

    Multilevel Factor Analytic Models 255

    the PPP-values in this paper are not recalibrated as computing the recalibrated PPP-values is

    extremely computer intensive, requiring a double-simulation approach. So the data were split

    into a learning data set and a validation data set, where the model fitting is done by using the

    learning data set and model assessment by using the validation data set. Another potential

    approach which is less computationally demanding is to compute the calibrated PPP-values by

    using integrated nested Laplace approximations (Rue et al., 2009). This is an area which theauthors are currently exploring.

    The factor structure for the single-level FA and the nurse level of the multilevel FA is in line

    with expectations. Indeed, medication errors, pressure ulcers and patient fall accidents form

    a logical group of adverse events. Belgian hospitals traditionally focus on these three adverse

    events when setting up actions to improve patient safety in which nurses are involved. After all,

    actions to prevent these adverse events are closely related to the nursing process (for example,

    turning patients to prevent pressure ulcers, most medication is administered by nurses and

    patient fall accidents can be prevented if patients are monitored and wear the correct type of

    shoes). Urinary tract infections, bloodstream infections and pneumonia are all infections that

    are potentially related to nursing care (for example, urinary tract infections can be prevented bypatients not having in-dwelling catheters for longer than needed). These can also be explained

    by the severity of illness of the patients as well as by medical interventions (e.g. sterile insertion

    central venous catheters). At the nursing unit level the variable patient fall accidents is now

    grouped with the infections. The first two adverse events clearly reflect the nursing process at

    the nursing unit level. The second nursing unit factor consists of patient fall accidents and

    infections. This factor might be indicative of collective efforts taken at the nursing unit to

    prevent injury (falls and infections) to patients, i.e. making sure that the environment that a

    patient is exposed to is free from conditions which might increase the risk of infections or

    patient fall accidents (e.g. clearing spillages from the floor or dropped food, clearing medical

    equipment from nursing units and picking up dropped linen). However, the authors advocatefurther research to comprehend the grouping of factor indicators at the nursing unit level and

    also to explore whether this structure is unique to Belgium or whether it can be established

    in other European countries. As a sensitivity analysis the responses were discretized by using

    different cut-off points. The results that are reported here were robust to changes in the cut-off

    points.

    Acknowledgements

    The research leading to these results has received funding from the European Unions seventh

    framework programme (FP7/2007-2013) under grant agreement 223468. The first and last

    authors were partially supported by the Interuniversity Attraction Poles program P6/03, Belgian

    State Federal Office for Scientific, Technical and Cultural Affairs.

    References

    Agency for Healthcare Research and Quality (2004) AHRQs patient safety initiative: building foundations,reducing risk.Interim Report 04-Rg005.Agency for Healthcare Research and Quality, Rockville.

    Albert, J. H. and Chib, S. (1993) Bayesian analysis binary and polychotomous response data. J. Am. Statist. Ass.,88, 669679.

    Ansari, A. and Jedidi, K. (2000) Bayesian factor analysis for multilevel binary observations. Psychometrika, 65,475496.

    Azevedo, C. L., Bolfarine, H. and Andrade, D. F. (2011) Bayesian inference for a skew-normal IRT model under

    the centred parameterization.Computnl Statist. Data Anal., 55, 353365.Bartholomew, D., Knott, M. and Moustaki, I. (2011) Latent Variable Models and Factor Analysis: a Unified

    Approach, 3rd edn, sect. 2.10, pp. 3037. Chichester: Wiley.

  • 8/13/2019 Multilevel Factor Analytic Models

    20/21

    256 L. Diya, B. Li, K. Van den Heede, W. Sermeus and E. Lesaffre

    Bayarri, M. J. and Castellanos, M. E. (2004) Bayesian checking of the second levels of hierarchical models. Statist.Sci., 22, 322343.

    Bentler, P. M. (1990) Comparative fit indices in structural models.Psychol. Bull., 107, 238246.Bliese, P. D. and Hanges, P. J. (2004) Being both too liberal and too conservative: the perils of treating grouped

    data as though they were independent. Organiznl Res. Meth., 7, 400417.Browne, M. W. and Cudeck, R. (1992) Alternative ways of assessing model fit. Sociol. Meth. Res., 21, 230258.Bruyneel, L., van Den Heede, K., Diya, L., Aiken, L. andSermeus, W. (2009) Predictive validity of theinternational

    hospital outcomes study questionnaire: an RN4CAST pilot study. J. Nursng Schol., 41, 202210.Celeux, G., Forbes, F., Robert, C. and Titterington, D. (2006) Deviance information criteria for missing data

    models.Am. Statistn, 1, 651674.Diya, L., Van den Heede, K., Sermeus, W. and Lesaffre, E. (2011) The relationship between in-hospital mortality,

    readmission into the intensive care nursing unit and/or operatingtheater and nurse staffing levels. J. Adv. Nursng,68, 10731081.

    Dyer, N. G., Hanges, P. J. and Hall, R. J. (2005) Applying multilevel confirmatory factor analysis techniques tothe study of leadership.Lead. Q., 16, 149167.

    Gajewski, B. J., Boyle, D. K., Miller, P. A., Oberhelman, F. and Dunton, N. (2010) A multilevel confirmatoryfactor analysis of the practice environment scale: a case study. Nursng Res., 59, 147153.

    Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. (2004)Bayesian Data Analysis. London: Chapman and Hall.Gelman, A., Meng, X. and Stern, H. (1996) Posterior predictive assessment of model fitness via realized discrep-

    ancies.Statist. Sin., 6, 733807.

    Goldstein, H. and Browne, W. J. (2005) Multilevel factor analysis models for continuous and discrete data.In Contemporary Psychometrics: a Festschrift for Roderick P. McDonald (eds A. Maydeu-Olivares and J. J.McArdle), pp. 453475. London: Erlbaum.

    Goldstein, H. and McDonald, R. P. (1988) A general model for the analysis of multilevel data. Psychometrika,53, 455467.

    Grilli, L. and Rampichini, C. (2007) Multilevel factor analysis for ordinal variables. Struct. Equn Modlng, 14,125.

    Hjort, N. L., Dahl, F. A. and Steinbakk, G. H. (2006) Post-processing posterior predictive p values.J. Am. Statist.Ass., 101, 11571174.

    Hochreiter, S., Clevert, D.-A. and Obermayer, K. (2006) A new summarization method for affymetrix probe leveldata.Bioinformatics, 22, 943949.

    Hu, L. and Bentler, P. (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteriaversus new alternatives.Struct. Equn Modlng, 6, 155.

    Johnson, R. A. and Wichern, D. W. (2002)Applied Multivariate Statistical Analysis. Upper Saddle River: Pearson

    Education International.Kenny, D. A. and Judd, C. M. (1986) Consequences of violating the independence assumption in analysis of

    variance. Psychol. Bull., 99, 422431.Lake, E. T. (2002) Development of the practice environment scale of the nursing work index. Res. Nursng Hlth,25, 176188.

    Lawley, D. N. and Maxwell, A. E. (1971)Factor Analysis as a Statistical Method, 2nd edn. London: Butterworth.Longford, N. T. (1993)Random Coefficient Models. New York: Oxford University Press.Longford, N. T. and Muthen, B. O. (1992) Bayesian posterior predictive checks for complex models. Psychomet-

    rika, 57, 581597.Lopes, H. F. and West, M. (2004) Bayesian model assessment in factor analysis. Statist. Sin., 14, 4167.Mardia, K. V., Bibby, J. M. and Kent, J. T. (1979) Multivariate Analysis. New York: Academic Press.Marshall, E. C. and Spiegelhalter, D. J. (2003) Approximate cross-validatory predictive checks in disease mapping

    models.Statist. Modllng, 22, 16491660.

    Muthen, L. K. and Muthen, B. O. (2011)Mplus Users Guide, 6th edn. Los Angeles: Muthen and Muthen.Naicker, V. (2010) Educators pedagogy influencing the effective use of computers for teaching purposes in class-rooms: lessons learned from secondary schools in South Africa. Educ. Res. Rev., 5, 674689.

    Norris, M. and Lecavalier, L. (2010) Evaluating the use of exploratory factor analysis in developmental disabilitypsychological research.J. Autsm Develpmntl Disordrs, 40, 820.

    Plummer, M. (2003) JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In Proc.3rd Int. Wrkshp Distributed Statistical Computing, Vienna, Mar. 20th22nd (eds K. Hornik, F. Leisch andA. Zeileis).

    Plummer, M., Best, N., Cowles, K. and Vines, K. (2006) Coda: convergence diagnosis and output analysis formcmc.R News, 6, 711.

    R Development Core Team (2012)R: a Language and Environment for Statistical Computing. Vienna: R Founda-tion for Statistical Computing.

    Robinson, W. S. (1950) Ecological correlations and the behavior of individuals. Am. Sociol. Rev., 15, 351357.Rue, H., Martino, S. and Chopin, N. (2009) Approximate Bayesian inference for latent Gaussian models by using

    integrated nested Laplace approximations (with discussion). J. R. Statist. Soc.B, 71, 319392.Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and van der Linde, A. (2002) Bayesian measures of model complexity

    and fit (with discussion).J. R. Statist. Soc.B, 64, 583639.

  • 8/13/2019 Multilevel Factor Analytic Models

    21/21

    Multilevel Factor Analytic Models 257

    Steiger, J. H. and Lind, J. C. (1980) Statistically based tests for the number of common factors. A. Spring Meet.Psychometric Society, Iowa City.

    Steinbakk, G. H. and Storvik, G. O. (2009) Posterior predictive p-values in Bayesian hierarchical models.Scand.J. Statist., 36, 320336.

    Su, Y.-S. and Yajima, M. (2012) R2jags: a package for running JAGS from R.R Package Version 0.03-06.Tucker, C. and Lewis, C. (1973) A reliability coefficient for maximum likelihood factor analysis. Psychometrika,38, 110.

    Van den Heede, K. (2008) Nurse-staffing levels and patient safety in acute hospitals: analyzing administrativedata at the nursing unit level. PhD Thesis. Centre for Health Services and Nursing Research, Department ofPublic Health, Katholieke Universiteit Leuven.

    Yan, G. and Sedransky, J. (2007) Bayesian diagnostic techniques for detecting hierarchical structure.Baysn Anal.,2, 735760.

    Supporting informationAdditional supporting information may be found in the on-line version of this article:

    Supplementary Material: Multilevel factor analytic models for assessing the relationship between nurse reportedadverse events and patient safety.