
Supplemental Materials

Empirical Bayes MCMC Estimation for Modeling Treatment Processes, Mechanisms of Change, and Clinical Outcomes in Small Samples

by Timothy J. Ozechowski, 2014, Journal of Consulting and Clinical Psychology

http://dx.doi.org/10.1037/a0035889

S1. MCMC Posterior Simulation

Markov Chain Monte Carlo (MCMC) simulation generates a chain of sequentially

dependent draws from known densities that resemble the posterior but from which it is simpler

to draw samples. In one popular MCMC approach, known as Gibbs sampling, complex

joint posterior distributions are decomposed into their constituent conditional distributions.

Values for each parameter are sampled from the corresponding posterior conditioned on the

values of the other parameters. An alternative MCMC sampling algorithm, known as Metropolis-

Hastings, draws samples from a known density referred to as the proposal distribution, p(θ),

which has mass over the same range as the posterior and from which it is relatively

straightforward to sample. A given draw from p(θ) is admitted to the target distribution t(θ) (i.e.,

the posterior) based on an acceptance/rejection rule. Specifically, at a given point s in the

MCMC sequence, a draw θ* from the proposal distribution is compared with the most recently

admitted draw θ_{s−1} in terms of the density of t(θ) at each value. If the density at θ* is greater

than the density at θ_{s−1}, then θ* is admitted to t(θ) and becomes θ_s. Otherwise, θ* is admitted

only with probability equal to the ratio of the two densities (assuming a symmetric proposal

distribution); if θ* is rejected, θ_{s−1} is replicated in t(θ) and becomes θ_s. Both the Gibbs and Metropolis-

Hastings algorithms should be allowed to iterate until stationarity is achieved in the Markov

chain, which is a necessary condition for the simulated posterior to converge to the true

posterior.
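
The Metropolis-Hastings acceptance/rejection step can be illustrated with a minimal sketch. It is written in Python purely for illustration (the analyses in the article were run in SAS PROC MCMC), and it makes several assumptions that do not come from the article: the target t(θ) is a standard normal density, the proposal is a symmetric normal random walk centered on the current draw, and the names log_target and metropolis are hypothetical.

```python
import math
import random


def log_target(theta):
    """Log density (up to a constant) of a hypothetical target t(theta).

    A standard normal is assumed here only so the result is easy to check.
    """
    return -0.5 * theta * theta


def metropolis(n_draws, step_sd=1.0, start=0.0, seed=1):
    """Random-walk Metropolis sampler with a symmetric normal proposal."""
    rng = random.Random(seed)
    chain = []
    current = start
    for _ in range(n_draws):
        # Draw the candidate theta* from the proposal distribution.
        proposal = current + rng.gauss(0.0, step_sd)
        log_ratio = log_target(proposal) - log_target(current)
        # Admit theta* with probability min(1, t(theta*) / t(theta_{s-1})).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current = proposal
        # If theta* was rejected, the previous draw is replicated.
        chain.append(current)
    return chain


if __name__ == "__main__":
    draws = metropolis(20000)[2000:]  # discard early draws as burn-in
    print("posterior mean estimate:", sum(draws) / len(draws))
```

Because the proposal is symmetric, the acceptance probability reduces to the ratio of target densities; an asymmetric proposal would require the additional Hastings correction term.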

S2. Demographic and Clinical Characteristics of the Sample

Participants in the original clinical trial were 120 adolescents and their families residing

in greater Albuquerque, New Mexico. The adolescent sample was 80% male and 20% female.

The majority of adolescents in the sample identified themselves as being of either Hispanic






(46.7%) or Anglo (38.3%) ethnic origin. The remaining 15% identified as either Native

American or having a mixed ethnic background. The mean age of the adolescent sample was

15.6 years (SD = 1.0). Participants were referred to the research clinic to receive outpatient

adolescent substance abuse treatment. Adolescents meeting official diagnostic criteria for

substance abuse or dependence were eligible to participate in the study. The study focused on

illicit substance abuse. As such, adolescents whose primary substance of abuse was alcohol or

tobacco were excluded from participation.

Marijuana was the primary substance of abuse among the adolescents in the parent

clinical trial. On average, adolescents reported using marijuana on 56.8% (SD = 31.8) of the past

90 days prior to treatment intake. Use of any substance other than tobacco was reported on an

average of 60% (SD = 31.0) of the past 90 days prior to treatment intake. In addition to substance

abuse, adolescent participants exhibited substantial comorbid emotional and behavioral

problems. Specifically, 70.8% of adolescents were rated at or above the borderline clinical

threshold for delinquent behavior problems based on parent reports on the Child Behavior

Checklist (CBCL; Achenbach, 1991). Moreover, 30% of the adolescent sample scored at or

above the threshold for clinical depression on the Beck Depression Inventory (BDI; Beck &

Steer, 1987) based on adolescent-specific guidelines for the BDI specified by Roberts,

Lewinsohn, and Seeley (1991).

S3. Coder Training

All training activities were conducted by the author of this article. The initial components

of the training process focused on didactic study of the principles and practices of Functional

Family Therapy (FFT) as well as of the Functional Family Therapy Coding and Rating Scale






(FFT CARS) coding manual. The coders practiced using the FFT CARS to code written therapy

transcripts initially and then were trained to code digitized video files of FFT sessions.

Videotapes of FFT sessions were converted to digital MPEG files using a digital video data

acquisition and management system. Coders used a specially designed Windows-based graphical

user interface to access the MPEG files, which were stored within a relational database on a local

server. The coders viewed the digital video files on a computer workstation monitor and entered

codes in real time for each discernible therapist intervention using a numerical keypad. Time-in

and time-out codes were automatically associated with each coded entry. At the end of each

observed FFT session, the collection of video segments and associated codes was stored as an

ASCII text file that could be readily imported into any commonly used statistical software

package for analytic purposes. Each week during the training period, the coders independently

observed and coded a video-recorded FFT session from an archive of recordings designated for

training purposes. Each week, the trainer computed rates of agreement between the two coders

on each of the FFT CARS intervention and context codes. The training continued until the coders

were able to exhibit 70% agreement on all codes for four consecutive weeks. The training

process lasted approximately four months.

S4. Positive Definite Posterior Covariance Matrices and the Cholesky Factorization

In general, a matrix of numerical elements is said to be positive definite if it is square,

symmetrical, and all its eigenvalues are greater than zero (Leon, 2009). Briefly, an eigenvalue is

the variance associated with a principal component, which is a weighted combination of

observed variables capturing a unique portion of the overall variance—similar to a factor in

factor analysis (Wothke, 1993). A non-positive eigenvalue is a “red flag” signaling that the

corresponding weighted combination of variables has zero or negative variance, which often is attributable

to high degrees of collinearity or redundancy between variables. Non-positive definite

covariance matrices are problematic mathematically: a matrix with a zero eigenvalue is singular and

cannot be inverted because its determinant is zero, and a negative eigenvalue implies an

inadmissible negative variance. In the ML estimation setting, a non-positive definite input covariance matrix,

therefore, precludes performing the matrix algebra required to optimize the likelihood function

and obtain model parameter estimates. Likewise, if the estimated or model-implied covariance

matrix in an SEM analysis is non-positive definite, it cannot be shown that the observed data are

plausible given the model (i.e., the statistical model cannot be shown to fit the data). From a

Bayesian perspective, non-positive definite posterior covariance matrices raise suspicions

regarding the validity of the model as a plausible representation of processes by which the data

were produced.

If a given SEM covariance matrix M is positive definite, then the Cholesky factorization

may be obtained such that M = L∙Lᵀ, where L is a lower triangular matrix with positive diagonal

elements and Lᵀ is the transpose of L (i.e., a re-expression of L with the columns and rows reversed). If the Cholesky

factorization of M can be computed, then M is positive definite. If the Cholesky factorization of

M cannot be computed, then M is not positive definite.

In cases where the Cholesky factorization of M cannot be obtained, the source of the non-

positive definiteness may be isolated by decomposing M into smaller submatrices (e.g., all 2 × 2

submatrices comprising M) and implementing the Cholesky factorization on each submatrix.

Submatrices for which the Cholesky factorization fails would be indicative of parameters in the

SEM that may be misspecified. Careful respecification of such parameters would most likely

rectify the non-positive definiteness in M.
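
The Cholesky-based check and the 2 × 2 submatrix screen can be sketched in a few lines. The sketch is in Python for illustration only and is not the SAS PROC FCMP code used in the article; the function names cholesky and failing_pairs and the example matrix M are hypothetical, and the screen is assumed to run over the 2 × 2 principal submatrices (pairs of variables).

```python
import math


def cholesky(m):
    """Return lower-triangular L with m = L * L^T, or None if m is not
    positive definite (a non-positive pivot is encountered)."""
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = m[i][i] - s
                if d <= 0.0:
                    return None  # factorization fails
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (m[i][j] - s) / L[j][j]
    return L


def failing_pairs(m):
    """List index pairs (i, j) whose 2 x 2 principal submatrix is not
    positive definite."""
    bad = []
    n = len(m)
    for i in range(n):
        for j in range(i + 1, n):
            sub = [[m[i][i], m[i][j]], [m[j][i], m[j][j]]]
            if cholesky(sub) is None:
                bad.append((i, j))
    return bad


# Hypothetical covariance matrix: the covariance between variables 0 and 1
# (2.5) is larger than their variances (1.0 and 4.0) allow.
M = [[1.0, 2.5, 0.0],
     [2.5, 4.0, 0.5],
     [0.0, 0.5, 2.0]]

if __name__ == "__main__":
    print("positive definite:", cholesky(M) is not None)
    print("suspect variable pairs:", failing_pairs(M))
```

A pair flagged by failing_pairs points to the variances and covariance whose specification would be examined first.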

S5. Obtaining R̂ Using the SAS NLMIXED Procedure






Although PROC MCMC does not compute the R̂ index, R̂ may be computed using the

NLMIXED procedure within the SAS/STAT software package. First, the posterior samples from

each MCMC chain must be output to a unitary SAS data set in which all chains are “stacked”

vertically. Next, a random-intercept-only model may be fit to this stacked data set using the

NLMIXED procedure, with “chain” being the clustering, or Level 2, unit and the posterior draws

within each chain for a given parameter constituting the Level 1 observations. The NLMIXED

procedure automatically computes W and B as parameters of the random-intercept-only model.

The value of R̂ then may be computed by entering the computational formula for R̂ into the

ESTIMATE statement in NLMIXED. For a large number of parameters, the execution of this

procedure may be automated by embedding the NLMIXED code within a SAS macro program

(see Figure S3).
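
For readers who want to see the arithmetic behind R̂, the following sketch computes W, B, and R̂ directly from stacked chains. It is in Python and is not the NLMIXED approach described above. It assumes the standard Gelman-Rubin definitions given by Gelman et al. (2004), in which B is n times the variance of the chain means, so its B may be scaled differently from the between-chain variance estimated by a random-intercept model. The function name gelman_rubin is hypothetical.

```python
import math


def gelman_rubin(chains):
    """Return (W, B, R_hat) for a list of equal-length chains.

    W = mean of the within-chain variances
    B = n * variance of the chain means (n = draws per chain)
    R_hat = sqrt(((n - 1) / n * W + B / n) / W)
    """
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    within = [sum((x - mu) ** 2 for x in c) / (n - 1)
              for c, mu in zip(chains, means)]
    W = sum(within) / m
    B = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    var_plus = (n - 1) / n * W + B / n
    return W, B, math.sqrt(var_plus / W)


if __name__ == "__main__":
    # Two short hypothetical chains for a single parameter.
    W, B, r_hat = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]])
    print("W =", W, "B =", B, "R-hat =", r_hat)
```

Values of R̂ close to 1.0 indicate that the between-chain variability is small relative to the within-chain variability.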

S6. Assessment of Model Fit Based on the Posterior Predictive Distribution

Typically, comparisons between the observed data and predicted values sampled from the

PPD are based on test statistics, or scalar summaries computed from samples of observations

(see Gelman et al., 2004, p. 162; Gelman & Meng, 1996, p. 197). For continuous normally

distributed observed variables, the most efficient summary test statistics are the sample mean and

standard deviation. The median, mode, minimum, and maximum values may be utilized as test

statistics as well. In a Bayesian assessment of model fit, a given test statistic based on the

observed sample data, T(yobs), may be compared to a set of corresponding test statistics computed

from R simulated samples or replications drawn from the PPD, T(yr), where r = 1, . . ., R. The

most straightforward way of comparing T(yobs) and T(yr) is to plot a histogram of the T(yr) values

and pinpoint the location of T(yobs) on this histogram. If the model under investigation exhibits a

good fit to the data from a Bayesian point of view, then the value of T(yobs) would be expected






to fall near the center of the histogram of T(yr) values, suggesting that the observed data are

highly plausible given the Bayesian posterior parameter estimates. Alternatively, one may

compute the proportion of T(yr) values that are equal to or greater than T(yobs), that is, Pr [T(yr) ≥

T(yobs)]. This proportion, known as the posterior predictive p value (ppp), expresses the

probability that the predicted values derived from the Bayesian estimates of the model

parameters are more extreme than the observed data. For a good-fitting model ppp would be

expected to equal approximately 0.5, again indicating that T(yobs) falls in the center of the

distribution of predicted test statistics T(yr) and that the observed sample data are highly

plausible given the Bayesian estimates of the model parameters (see Muthén & Asparouhov,

2012).
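
The posterior predictive p value is simple to compute once the replicated test statistics are in hand. The following Python sketch is illustrative only: the function name posterior_predictive_p and the replicated samples are hypothetical, and the sample mean is assumed as the test statistic T.

```python
import statistics


def posterior_predictive_p(t_rep, t_obs):
    """Proportion of replicated statistics T(y_r) that are >= T(y_obs)."""
    return sum(1 for t in t_rep if t >= t_obs) / len(t_rep)


if __name__ == "__main__":
    # Hypothetical observed sample and R = 4 replicated samples from the PPD.
    y_obs = [2.0, 3.0, 4.0]
    y_rep = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    t_obs = statistics.mean(y_obs)
    t_rep = [statistics.mean(y) for y in y_rep]
    print("ppp =", posterior_predictive_p(t_rep, t_obs))
```

In practice R would be in the hundreds, and the same function can be reused with any other scalar test statistic (e.g., the standard deviation or the maximum).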

A Robust Assessment of Model Fit Given a Small Sample

When sample sizes are prohibitively small, any given sample statistic T(yobs) may be

biased, which in turn may lead to distorted or incorrect assessments of model fit based on

comparisons between T(yr) and T(yobs). Therefore, rather than comparing T(yr) with a single point

estimate T(yobs) calculated from the sample data, in the current demonstration analysis, values of

T(yr) were gauged against the sampling distribution of T(yobs), which was obtained using

bootstrap resampling of the sample data (Efron & Tibshirani, 1993). Briefly, bootstrap

resampling entails randomly selecting observations with replacement from a given sample of size

n until B bootstrap samples of size n have been drawn using only the observations in the original

sample y1, …, yn. A given test statistic T(yb) may be computed for each of the B bootstrap

samples (b = 1, …, B). The collection of bootstrap test statistics simulates the sampling

distribution of T(yobs), with the expected value estimated as

$$\hat{\mu}_B = \frac{1}{B} \sum_{b=1}^{B} T(y_b)$$

and variance estimated as

$$\hat{\sigma}_B^2 = \frac{1}{B-1} \sum_{b=1}^{B} \left[ T(y_b) - \hat{\mu}_B \right]^2.$$

The square root of the estimated variance of the bootstrap

sampling distribution, σ̂B, is a robust estimate of the standard error of T(yobs), which quantifies the

sampling variability associated with T(yobs). With regard to Bayesian assessment of model fit, a

robust estimate of the standard error of T(yobs) permits the specification of a (frequentist)

confidence interval within which values of T(yr) may be regarded as being consistent with

T(yobs) and beyond which values of T(yr) are likely to be discordant with T(yobs), that is, biased

due to model misspecification or lack of fit. Comparing values of T(yr) against a bootstrap

“minimal bias” confidence interval for T(yobs) rather than a single point estimate robustifies the

assessment of model fit against bias and imprecision inherent in T(yobs) due to the small sample

size.

To specify a confidence interval defining a region of “minimal-bias” surrounding T(yobs),

a useful convention set forth by Schafer and colleagues holds that bias in a statistical estimate

becomes appreciable when it exceeds 50% of one standard error of the estimate (Collins,

Schafer, & Kam, 2001; Schafer & Kang, 2008). Using this guideline, a confidence interval

within which values of T(yr) may be regarded as minimally biased with regard to the observed

data may be specified as

$$I:\ \hat{\mu}_B - 0.5\,\hat{\sigma}_B \;\le\; T(y_r) \;\le\; \hat{\mu}_B + 0.5\,\hat{\sigma}_B. \tag{S1}$$

The expression in Equation S1 states that values of T(yr) between –0.5 and +0.5 estimated

standard errors from the estimated mean of the sampling distribution of T(yobs) are contained in

the minimal bias interval. Furthermore, a coverage probability P̂c for I may be estimated as the

proportion of values of T(yr) that are contained within I. A good-fitting model from a Bayesian

perspective would be expected to produce estimates of P̂c close to 1.0 for each variable in a






given statistical model, indicating that nearly all of the posterior predicted values are contained

in the minimal bias interval I. Values of P̂c substantially lower than 1.0 (i.e., less than 0.90)

would indicate that a sizable proportion of posterior predicted values are not contained in I,

suggesting the model does not fit the data well for a given variable.
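
The bootstrap, minimal-bias interval, and coverage steps can be sketched as follows. The sketch is in Python for illustration and is not the SAS code used in the article; the function names, the seed, and the example data are hypothetical, and the sample mean is assumed as the test statistic.

```python
import random
import statistics


def bootstrap_statistics(y, n_boot, stat=statistics.mean, seed=1):
    """Compute T(y_b) for n_boot samples of size n drawn with replacement."""
    rng = random.Random(seed)
    n = len(y)
    return [stat(rng.choices(y, k=n)) for _ in range(n_boot)]


def minimal_bias_interval(boot_stats, width=0.5):
    """Interval mu_B +/- width * sigma_B (Equation S1 uses width = 0.5)."""
    mu_b = statistics.mean(boot_stats)
    sigma_b = statistics.stdev(boot_stats)  # divisor is B - 1
    return mu_b - width * sigma_b, mu_b + width * sigma_b


def coverage_probability(t_rep, interval):
    """Proportion of replicated statistics T(y_r) that fall inside I."""
    lower, upper = interval
    return sum(1 for t in t_rep if lower <= t <= upper) / len(t_rep)


if __name__ == "__main__":
    # Hypothetical raw data and hypothetical PPD means, for illustration.
    y = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0]
    boot = bootstrap_statistics(y, n_boot=500)
    interval = minimal_bias_interval(boot)
    t_rep = [4.9, 5.0, 5.1, 5.6]
    print("I =", interval)
    print("coverage =", coverage_probability(t_rep, interval))
```

In an actual analysis, t_rep would hold the R test statistics computed from the posterior predictive samples for a given variable, and the procedure would be repeated for each variable in the model.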

In the current demonstration analysis, T(yobs) was chosen to be the sample mean for each

variable in the SEM analysis and T(yr) (r = 1, …, R) was a corresponding set of predicted means

based on R = 500 simulated samples from the PPD, which were generated by the PROC MCMC

program (see line 89 of the MCMC code presented in Figure S1). The decision to set R = 500

was informed by Gelman et al.’s (2004, p. 164) demonstration of the PPD for Bayesian model

checking in which 200 PPD samples were simulated, with justification that “we use only 200

draws (from the PPD) … to illustrate that a small simulation gives adequate inference for many

practical purposes” (p. 144). In the current demonstration analysis, setting R = 500 was well in

excess of Gelman et al.’s (2004) specification, thereby providing enhanced assurance that the

PPD contained sufficient information to evaluate model fit. Increasing R beyond 500, however,

may lead to marked increases in the computer processing time for the MCMC procedure and

therefore is not recommended.

As noted above, the sampling distribution of T(yobs) (i.e., the sample mean) for each

variable in the SEM was simulated by drawing B = 500 bootstrap samples from the raw data and

computing the mean of each bootstrap sample T(yb). Next, using the 500 bootstrap values of

T(yb), the parameters of the sampling distribution for T(yobs), μ̂B and σ̂ B , were calculated. The

interval of minimal bias I was then computed according to Equation S1 above. Finally, the

coverage probability P̂c was estimated as the proportion of T(yr) values (r = 1, . . ., 500)

contained within I. Table S3 presents T(yobs), μ̂B, σ̂ B, I, and P̂c for each observed variable in the






SEM. To summarize, values of P̂c were 0.95 or greater for 13 of the 14 measured variables in the

SEM. For the remaining variable (Mother’s FES Cohesion Score at Pre-Tx), the value of P̂c was

0.81. Overall, these results indicate that the vast majority of test statistics T(yr) based on the 500

replicated samples of predicted values drawn from the PPD for each measured variable in the

SEM were within one half of a standard error of the mean of the corresponding bootstrap

sampling distribution. In accordance with the aforementioned guideline by Schafer and

colleagues, the values of T(yr) contained within this interval were regarded as exhibiting strong

concordance with the observed data, suggesting a good fit of the SEM from a Bayesian

perspective.






References

Achenbach, T. M. (1991). Manual for the Youth Self-Report and 1991 Profile. Burlington, VT:

University of Vermont, Department of Psychiatry.

Beck, A. T., & Steer, R. A. (1987). Beck Depression Inventory manual. New York, NY:

Harcourt Brace Jovanovich.

Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive

and restrictive missing-data strategies in modern missing-data

procedures. Psychological Methods, 6, 330–351.

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York, NY: Chapman

& Hall.

Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis (2nd ed.).

New York, NY: Chapman & Hall/CRC.

Gelman, A., & Meng, X. (1996). Model checking and model improvement. In W. R. Gilks, S.

Richardson, & D. J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in practice (pp.

189–202). New York, NY: Chapman & Hall.

Leon, S. J. (2009). Linear algebra with applications (8th ed.). Upper Saddle River, NJ: Pearson

Prentice Hall.

Muthén, B., & Asparouhov, T. (2012). Bayesian structural equation modeling: A more flexible

representation of substantive theory. Psychological Methods, 17, 313–335.






Roberts, R. E., Lewinsohn, P. M., & Seeley, J. R. (1991). Screening for adolescent depression: A

comparison of scales. Journal of the American Academy of Child and Adolescent

Psychiatry, 30, 58–66.

Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized

studies: A practical guide and simulated example. Psychological

Methods, 13, 279–313.

Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K. A. Bollen & J. S.

Long (Eds.), Testing structural equation models (pp. 256–293). Newbury Park, CA:

Sage.






Table S1

FFT CARS Intervention Codes and Their Observed Frequencies and Percentages

Intervention code                 Frequency   Percentage
Treatment Focus                          86          2.3
Problem Focus                            61          1.6
Relabel                                 141          3.8
Reframe                                 109          2.9
Divert/Interrupt                         13          0.3
Behavioral Sequencing                    21          0.6
Seek Information                      1,348         36.1
Give Information                        157          4.2
Support                                 177          4.7
Acknowledge/Clarify                   1,026         27.5
Challenge                               108          2.9
Disapprove                                1          0.0
In-Session Focus                         74          2.0
Structure/Direct                        113          3.0
Teach Skills                             82          2.2
Behavioral Rehearsal/Role Play           15          0.4
Assign/Review Homework                    7          0.2
Reinforce Skills                          9          0.2
Relapse Prevention                       14          0.4
Facilitate Community Resources           16          0.4
Pacer/Prompt                             13          0.3
Humor                                    79          2.1
Talk                                     48          1.3
Incomplete                               15          0.4

Note. FFT CARS = Functional Family Therapy Coding and Rating Scale.






Table S2

MCMC Convergence Indices for All SEM Parameters

Parameter     r L50ᵃ   MCSE/PSDᵃ          Wᵇ˒ᵈ         Bᶜ˒ᵈ      R̂
LY(6,5)        0.01      0.02            0.00         0.00    1.00
LY(12,9)       0.01      0.01            0.01         0.00    1.00
LY(13,10)     –0.01      0.01            0.02         0.00    1.00
TE(1,1)       –0.03      0.02       13,483.60         0.00    1.00
TE(2,2)        0.01      0.02       41,133.54         0.00    1.00
TE(3,3)        0.02      0.02           12.01        24.14    1.00
TE(4,4)       –0.01      0.02            2.52         0.00    1.00
TE(5,5)       –0.01      0.02            2.12         0.00    1.00
TE(6,6)       –0.00      0.02            2.70         0.00    1.00
TE(7,7)       –0.01      0.02            1.47         0.00    1.00
TE(8,8)       –0.02      0.02            1.59         0.00    1.00
TE(10,10)     –0.02      0.02            0.00         0.00    1.00
TE(11,11)      0.01      0.02            0.00         0.00    1.00
TE(12,12)     –0.02      0.02            0.00         0.00    1.00
TE(13,13)      0.02      0.02            0.00         0.00    1.00
PS(1,1)       –0.02      0.03       42,700.38    67,522.74    1.00
PS(2,2)       –0.01      0.02      425,458.70         0.00    1.00
PS(3,3)       –0.01      0.03           16.51       109.25    1.00
PS(4,4)        0.01      0.03           82.00     2,765.06    1.00
PS(5,5)        0.01      0.03            3.65         0.00    1.00
PS(7,7)       –0.00      0.03            0.00         0.00    1.00
PS(8,8)        0.01      0.02            0.00         0.00    1.00
PS(9,9)       –0.00      0.02            0.00         0.00    1.00
PS(10,10)      0.02      0.02            0.00         0.00    1.00
PS(11,11)     –0.01      0.02            0.00         0.00    1.00
BE(2,7)       –0.00      0.01           42.02       259.65    1.00
BE(4,7)        0.00      0.02            0.64         0.00    1.00
BE(7,8)        0.01      0.02            5.39         0.00    1.00
BE(7,9)       –0.03      0.03           32.04       246.33    1.00
BE(7,10)       0.03      0.07          226.95     4,902.47    1.00
BE(7,11)       0.01      0.02            7.85         0.00    1.00
AL(1)         –0.05      0.04           45.28        62.51    1.00

Note. MCMC = Markov Chain Monte Carlo; SEM = structural equation modeling; r L50 = Lag-50 autocorrelation; MCSE = Monte Carlo standard error; PSD = posterior standard deviation; W = within-chain variance; B = between-chain variance; R̂ = Gelman-Rubin R-hat index; LY = lambda-y; TE = theta-epsilon; PS = psi; BE = beta; AL = alpha.

ᵃBased on a single MCMC chain with default starting values equal to the prior mode. ᵇWithin-chain sample size = 5,000. ᶜNumber of chains = 7. ᵈValues of W and B displayed as 0.00 are truncated to two decimals because of table formatting restrictions; the actual values are greater than zero.


Table S3

Observed Mean and Standard Deviation, Bootstrap Mean and Standard Error, Interval of Minimal Bias, and Estimated Coverage Probability for Each Variable in the SEM

Variable    Observed M (SD)ᵃ    μ̂B    σ̂B    I    P̂c

Adolescent TLFB % Days MRJ Use at Pre-Tx 57.26 (34.76) 56.89 7.27 53.25 – 60.52 0.99

Adolescent % TLFB Days MRJ Use at Post-Tx 23.95 (29.21) 23.55 6.32 20.39 – 26.71 0.99

Adolescent YSR Delinquency at Pre-Tx 9.83 (3.79) 9.80 0.78 9.41 – 10.19 0.97

Adolescent YSR Delinquency at Post-Tx 8.13 (3.43) 8.10 0.69 7.76 – 8.45 1.00

Mother’s FES Cohesion Score at Pre-Tx 5.52 (2.41) 5.58 0.47 5.35 – 5.82 0.81

Mother’s FES Organization Score at Pre-Tx 4.87 (2.56) 4.89 0.51 4.63 – 5.14 1.00

Mother’s FES Cohesion Score at Post-Tx 6.00 (2.66) 6.05 0.53 5.79 – 6.31 1.00

Mother’s FES Organization Score at Post-Tx 4.74 (2.40) 4.77 0.47 4.53 – 5.00 0.95

Proportion of Relationally Focused Meaning Change Interventions 0.13 (0.10) 0.13 0.02 0.11 – 0.14 0.98

Proportion of Individually Focused Seek Information Interventions 0.14 (0.08) 0.14 0.02 0.13 – 0.15 1.00

Proportion of Relationally Focused Seek Information Interventions 0.10 (0.06) 0.10 0.01 0.10 – 0.11 1.00

Proportion of Individually Focused Acknowledge Interventions 0.13 (0.09) 0.13 0.02 0.12 – 0.14 1.00

Proportion of Relationally Focused Acknowledge Interventions 0.11 (0.07) 0.11 0.01 0.11 – 0.12 0.98

Proportion of Relationally Focused Behavior Change Interventions 0.09 (0.11) 0.09 0.02 0.07 – 0.10 0.97

Note. SEM = structural equation modeling; μ̂B = mean of bootstrap sampling distribution based on 500 bootstrap samples from the observed data; σ̂ B = standard deviation of bootstrap sampling distribution (i.e., bootstrap standard error) based on 500 bootstrap samples from the observed data; I = interval of minimal bias computed as μ̂B ± 0.5∙σ̂ B. P̂c = Estimated coverage probability computed as the proportion of means based on 500 samples from the posterior predictive distribution that are contained within I; values of P̂c close to 1.0 suggest a good-fitting model from a Bayesian perspective; TLFB = Timeline Follow-Back interview; MRJ = marijuana use; Tx = treatment; YSR = Youth Self-Report scale; FES = Family Environment Scale.


Table S4

Maximum Likelihood and Empirical Bayes Parameter Estimates for the Structural Equation Model

                          ML estimates          EB posterior mean and percentiles
Model and parameter       Est.        SE        Mean      Median      P2.5     P97.5

Latent growth model for adolescent MRJ use and DLQ
AL(1)                    57.26      7.41       57.70       57.62     44.68     71.73
AL(2)                   –32.90      7.66      –32.86      –32.95    –47.05    –18.26
AL(3)                     9.82      0.81        9.80        9.79      8.36     11.20
AL(4)                    –1.51      0.75       –1.57       –1.58     –2.97     –0.13
PS(1,1)                 327.82    188.84      280.10      225.20     30.49    838.00
PS(2,2)                  0.00c       –––     1102.80      912.20    108.90   3229.00
PS(3,3)                   6.53      1.88        5.52        4.41      0.59     16.52
PS(4,4)                  0.00a       –––       11.79        9.49      1.21     35.72
TE(1,1)                 711.91    170.50     1143.90     1103.30    684.10    1842.8
TE(2,2)                 638.79    212.62      859.80      827.30    498.20   1380.80
TE(3,3)                   7.81      2.42       13.52       12.94      8.14     21.80
TE(4,4)                   1.91      2.61        7.83        7.67      5.21     11.37

Latent change score model for family function
AL(5)                     5.79      0.51        5.73        5.73      4.97      6.74
AL(7)                    –4.89      2.16       –4.89       –4.91     –7.93     –1.77
PS(5,5)                   3.09      1.11        2.51        2.04      0.25      7.50
PS(7,7)                  0.00a       –––        0.06        0.05      0.01      0.17
PS(5,1)                 –22.57     11.04      –22.77      –22.71    –44.26     –0.71
LY(6,5), LY(8,6)          0.81      0.07        0.81        0.81      0.72      0.91
TE(5,5)                   3.46      1.09        5.54        5.32      3.30      8.98
TE(6,6)                   3.93      1.15        6.19        5.95      3.70     10.09
TE(7,7)                   1.84      0.99        5.37        5.20      3.44      8.21
TE(8,8)                   2.49      0.79        5.08        4.89      3.15      8.03
TE(7,5)                   0.40      0.96        0.39        0.38     –1.51      2.25
TE(8,6)                   2.28      1.06        2.30        2.29      0.22      4.39

Measurement model for therapist behavior
AL(8)                     0.13      0.02        0.13        0.13      0.10      0.16
AL(9)                     0.14      0.02        0.14        0.14      0.11      0.17
AL(10)                    0.10      0.01        0.10        0.10      0.08      0.12
AL(11)                    0.09      0.02        0.09        0.09      0.05      0.12
PS(8,8)                   0.01      0.00        0.01        0.01      0.01      0.02
PS(10,10)                 0.00      0.00        0.00        0.00      0.00      0.00
PS(11,11)                 0.01      0.00        0.01        0.01      0.01      0.02
PS(9,8)                  –0.00      0.00       –0.00       –0.00     –0.01     –0.00
PS(11,9)                 –0.00      0.00       –0.00       –0.00     –0.01     –0.00
LY(12,9)                  0.94      0.10        0.93        0.93      0.77      1.11
LY(13,10)                 1.13      0.19        1.13        1.12      0.87      1.40
TE(10,10)                 0.00      0.00        0.01        0.01      0.00      0.01
TE(11,11)                 0.00      0.00        0.00        0.00      0.00      0.01
TE(12,12)                 0.00      0.00        0.01        0.01      0.00      0.01
TE(13,13)                 0.00      0.00        0.01        0.00      0.00      0.01
BE(2,7)                  –4.58      6.48       –4.53       –4.48    –17.06      8.27
BE(4,7)                  –2.07      0.82       –2.04       –2.05     –3.65     –0.39
BE(7,8)                   5.12      2.33        5.14        5.18      0.64      9.77
BE(7,9)                  17.49      5.86       17.49       17.63      6.41     28.29
BE(7,10)                 17.82     19.45       18.65       18.69    –11.16     49.35
BE(7,11)                  1.40      2.82        1.31        1.30     –4.14      7.02

Note. ML = maximum likelihood; EB = empirical Bayes; MRJ = marijuana use; DLQ = delinquency.

aFixed to 0.00 due to convergence problems.


1: proc mcmc data = <input data set name> outpost = <name for output posterior data set> nbi = 2000 nmc=100000 seed=10000 thin=20 ntu=1000;

/*LINES 2-8 BELOW DECLARE THE 46 ESTIMATED PARAMETERS IN THE SEM. THE STARTING VALUE FOR EACH PARAMETER IS THE MODE OF THE CORRESPONDING PRIOR DISTRIBUTION, WHICH IS THE DEFAULT SETTING.*/

2: parms LY_6_5 LY_12_9 LY_13_10;
3: parms TE_1_1 TE_2_2 TE_3_3 TE_4_4 TE_5_5 TE_6_6 TE_7_7 TE_8_8;
4: parms TE_10_10 TE_11_11 TE_12_12 TE_13_13 TE_7_5 TE_8_6;
5: parms PS_1_1 PS_2_2 PS_3_3 PS_4_4 PS_5_5 PS_7_7;
6: parms PS_8_8 PS_9_9 PS_10_10 PS_11_11 PS_5_1 PS_9_8 PS_11_9;
7: parms BE_2_7 BE_4_7 BE_7_8 BE_7_9 BE_7_10 BE_7_11;
8: parms AL_1 AL_2 AL_3 AL_4 AL_5 AL_7 AL_8 AL_9 AL_10 AL_11;

/*LINE 9 CREATES THE ERROR TERMS FOR THE LATENT VARIABLES WITH MULTIPLE INDICATORS. THE MEANS FOR THESE ERROR TERMS ARE FIXED TO ZERO, AND THE VARIANCES ARE ESTIMATED BY THE HYPER-PARAMETERS ON THE DIAGONAL OF THE PS MATRIX. SEE LINES 25-33.*/

9: parms e_1 e_2 e_3 e_4 e_5 e_7 e_9 e_10;

/*LINES 10-64 SPECIFY THE PRIOR DISTRIBUTION FOR EACH SEM PARAMETER*/

10: prior LY_6_5 ~ normal(mean = .8078, sd = .0651);
11: prior LY_12_9 ~ normal(mean = .9443, sd = .1039);
12: prior LY_13_10 ~ normal(mean = 1.1277, sd = .1892);
13: prior TE_1_1 ~ gamma(2, scale = 711.9051/2);
14: prior TE_2_2 ~ gamma(2, scale = 638.7979/2);
15: prior TE_3_3 ~ gamma(2, scale = 7.8059/2);
16: prior TE_4_4 ~ gamma(2, scale = 1.9146/2);
17: prior TE_5_5 ~ gamma(2, scale = 3.4570/2);
18: prior TE_6_6 ~ gamma(2, scale = 3.9344/2);
19: prior TE_7_7 ~ gamma(2, scale = 1.8432/2);
20: prior TE_8_8 ~ gamma(2, scale = 2.4910/2);
21: prior TE_10_10 ~ gamma(2, scale = 0.0024/2);
22: prior TE_11_11 ~ gamma(2, scale = 0.0030/2);
23: prior TE_12_12 ~ gamma(2, scale = 0.0027/2);
24: prior TE_13_13 ~ gamma(2, scale = 0.0041/2);
25: prior e_1 ~ normal(mean = 0, var = PS_1_1);
27: prior e_2 ~ normal(mean = 0, var = PS_2_2);
28: prior e_3 ~ normal(mean = 0, var = PS_3_3);
29: prior e_4 ~ normal(mean = 0, var = PS_4_4);
30: prior e_5 ~ normal(mean = 0, var = PS_5_5);
31: prior e_7 ~ normal(mean = 0, var = PS_7_7);
32: prior e_9 ~ normal(mean = 0, var = PS_9_9);
33: prior e_10 ~ normal(mean = 0, var = PS_10_10);
34: prior PS_1_1 ~ gamma(2, scale = 327.8205/2);
35: prior PS_2_2 ~ gamma(2, scale = 1405.22/2);
36: prior PS_3_3 ~ gamma(2, scale = 6.5261/2);
37: prior PS_4_4 ~ gamma(2, scale = 14.49/2);
38: prior PS_5_5 ~ gamma(2, scale = 3.0944/2);
39: prior PS_7_7 ~ gamma(2, scale = .0652/2);
40: prior PS_8_8 ~ gamma(2, scale = 0.0110/2);
41: prior PS_9_9 ~ gamma(2, scale = 0.0043/2);
42: prior PS_10_10 ~ gamma(2, scale = 0.0006/2);
43: prior PS_11_11 ~ gamma(2, scale = 0.0126/2);
44: prior BE_2_7 ~ normal(mean = -4.5812, sd = 6.4849);
45: prior BE_4_7 ~ normal(mean = -2.0742, sd = 0.8156);
46: prior BE_7_8 ~ normal(mean = 5.1171, sd = 2.3298);
47: prior BE_7_9 ~ normal(mean = 17.4938, sd = 5.8625);
48: prior BE_7_10 ~ normal(mean = 17.8254, sd = 19.4468);
49: prior BE_7_11 ~ normal(mean = 1.3997, sd = 2.8166);
50: prior AL_1 ~ normal(mean = 57.2609, sd = 7.4104);
51: prior AL_2 ~ normal(mean = -32.9023, sd = 7.6600);
52: prior AL_3 ~ normal(mean = 9.8261, sd = .8071);
53: prior AL_4 ~ normal(mean = -1.5136, sd = 0.7454);
54: prior AL_5 ~ normal(mean = 5.7924, sd = 0.5058);
55: prior AL_7 ~ normal(mean = -4.8899, sd = 2.1582);
56: prior AL_8 ~ normal(mean = .1255, sd = .0224);
57: prior AL_9 ~ normal(mean = .1370, sd = .0166);
58: prior AL_10 ~ normal(mean = .1020, sd = .0128);
59: prior AL_11 ~ normal(mean = .0868, sd = .0240);
60: prior PS_5_1 ~ normal(mean = -22.5749, sd = 11.0448);
61: prior PS_9_8 ~ normal(mean = -0.0038, sd = 0.0016);
62: prior PS_11_9 ~ normal(mean = -0.0039, sd = 0.0014);
63: prior TE_7_5 ~ normal(mean = 0.4004, sd = 0.9580);
64: prior TE_8_6 ~ normal(mean = 2.2845, sd = 1.0565);

/*LINES 65-74 ARE THE LINEAR EQUATIONS FOR THE LATENT VARIABLES IN THE SEM. THERE IS NO EQUATION FOR THE ‘FAMILY FUNCTIONING AT POST-TREATMENT’ LATENT VARIABLE BECAUSE THIS VARIABLE IS SUBSUMED BY THE ‘CHANGE IN FAMILY FUNCTIONING’ LATENT DIFFERENCE SCORE. SEE FIGURE 2.*/

65: MRJ_INT = AL_1 + (PS_5_1 - PS_5_1) + e_1; /*MARIJUANA USE INTERCEPT*/
66: MRJ_SLP = AL_2 + e_2; /*MARIJUANA USE SLOPE*/
67: DLQ_INT = AL_3 + e_3; /*DELINQUENCY INTERCEPT*/
68: DLQ_SLP = AL_4 + e_4; /*DELINQUENCY SLOPE*/
69: FF_1 = AL_5 + (PS_5_1 - PS_5_1) + e_5; /*FAMILY FUNCTIONING AT PRE-TREATMENT*/
70: FFCHG = AL_7 + (BE_7_8*AL_8) + (BE_7_9*AL_9) + (BE_7_10*AL_10) + (BE_7_11*AL_11) + e_7; /*CHANGE IN FAMILY FUNCTIONING FROM PRE- TO POST-TREATMENT*/
71: REL_MC = AL_8 + (PS_9_8 - PS_9_8); /*RELATIONALLY FOCUSED MEANING CHANGE INTERVENTIONS*/
72: IND_GEN = AL_9 + (PS_9_8 - PS_9_8) + (PS_11_9 - PS_11_9) + e_9; /*INDIVIDUALLY FOCUSED GENERAL INTERVENTIONS*/
73: REL_GEN = AL_10 + e_10; /*RELATIONALLY FOCUSED GENERAL INTERVENTIONS*/
74: REL_BC = AL_11 + (PS_11_9 - PS_11_9); /*RELATIONALLY FOCUSED BEHAVIOR CHANGE INTERVENTIONS*/

/*THE MODEL STATEMENTS IN LINES 75-88 SPECIFY THE DISTRIBUTIONS OF THE MEASURED VARIABLES IN THE SEM. THE MEAN OF EACH VARIABLE IS A FUNCTION OF ONE OR MORE OF THE LATENT VARIABLES DEFINED ABOVE.*/

75: model PDUSMAAA ~ normal(mean = MRJ_INT, var = TE_1_1); /*MARIJUANA USE AT PRE-TX*/
76: model PDUSMAAB ~ normal(mean = MRJ_INT + MRJ_SLP + (BE_2_7*FFCHG), var = TE_2_2); /*MARIJUANA USE AT 4-MONTHS*/
77: model YSRDLQAA ~ normal(mean = DLQ_INT, var = TE_3_3); /*DELINQUENCY AT PRE-TX*/
78: model YSRDLQAB ~ normal(mean = DLQ_INT + DLQ_SLP + (BE_4_7*FFCHG), var = TE_4_4); /*DELINQUENCY AT 4-MONTHS*/
79: model FESCPA ~ normal(mean = FF_1 + (TE_7_5 - TE_7_5), var = TE_5_5); /*FAMILY COHESION AT PRE-TX*/
80: model FESORGPA ~ normal(mean = LY_6_5*FF_1 + (TE_8_6 - TE_8_6), var = TE_6_6); /*FAMILY ORGANIZATION AT PRE-TX*/
81: model FESCPB ~ normal(mean = FF_1 + FFCHG + (TE_7_5 - TE_7_5), var = TE_7_7); /*FAMILY COHESION AT 4-MONTHS*/
82: model FESORGPB ~ normal(mean = LY_6_5*FF_1 + FFCHG + (TE_8_6 - TE_8_6), var = TE_8_8); /*FAMILY ORGANIZATION AT 4-MONTHS. NOTE THAT THE FACTOR LOADING FOR THIS VARIABLE IS CONSTRAINED TO BE EQUAL TO LY_6_5*/

83: model PROP_REL_MNG_CHG ~ normal(mean = REL_MC, var = PS_8_8); /*PROPORTION RELATIONALLY FOCUSED MEANING CHANGE INTERVENTIONS*/84: model PROP_IND_SK_INFO ~ normal(mean = IND_GEN, var = TE_10_10); /*PROPORTION INDIVIDUALLY FOCUSED SEEK-INFORMATION INTERVENTIONS*/85: model PROP_REL_SK_INFO ~ normal(mean = REL_GEN, var = TE_11_11); /*PROPORTION RELATIONALLY FOCUSED SEEK-INFORMATION INTERVENTIONS*/86: model PROP_IND_ACK ~ normal(mean = LY_12_9*IND_GEN, var = TE_12_12); /*PROPORTION OF INDIVIDUALLY FOCUSED ACKNOWLEDGE INTERVENTIONS*/87: model PROP_REL_ACK ~ normal(mean = LY_13_10*REL_GEN, var = TE_13_13); /*PROPORTION OF RELATIONALLY FOCUSED ACKNOWLEDGE INTERVENTIONS*/88: model PROP_REL_BEH_CH ~ normal(mean = REL_BC, var = PS_11_11); /*PROPORTION OF RELATIONALLY FOCUSED BEHAVIOR CHANGE INTERVENTIONS*/

/*LINE 89 REQUESTS THE POSTERIOR PREDICTIVE DISTRIBUTION (PPD). THE ‘outpred = <data set>’ OPTION NAMES A DATA SET IN WHICH THE POSTERIOR PREDICTIVE SAMPLES WILL BE STORED. THE ‘nsim = 500’ OPTION REQUESTS 500 DRAWS FROM THE PPD FOR EACH SAMPLE. THE ‘covariates = <data set>’ OPTION NAMES THE BOOTSTRAP DATA SET WHICH WAS TRIMMED TO A SINGLE SAMPLE OF 500 OBSERVATIONS. THE NAMING OF THIS DATA SET PROMPTS PROC MCMC TO GENERATE 500 SAMPLES FROM THE PPD (ONE SAMPLE PER OBSERVATION IN THE COVARIATES = DATA SET).*/

89: preddist outpred= <data set name for posterior predictive distribution> nsim=500 covariates = <name of bootstrap data set with 500 observations>;

90: run;

Figure S1. SAS PROC Markov Chain Monte Carlo code.
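
The posterior predictive step requested in line 89 of Figure S1 can be illustrated outside SAS. The following Python sketch is a generic illustration for a single normally distributed measured variable, not a reproduction of PROC MCMC's internal algorithm. The function name posterior_predictive and the example posterior draws are hypothetical. Note that the SAS normal(mean = , var = ) distribution is parameterized by a variance, whereas Python's random.gauss expects a standard deviation.

```python
import math
import random


def posterior_predictive(posterior_draws, nsim, seed=0):
    """Draw nsim values from the posterior predictive distribution of one
    normally distributed measured variable.

    posterior_draws: list of (mean, variance) pairs, one per retained MCMC draw.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(nsim):
        # Pick one posterior draw at random, then simulate a new observation
        # from the normal likelihood evaluated at that draw.
        mean, var = rng.choice(posterior_draws)
        samples.append(rng.gauss(mean, math.sqrt(var)))  # gauss takes an SD
    return samples


# Hypothetical posterior draws (mean, variance), for illustration only.
draws = [(10.0, 4.0), (10.5, 3.6), (9.8, 4.4), (10.2, 4.1)]
ppd = posterior_predictive(draws, nsim=500)
```

Because each simulated value is generated from a different posterior draw, the spread of the resulting sample reflects both residual variability and uncertainty in the parameters.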


proc fcmp;
   array psi [10,10]
      280.1    0      0     0       -22.7735 0      0        0        0        0
      0        1102.8 0     0       0        0      0        0        0        0
      0        0      5.523 0       0        0      0        0        0        0
      0        0      0     11.7874 0        0      0        0        0        0
      -22.7735 0      0     0       2.5051   0      0        0        0        0
      0        0      0     0       0        0.0635 0        0        0        0
      0        0      0     0       0        0      0.0115   -0.00382 0        0
      0        0      0     0       0        0      -0.00382 0.00342  0        -0.00388
      0        0      0     0       0        0      0        0        0.000532 0
      0        0      0     0       0        0      0        -0.00388 0        0.0132;
   /* The array named psi in the preceding statement is the posterior PS matrix. It is a 10 x 10 matrix with non-zero elements equal to the corresponding element-specific posterior means. Zero elements were fixed and not estimated in the SEM. This matrix must be entered manually by the user.*/
   array PSI_CHOL[10,10];
   /*The array named PSI_CHOL is an empty 10 x 10 matrix in which the Cholesky factorization will be stored.*/
   call chol(psi, PSI_CHOL, 0);
   /*The call chol routine computes the Cholesky factorization of psi and stores the values in PSI_CHOL. If psi is not positive definite, then PSI_CHOL will be a matrix of missing values.*/
   rc = write_array('work.CHOL_PSI', PSI_CHOL);
   /* This statement stores the Cholesky factorization in a data set named 'work.CHOL_PSI'.*/
run;

proc print data = CHOL_PSI noobs;
   title1 "Cholesky Factorization of the PSI Matrix";
run;

/*Below is the SAS code for obtaining the Cholesky factorization of the TE matrix.*/

proc fcmp;
   array TE [12,12]
      1143.9 0     0       0      0     0      0      0      0      0       0       0
      0      859.8 0       0      0     0      0      0      0      0       0       0
      0      0     13.5192 0      0     0      0      0      0      0       0       0
      0      0     0       7.8316 0     0      0      0      0      0       0       0
      0      0     0       0      5.538 0      0.391  0      0      0       0       0
      0      0     0       0      0     6.1876 0      2.2991 0      0       0       0
      0      0     0       0      0.391 0      5.3655 0      0      0       0       0
      0      0     0       0      0     2.2991 0      5.0858 0      0       0       0
      0      0     0       0      0     0      0      0      0.0053 0       0       0
      0      0     0       0      0     0      0      0      0      0.00366 0       0
      0      0     0       0      0     0      0      0      0      0       0.00612 0
      0      0     0       0      0     0      0      0      0      0       0       0.00508;
   array TE_CHOL[12,12];
   call chol(TE, TE_CHOL, 0);
   rc = write_array('work.CHOL_TE', TE_CHOL);
run;

proc print data = CHOL_TE noobs;
   title1 "Cholesky Factorization of the TE Matrix";
run;

Figure S2. SAS code for obtaining the Cholesky factorization of the posterior PS and TE matrices. PS = psi; TE = theta-epsilon.
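
The factorization step in Figure S2 can be checked independently of SAS. The Python sketch below is a minimal illustration under stated assumptions: it assumes the factor of interest is the lower-triangular matrix L satisfying L L' = psi, and it returns None for a matrix that is not positive definite, which is analogous to (but not the same as) the matrix of missing values described in the comments of Figure S2. The function name cholesky_lower is hypothetical. For brevity, the example uses only the 2 x 2 block of psi formed by rows and columns 1 and 5.

```python
import math


def cholesky_lower(a):
    """Return the lower-triangular L with L L^T = a, or None if a is not
    positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    return None  # not positive definite
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


# 2 x 2 block of psi (rows/columns 1 and 5 of the matrix in Figure S2).
psi_block = [[280.1, -22.7735],
             [-22.7735, 2.5051]]
chol = cholesky_lower(psi_block)

# Multiply the factor by its transpose to confirm it reproduces psi_block.
rebuilt = [[sum(chol[i][k] * chol[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

Multiplying the factor by its transpose and comparing the result with the original matrix is a quick check that the manually entered values were typed correctly and that the matrix is positive definite.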


%let n = 5000; /*This statement creates a macro variable “n” which is the number of posterior draws in each MCMC chain, which is 5000 in the current analysis.*/

%macro rhat (param);
   proc nlmixed data = CHAINS method = gauss noad technique = newrap qpoints = 10;
      /*The data set CHAINS contains the stacked MCMC draws for each parameter in the SEM. This data set is obtained from the PROC MCMC output.*/
      parms mu = 0 B W = 1; /* This statement creates the fixed effect parameter mu and the random between- and within-chain variance parameters B and W*/
      int = mu + e;
      model &param ~ normal(int, W); /*&param is a macro variable that is used as a placeholder for each of the 46 parameters in the SEM*/
      random e ~ normal(0, B) SUBJECT = Chain;
      ESTIMATE "W_&param" W;
      ESTIMATE "B_&param" B;
      ESTIMATE "R-hat_&param" sqrt(((((&n - 1)/&n)*W) + ((1/&n)*B))/W);
      /*The preceding ESTIMATE statements produce estimates of W, B, and R-hat, respectively, for the given SEM parameter represented by &param.*/
      ods output AdditionalEstimates = RHAT_&param;
      title1 "R-hat for &param";
      title2;
   run;
%mend rhat;

The macro rhat was executed separately for each of the 46 parameters in the SEM.

Figure S3. PROC NLMIXED macro code for obtaining Gelman and Rubin’s R-hat diagnostic.
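
The R-hat expression in the final ESTIMATE statement of Figure S3 can also be evaluated directly from the stacked chains. The Python sketch below is a moment-based illustration, not a translation of the macro: the macro estimates W and B by maximum likelihood in PROC NLMIXED, whereas this sketch assumes the conventional Gelman and Rubin definitions, with W the average within-chain sample variance and B equal to n times the variance of the chain means. The function name r_hat and the toy chains are hypothetical, and results need not match the macro output exactly.

```python
import math


def r_hat(chains):
    """Gelman-Rubin potential scale reduction factor for one parameter.

    chains: list of equal-length lists of posterior draws, one list per chain.
    """
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # B: between-chain variance, n times the variance of the chain means.
    b = n * sum((cm - grand_mean) ** 2 for cm in chain_means) / (m - 1)
    # W: within-chain variance, the average of the per-chain sample variances.
    w = sum(
        sum((x - cm) ** 2 for x in c) / (n - 1)
        for c, cm in zip(chains, chain_means)
    ) / m
    # Same algebraic form as the ESTIMATE statement in Figure S3.
    return math.sqrt((((n - 1) / n) * w + (1 / n) * b) / w)


# Toy chains for illustration only: two chains that agree, two that do not.
agreeing = r_hat([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
separated = r_hat([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]])
```

Values close to 1 indicate that the between-chain variability is small relative to the within-chain variability, whereas chains centered on different values yield an R-hat well above 1.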
