A simulation study of the effect of sample size and level of interpenetration on inference from cross-classified multilevel logistic regression models

Rebecca Vassallo

ESRC Research Methods Festival, July 2012

Introduction

• Influence of the interviewer and area on survey response behaviour

• Reflects unmeasured factors including the interviewer’s and area’s characteristics

• Violation of the assumption of independence of observations

• Standard analytical techniques will underestimate standard errors and can result in incorrect inference (Snijders & Bosker, 1999)

• Multilevel modelling has become a popular method for analysing area and interviewer effects on nonresponse

Introduction

• Estimation problem relating to the identifiability of area and interviewer variation

• Interpenetrated sample design is considered the gold standard for separating interviewer effects from area effects

• Restrictions in field administration capabilities and survey costs only allow for partial interpenetration

• Multilevel cross-classified specification used in such cases (Von Sanden, 2004)

• No studies available examining the properties of parameter estimates from such models under different conditions

Study Aims

• Examine the implications of interviewer dispersal patterns within different scenarios on the quality of parameter estimates

• Quality measures: percentage relative bias, confidence interval coverage, power of significance tests and correlation of random parameter estimates

• Different scenarios vary in sample sizes, overall rates of response, and the area and interviewer variance

• Identify the smallest interviewer pool and the most geographically restrictive interviewer case allocation required for acceptable levels of bias and power

Methodology: Simulation Model

• Model: $\operatorname{logit}(\pi_{i(jk)}) = \beta_0 + u_j + v_k$; $u_j \sim N(0, \sigma_u^2)$; $v_k \sim N(0, \sigma_v^2)$

• Stata Version 12 calling MLwiN Version 2.25 through the ‘runmlwin’ command (Leckie & Charlton, 2011)

• Markov Chain Monte Carlo (MCMC) estimation method

• MCMC method produces less biased estimates compared to first-order marginal quasi-likelihood (MQL) and second-order penalised quasi-likelihood (PQL) (Browne, 1998; Browne & Draper, 2006)

• IRIDIS High Performance Computing Facility cluster at the University of Southampton

Methodology: Data Generating Procedure

• The overall probability of the outcome for an area and interviewer with zero random effects determines the overall intercept (fixed for all cases)

• Cluster-specific random effects for each interviewer and area are generated separately from $N(0, \sigma_u^2)$ and $N(0, \sigma_v^2)$

• The random effects are generated for every simulation, but held constant across scenarios in which the only factor that changes is the interviewer case allocation

• The allocation of workload from different areas to specific interviewers is limited to a finite number of possibilities

Methodology: Data Generating Procedure

• The log-odds (logit) of each case is computed and converted to a probability

• Values of the dependent variable - a dichotomous outcome for each case - are generated from a Bernoulli distribution with that probability

• For each scenario of the experimental design, 1000 simulated datasets are generated using R Version 2.11.1
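The data generating steps above can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study; the function names, the toy case-to-cluster assignment and the seed are all hypothetical. It assumes an intercept-only model, so the intercept is simply the logit of the overall probability. Drawing the random effects separately from the outcomes allows the same effects to be reused across different interviewer case allocations.

```python
import math
import random


def intercept_from_probability(overall_prob):
    """Log-odds of the outcome when both random effects are zero."""
    return math.log(overall_prob / (1.0 - overall_prob))


def inverse_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def draw_random_effects(n_clusters, variance, rng):
    """Draw one N(0, variance) effect per cluster (area or interviewer)."""
    sd = math.sqrt(variance)
    return [rng.gauss(0.0, sd) for _ in range(n_clusters)]


def generate_outcomes(case_area, case_interviewer, area_effect,
                      interviewer_effect, beta0, rng):
    """Generate one Bernoulli outcome per case from its cross-classified logit."""
    outcomes = []
    for area, interviewer in zip(case_area, case_interviewer):
        log_odds = beta0 + area_effect[area] + interviewer_effect[interviewer]
        prob = inverse_logit(log_odds)
        outcomes.append(1 if rng.random() < prob else 0)
    return outcomes


# Toy example (hypothetical sizes): 24 cases, 4 areas, 8 interviewers.
rng = random.Random(2012)
beta0 = intercept_from_probability(0.8)
area_effect = draw_random_effects(4, 0.3, rng)
interviewer_effect = draw_random_effects(8, 0.3, rng)
case_area = [case // 6 for case in range(24)]         # 6 cases per area
case_interviewer = [case // 3 for case in range(24)]  # 3 cases per interviewer
outcomes = generate_outcomes(case_area, case_interviewer, area_effect,
                             interviewer_effect, beta0, rng)
```

To mimic the design described above, the two effect lists would be drawn once per simulation and then passed to `generate_outcomes` with each alternative allocation in turn.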

Methodology: Simulation Factors

• Simulated scenarios vary in the following factors:
- the overall sample size (N)
- the number of interviewers and areas
- the interviewer-area classifications [which vary in terms of the number of areas each interviewer works in (maximum 6 areas) and the overlap in the interviewers working in neighbouring areas]
- the ICC (area and interviewer variances)
- the overall probability of the outcome variable (π)

• Medium scenario design (values similar to those observed in a real dataset, giving a realistic starting point): 120 areas (48 cases/area) allocated to 240 interviewers (24 cases/int), totalling 5760 cases, with area and interviewer variances both 0.3 and π=0.8
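One way to picture an interviewer-area classification for the medium scenario is the sketch below. The allocation rule is an assumption made purely for illustration and is not taken from the study: each interviewer is given a hypothetical "base" area and works in that area plus the next few neighbouring areas, splitting the 24 cases equally between them. The function and parameter names are likewise hypothetical.

```python
from collections import Counter


def allocate_cases(n_areas=120, interviewers_per_base_area=2,
                   cases_per_interviewer=24, areas_per_interviewer=2):
    """Return one (interviewer, area) pair per case.

    Assumed rule: interviewer j has base area j // interviewers_per_base_area
    and works in that area and the following areas_per_interviewer - 1 areas
    (wrapping around), with an equal share of cases in each.
    """
    if cases_per_interviewer % areas_per_interviewer != 0:
        raise ValueError("workload must split equally across areas")
    if areas_per_interviewer > n_areas:
        raise ValueError("an interviewer cannot work in more areas than exist")
    share = cases_per_interviewer // areas_per_interviewer
    n_interviewers = n_areas * interviewers_per_base_area
    allocation = []
    for interviewer in range(n_interviewers):
        base_area = interviewer // interviewers_per_base_area
        for offset in range(areas_per_interviewer):
            area = (base_area + offset) % n_areas
            allocation.extend([(interviewer, area)] * share)
    return allocation


# Medium scenario sizes with each interviewer working in 2 areas.
medium = allocate_cases(areas_per_interviewer=2)
cases_per_area = Counter(area for _, area in medium)
cases_per_int = Counter(interviewer for interviewer, _ in medium)
```

Under this rule the workload of 24 cases divides evenly for 1, 2, 3, 4 or 6 areas per interviewer, and every area still receives 48 cases, so the totals match the medium scenario whichever dispersal level is chosen.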

Methodology - Quality Assessment Measures

• Correlation between area and interviewer parameter estimates. High negative values indicate identifiability problems

$$\rho = \frac{1}{1000}\sum_{i=1}^{1000}\frac{\operatorname{cov}(\sigma_u^2, \sigma_v^2)}{\hat{\sigma}_u\,\hat{\sigma}_v}$$

• Percentage relative bias

$$\frac{1}{1000}\sum_{i=1}^{1000}\frac{\hat{\theta}_i - \theta}{\theta}\times 100$$

• Confidence interval coverage: coverage of the 95% Wald confidence interval and of the 95% MCMC quantiles is compared to the nominal 95%

• Power of Wald test - proportion of simulations in which the null hypothesis is correctly rejected
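The bias, coverage and power measures listed above could be computed from the per-simulation estimates roughly as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the Wald interval and test use the normal critical value 1.96, and the test is taken to be two-sided against a null value of zero, which the slides do not specify. Coverage based on MCMC quantiles would need the chain quantiles from each fit and is not shown.

```python
def percentage_relative_bias(estimates, true_value):
    """Mean of (estimate - truth) / truth across simulations, times 100."""
    total = sum((est - true_value) / true_value for est in estimates)
    return 100.0 * total / len(estimates)


def wald_coverage(estimates, standard_errors, true_value, z=1.96):
    """Share of simulations whose Wald interval contains the true value."""
    covered = sum(
        1 for est, se in zip(estimates, standard_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return covered / len(estimates)


def wald_power(estimates, standard_errors, z=1.96):
    """Share of simulations in which H0: parameter = 0 is rejected.

    Assumes a two-sided test at the 5% level.
    """
    rejected = sum(
        1 for est, se in zip(estimates, standard_errors)
        if abs(est / se) > z
    )
    return rejected / len(estimates)


# Toy input (made-up numbers, four "simulations" of a variance whose
# true value is 0.3), only to show how the functions are called.
toy_estimates = [0.27, 0.33, 0.30, 0.36]
toy_standard_errors = [0.05, 0.05, 0.05, 0.02]
bias = percentage_relative_bias(toy_estimates, 0.3)
coverage = wald_coverage(toy_estimates, toy_standard_errors, 0.3)
power = wald_power(toy_estimates, toy_standard_errors)
```

In the study each list would hold 1000 entries, one per simulated dataset within a scenario.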

Results - Power of Tests

• For the medium scenario, power ≈ 1 for all interviewer case allocations

• For smaller N, more sparse allocations are required to get power >0.85

• A lower variance (0.2) results in lower power

• When the number of interviewers equals the number of areas, more interviewer dispersion is required for acceptable levels of power

• Higher π (0.9) requires 2 areas/int for power>0.9

• Reduced interviewer overlapping for a constant number of areas/int does not improve power

Results - Correlation between Area and Interviewer Variance Estimates

• For all scenarios, high negative ρ (stronger than -0.4) is observed when interviewers work in 1 area only

• No substantial change in ρ with varying total sample sizes

• Very high negative ρ (up to -0.9) for scenarios with equal numbers of interviewers and areas; ρ only falls below 0.1 in magnitude when interviewers work in 4+ areas (compared to 2+ areas/int for scenarios with twice as many interviewers as areas)

• Higher ρ with increasing π up to 2 areas/int allocations; thereafter no change in ρ by π

• Lower ρ with increasing up till 3 areas/int allocations

• Lower ρ with less interviewer overlapping for the 2 areas/int cases

Results - Percentage Relative Bias

• In most scenarios with N=5760, the percentage relative bias is around 1-2% once interviewers are allocated to 2+ areas

• Further interviewer dispersion (3+ areas) & less interviewer overlapping do not yield systematic drops in bias

• When interviewers are working in 2+ areas, the bias in the estimate is generally greater than the bias in estimate [when =2*]

• Greater bias observed for smaller sample sizes: the scenario with 1440 cases and equal numbers of interviewers and areas shows bias values of 5-13% for all allocations

Results - Confidence Interval Coverage

• Close to 95% nominal rate in all scenarios

• Some cases of under-coverage or over-coverage in scenarios where each interviewer works in just one area:
- 87% coverage (N=5760, interviewers = 2 × areas, variances 0.2 and 0.3, π=0.8, one area/int) for one CI
- 88% coverage (N=2880 or N=1440, interviewers = 2 × areas, variances both 0.3, π=0.8, one area/int) for one CI
- 100% coverage (N=5760 or 2880 or 1440, interviewers = areas, variances both 0.3, π=0.8, one area/int) for more than one CI

• No clear evidence that the MCMC quantiles perform better than the Wald asymptotic normal CIs

Conclusion

• Interpenetration not required to distinguish between area and interviewer variation

• Good quality estimates obtained for large sample sizes (≈6000 cases) if interviewers work in at least two areas

• Better estimates obtained when the number of interviewers is greater than the number of areas

• Higher overall probabilities & smaller variances (smaller ICC) require more interviewer dispersion for some survey conditions

• The extent of interviewer overlapping shown not to be important

• Results and their implications can be extended to other applications

Acknowledgements

• University of Southampton, School of Social Sciences Teaching Studentship

• UK Economic and Social Research Council (ESRC), PhD Studentship (ES/1026258/1)

• Gabriele B. Durrant & Peter W. F. Smith, PhD Supervisors

References

• Browne, W. J. (1998). Applying MCMC Methods to Multi-level Models. PhD thesis, University of Bath.

• Browne, W. & Draper, D. (2006). A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Bayesian Analysis, 1, 473-514.

• Leckie, G. & Charlton, C. (2011). runmlwin: Stata module for fitting multilevel models in the MLwiN software package. Centre for Multilevel Modelling, University of Bristol.

• Snijders, T.A.B. & Bosker, R.J. (1999). Multilevel Analysis: an introduction to basic and advanced multilevel modelling. London: Sage.

• Von Sanden, N. D. (2004). Interviewer effects in household surveys: estimation and design. Unpublished PhD thesis, School of Mathematics and Applied Statistics, University of Wollongong.
