Developing an evaluation of professional development Webinar #4: Going Deeper into Analyzing Results


Information and materials mentioned or shown during this presentation are provided as resources and examples for the viewer's convenience. Their inclusion is not intended as an endorsement by the Regional Educational Laboratory Southeast or its funding source, the Institute of Education Sciences (Contract ED-IES-12-C-0011). In addition, the instructional practices and assessments discussed or shown in these presentations are not intended to mandate, direct, or control a State's, local educational agency's, or school's specific instructional content, academic achievement system and assessments, curriculum, or program of instruction. State and local programs may use any instructional content, achievement system and assessments, curriculum, or program of instruction they wish.

Webinar 4: Outline
- Considerations for quantitative analyses – Dr. Sharon Koon
  – WWC evidence standards & strong studies
  – Calculating attrition
  – Calculating baseline equivalence
  – Statistical adjustments
- Considerations for qualitative analyses – Dr. La'Tara Osborne-Lampkin
- Question & answer session

CONSIDERATIONS FOR QUANTITATIVE ANALYSES
Dr. Sharon Koon

Distinction between WWC evidence standards and additional qualities of strong studies
WWC design considerations:
- Two groups: treatment (T) and comparison (C).
- For randomized controlled trials (RCTs), low attrition.
- For quasi-experimental designs (QEDs), baseline equivalence between the T and C groups.
- The contrast between the T and C groups measures the impact of the treatment.
- Valid and reliable outcome data are used to measure the impact of the treatment.
- No known confounding factors.
- Outcome(s) not overaligned with the treatment.
- The same data collection process (same instruments, same time/year) for the T and C groups.
Source: Experiments-in-Education-Version-2.pdf

Distinction between WWC evidence standards and additional qualities of strong studies (cont.)
Additional qualities of strong studies:
- Pre-specified and clear primary and secondary research questions.
- Generalizability of the study results.
- Clear criteria for research sample eligibility and matching methods.
- A sample size large enough to detect meaningful and statistically significant differences between the T and C groups overall and for specific subgroups of interest.
- Analysis methods that reflect the research questions, design, and sample selection procedures.
- A clear plan to document the implementation experiences of the T and C conditions.
Source: Experiments-in-Education-Version-2.pdf

Determinants of a What Works Clearinghouse (WWC) study rating (figure)

Topics for discussion
- Attrition
- Baseline equivalence
  – Calculation of baseline equivalence
  – Adjustments for nonequivalence
- Effect-size corrections
  – Cluster correction
  – Multiple comparison correction
- Handling missing data

Attrition
For RCTs, the WWC is concerned about both overall attrition (i.e., the rate of attrition for the entire sample) and differential attrition (i.e., the difference in the rates of attrition for the intervention and comparison groups), because both types of attrition contribute to the potential bias of the estimated effect.

Attrition (cont.)
- Overall attrition = number without observed data / number randomized
- Differential attrition = [T without observed data / number T randomized] - [C without observed data / number C randomized]
- Attrition boundaries: liberal or conservative.
- To be deemed an RCT with low attrition, a cluster RCT that reports an individual-level analysis must have low attrition at two levels: first, at the cluster level, and second, at the subcluster level, with attrition based only on the clusters remaining in the sample.
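The two attrition formulas above are simple arithmetic on four counts: the number randomized to each condition and the number in each condition without observed outcome data. The Python sketch below is an illustration only, not part of the webinar materials; the function name and the example counts are hypothetical, and the resulting rates still have to be compared against the applicable WWC attrition boundary (liberal or conservative) to judge whether attrition is low.

```python
def attrition_rates(n_t_randomized, n_t_missing, n_c_randomized, n_c_missing):
    """Overall and differential attrition, as defined on the slide above.

    n_*_randomized: units randomized to the treatment (T) or comparison (C) group
    n_*_missing:    units in that group without observed outcome data
    """
    overall = (n_t_missing + n_c_missing) / (n_t_randomized + n_c_randomized)
    # Signed difference in group attrition rates; its magnitude is what gets
    # compared against the WWC attrition boundaries.
    differential = (n_t_missing / n_t_randomized) - (n_c_missing / n_c_randomized)
    return overall, differential


# Hypothetical example: 100 teachers randomized per group, with 12 treatment
# and 20 comparison teachers missing outcome data.
overall, differential = attrition_rates(100, 12, 100, 20)
print(f"Overall attrition: {overall:.1%}")                  # 16.0%
print(f"Differential attrition: {abs(differential):.1%}")   # 8.0%
```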
Baseline equivalence
- For continuous outcomes, baseline equivalence is determined by the difference between the mean outcome for the T group and the mean outcome for the C group, divided by the pooled within-group standard deviation of the outcome measure (i.e., the standardized mean difference).
- For dichotomous outcomes, it is determined by the difference in the probability of the occurrence of an event.
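To make the continuous-outcome case concrete, the sketch below computes the standardized baseline difference from group means, standard deviations, and sample sizes, using a pooled within-group standard deviation. This is a minimal illustration rather than the WWC's own tooling; the function name and the example numbers are hypothetical, and the cutoffs in the closing comments restate the 0.05 and 0.25 standard-deviation thresholds discussed in the next slide.

```python
import math


def baseline_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized baseline difference between treatment (T) and comparison (C).

    Computed as (mean_T - mean_C) divided by the pooled within-group
    standard deviation of the baseline measure.
    """
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    return (mean_t - mean_c) / pooled_sd


# Hypothetical pretest data: T group mean 52.0 (SD 10.0, n = 80),
# C group mean 50.5 (SD 11.0, n = 85).
g = baseline_difference(52.0, 10.0, 80, 50.5, 11.0, 85)
print(f"Standardized baseline difference: {g:.3f}")

# Interpretation against the thresholds discussed in this webinar:
#   |g| <= 0.05         -> baseline equivalence satisfied without adjustment
#   0.05 < |g| <= 0.25  -> satisfied only with a statistical adjustment
#   |g| > 0.25          -> baseline equivalence is not satisfied
```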
Statistical adjustment for nonequivalence at baseline
For differences in baseline characteristics that are between 0.05 and 0.25 standard deviations, the analysis must include a statistical adjustment for the baseline characteristics to meet the baseline equivalence requirement. A number of different techniques can be used, including regression adjustment and analysis of covariance (ANCOVA). The critical factor is that the appropriate baseline characteristics must be included in the analysis at the individual level (i.e., the unit of analysis).

Difference-in-differences adjustment
The WWC applies this adjustment to effect size calculations based on unadjusted group means when the study is:
- a QED with differences in baseline characteristics of less than 0.05 standard deviations,
- an RCT with low attrition and differences in baseline characteristics, or
- an RCT with high attrition and differences in baseline characteristics of less than 0.05 standard deviations.

Cluster correction
A mismatch problem occurs when random assignment is carried out at the cluster level (e.g., the school level) and the analysis is conducted at the individual level (e.g., the teacher level), but the correlation among individuals within the same clusters is ignored in computing the standard errors of the impact estimates. The standard errors of the impact estimates generally will be underestimated, leading to overestimates of statistical significance.

Cluster correction (cont.)
The WWC computes clustering-corrected statistical significance estimates. The basic approach is first to compute the t-statistic corresponding to the effect size that ignores clustering, and then to correct both the t-statistic and the associated degrees of freedom for clustering, based on the sample sizes, the number of clusters, and an estimate of the intraclass correlation (ICC). The default ICC value is 0.20 for achievement outcomes and 0.10 for behavioral and attitudinal outcomes. The statistical significance estimate corrected for clustering is then obtained from the t-distribution using the corrected t-statistic and degrees of freedom.

Multiple comparisons correction
Repeated tests of highly correlated outcomes will lead to a greater likelihood of mistakenly concluding that the differences in means for outcomes of interest between the T and C groups are significantly different from zero (called Type I error in hypothesis testing). The WWC uses the Benjamini-Hochberg (BH) correction to reduce the possibility of making this type of error.

Multiple comparisons correction (cont.)
The BH correction is used in three types of situations:
- studies that estimated effects of the intervention for multiple outcome measures in the same outcome domain using a single comparison group,
- studies that estimated effects of the intervention for a given outcome measure using multiple comparison groups, and
- studies that estimated effects of the intervention for multiple outcome measures in the same outcome domain using multiple comparison groups.

Handling missing data
- RCTs & QEDs: complete case analysis with no adjustment for missing data (for example, unadjusted means) or with covariate adjustment.
- Low-attrition RCTs only:
  – complete case analysis with nonresponse weights,
  – multiple imputation, separately by condition and using an established method (however, it cannot be used to meet the attrition standard), and
  – maximum likelihood, separately by condition and using an established method.

CONSIDERATIONS FOR QUALITATIVE ANALYSES
Dr. La'Tara Osborne-Lampkin

Qualitative vs. quantitative research (Minichiello et al., 1990, p. 5)
- Conceptual
  – Qualitative: concerned with understanding human behavior from the informant's perspective; assumes a dynamic and negotiated reality.
  – Quantitative: concerned with discovering facts about human behavior; assumes a fixed and measurable reality.
- Methodological
  – Qualitative: data are collected through participant observation and interviews, analyzed by themes from descriptions by informants, and reported through narratives and the language of the informant.
  – Quantitative: data are collected through measuring things, analyzed through numerical comparisons and statistical inferences, and reported through statistical analyses.

When analyzing qualitative data
- Use an iterative approach guided by prior data collection and analysis.
- Use multiple, reliable researchers for analysis and interpretation.
- Document and outline the steps and decisions made for data analysis (i.e., develop an audit trail).
- Document the basis for inferences.
- Establish structural corroboration or coherence.
(Miles & Huberman, 1994; Miles, Huberman, & Saldaña, 2014; Patton, 1987)

Five key considerations
- Triangulate data
- Ensure representativeness
- Look for competing explanations
- Analyze negative cases
- Keep methods & data in context

Triangulate data
- Use multiple methods to study a program.
- Collect multiple types of data on the same question.
- Use different interviewers to avoid the biases of any one data collector or of interviewers working alone.
- Use multiple perspectives (or theories) to interpret a set of data.

Practical strategies to triangulate
- Data sources
  – Compare interview data (focus group and semi-structured interview data) with observational data and documents.
  – Validate observational data with documents.
- Participants
  – Compare what participants say in public with what they say in private (for example, focus group data vs. individual interview data).
  – Check the consistency of what participants do and say over time.
  – Compare the perspectives of individuals within and across stakeholder groups.

Ensure representativeness
Verify "fit" and "work": representativeness in the data.

Look for rival and competing explanations
May employ an inductive or logical…