Transcript
Page 1: Developing an evaluation of professional development Webinar #4: Going Deeper into Analyzing Results 1

Developing an evaluation of professional development
Webinar #4: Going Deeper into Analyzing Results

Page 2

Information and materials mentioned or shown during this presentation are provided as resources and examples for the viewer's convenience. Their inclusion is not intended as an endorsement by the Regional Educational Laboratory Southeast or its funding source, the Institute of Education Sciences (Contract ED-IES-12-C-0011).

In addition, the instructional practices and assessments discussed or shown in these presentations are not intended to mandate, direct, or control a State’s, local educational agency’s, or school’s specific instructional content, academic achievement system and assessments, curriculum, or program of instruction. State and local programs may use any instructional content, achievement system and assessments, curriculum, or program of instruction they wish.

Page 3

Webinar 4: Outline

• Considerations for quantitative analyses – Dr. Sharon Koon
  – WWC evidence standards & strong studies
  – Calculating attrition
  – Calculating baseline equivalence
  – Statistical adjustments

• Considerations for qualitative analyses – Dr. La’Tara Osborne-Lampkin

• Question & answer session

Page 4

CONSIDERATIONS FOR QUANTITATIVE ANALYSES

Dr. Sharon Koon

Page 5

Distinction between WWC evidence standards and additional qualities of strong studies

• WWC design considerations:
  – Two groups: treatment (T) and comparison (C).
  – For randomized controlled trials (RCTs), low attrition.
  – For quasi-experimental designs (QEDs), baseline equivalence between T and C groups.
  – Contrast between T and C groups measures impact of the treatment.
  – Valid and reliable outcome data used to measure the impact of a treatment.
  – No known confounding factors.
  – Outcome(s) not overaligned with the treatment.
  – Same data collection process (same instruments, same time/year) for the T and C groups.

Source: http://www.dir-online.com/wp-content/uploads/2015/11/Designing-and-Conducting-Strong-Quasi-Experiments-in-Education-Version-2.pdf

Page 6

Distinction between WWC evidence standards and additional qualities of strong studies (cont.)

• Additional qualities of strong studies:
  – Pre-specified and clear primary and secondary research questions.
  – Generalizability of the study results.
  – Clear criteria for research sample eligibility and matching methods.
  – Sample size large enough to detect meaningful and statistically significant differences between the T and C groups overall and for specific subgroups of interest.
  – Analysis methods reflect the research questions, design, and sample selection procedures.
  – A clear plan to document the implementation experiences of the T and C conditions.

Source: http://www.dir-online.com/wp-content/uploads/2015/11/Designing-and-Conducting-Strong-Quasi-Experiments-in-Education-Version-2.pdf

Page 7

Determinants of a What Works Clearinghouse (WWC) study rating

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19

Page 8

Topics for discussion

• Attrition

• Baseline equivalence
  – Calculation of baseline equivalence
  – Adjustments for nonequivalence

• Effect-size corrections
  – Cluster correction
  – Multiple comparison correction

• Handling missing data

Page 9

Attrition

• For RCTs, the WWC is concerned about both overall attrition (i.e., the rate of attrition for the entire sample) and differential attrition (i.e., the difference in the rates of attrition for the intervention and comparison groups) because both types of attrition contribute to the potential bias of the estimated effect.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19

Page 10

Attrition (cont.)

• Overall attrition = Number without observed data / number randomized

• Differential attrition = [T without observed data / number T randomized] – [C without observed data / number C randomized]

• Attrition boundaries: liberal or conservative

• In order to be deemed an RCT with low attrition, a cluster RCT that reports an individual-level analysis must have low attrition at two levels. First, it must have low attrition at the cluster level. Second, the study must have low attrition at the subcluster level, with attrition based only on the clusters remaining in the sample.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
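
The two attrition formulas above can be sketched in a few lines of Python. This is only an illustration: the function name and the counts are hypothetical, and the sketch reports the signed T minus C difference exactly as the slide writes it (take the absolute value if only the size of the gap matters).

```python
def attrition_rates(t_randomized, t_observed, c_randomized, c_observed):
    """Return (overall, differential) attrition as proportions.

    "Observed" means units that still have outcome data at analysis time.
    """
    t_missing = t_randomized - t_observed
    c_missing = c_randomized - c_observed
    # Overall attrition: all units without data / all units randomized.
    overall = (t_missing + c_missing) / (t_randomized + c_randomized)
    # Differential attrition: T attrition rate minus C attrition rate.
    differential = t_missing / t_randomized - c_missing / c_randomized
    return overall, differential


# Hypothetical counts: 100 randomized per group, 80 T and 90 C observed.
overall, differential = attrition_rates(100, 80, 100, 90)
print(f"overall={overall:.2f}, differential={differential:.2f}")
```

For a cluster RCT, the same function could be called twice: once with cluster counts, and once with individual counts restricted to the clusters that remain in the sample.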

Page 12

Baseline equivalence

• For continuous outcomes, it is determined by the difference between the mean outcome for the T group and the mean outcome for the C group, divided by the pooled within-group standard deviation of the outcome measure (i.e., standardized mean difference).

• For dichotomous outcomes, it is determined by the difference in the probability of the occurrence of an event.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
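
As a rough sketch of the two calculations described on this slide, the Python below computes a standardized mean difference and a difference in event probabilities. The function names and numbers are hypothetical. The pooled standard deviation uses the conventional degrees-of-freedom-weighted formula, which is an assumption because the slide does not spell it out, and no small-sample correction is applied.

```python
import math


def pooled_sd(n_t, sd_t, n_c, sd_c):
    """Pooled within-group SD (conventional formula; an assumption here)."""
    return math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """(T mean - C mean) divided by the pooled within-group SD."""
    return (mean_t - mean_c) / pooled_sd(n_t, sd_t, n_c, sd_c)


def probability_difference(events_t, n_t, events_c, n_c):
    """Difference in the probability of an event, T minus C."""
    return events_t / n_t - events_c / n_c


# Hypothetical baseline data for two groups of 50.
smd = standardized_mean_difference(52.0, 10.0, 50, 50.0, 10.0, 50)
pdiff = probability_difference(30, 100, 25, 100)
print(f"standardized mean difference={smd:.2f}, probability difference={pdiff:.2f}")
```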

Page 13

Statistical adjustment for nonequivalence at baseline

• For differences in baseline characteristics that are between 0.05 and 0.25 standard deviations, the analysis must include a statistical adjustment for the baseline characteristics to meet the baseline equivalence requirement.

• A number of different techniques can be used, including regression adjustment and analysis of covariance (ANCOVA).

• The critical factor is that the appropriate baseline characteristics must be included in the analysis at the individual level (i.e., the unit of analysis).

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
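
One way to picture the adjustment is a single-covariate ANCOVA computed by hand. The sketch below is a minimal illustration, not the WWC's procedure: the function names and data are hypothetical, only one baseline covariate is handled, and treating the 0.05 and 0.25 boundaries as "greater than 0.05, up to and including 0.25" is an assumption about how "between" is read.

```python
from statistics import mean


def requires_adjustment(baseline_diff_sd):
    """True when the baseline difference (in SD units) falls in the
    0.05 to 0.25 range named on the slide (boundary handling assumed)."""
    return 0.05 < abs(baseline_diff_sd) <= 0.25


def ancova_adjusted_difference(y_t, x_t, y_c, x_c):
    """T minus C outcome difference, adjusted for one baseline covariate x
    measured on the same individuals as the outcome y."""

    def cross_products(y, x):
        x_bar, y_bar = mean(x), mean(y)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        return sxy, sxx

    sxy_t, sxx_t = cross_products(y_t, x_t)
    sxy_c, sxx_c = cross_products(y_c, x_c)
    # Pooled within-group slope of the outcome on the baseline measure.
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)
    raw_difference = mean(y_t) - mean(y_c)
    # Remove the part of the raw gap explained by the baseline gap.
    return raw_difference - slope * (mean(x_t) - mean(x_c))


# Hypothetical data built so the true treatment effect is 3.0.
x_t, y_t = [1, 2, 3, 4], [5.5, 6.0, 6.5, 7.0]
x_c, y_c = [0, 1, 2, 3], [2.0, 2.5, 3.0, 3.5]
print(ancova_adjusted_difference(y_t, x_t, y_c, x_c))
```

In practice a regression package would do the same job with more covariates; the hand calculation is only meant to show what "adjusting for baseline" removes from the raw difference.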

Page 14

Difference-in-difference adjustment

• The WWC applies this adjustment to effect size calculations based on unadjusted group means when the study is:
  – a QED with differences in baseline characteristics less than .05
  – an RCT with low attrition and differences in baseline characteristics
  – an RCT with high attrition and differences in baseline characteristics less than .05

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
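
The general idea of a difference-in-difference adjustment on unadjusted group means can be sketched as follows. This is a generic illustration with hypothetical names and numbers, and dividing by a pooled posttest standard deviation is an assumption; the slide does not give the WWC's exact formula, so the source document should be checked before relying on it.

```python
def difference_in_differences(pre_t, post_t, pre_c, post_c):
    """Gain in the T group minus gain in the C group."""
    return (post_t - pre_t) - (post_c - pre_c)


def did_effect_size(pre_t, post_t, pre_c, post_c, pooled_posttest_sd):
    """Difference-in-differences expressed in SD units.

    Standardizing by the pooled posttest SD is an assumption of this sketch.
    """
    return difference_in_differences(pre_t, post_t, pre_c, post_c) / pooled_posttest_sd


# Hypothetical unadjusted means: T starts 2 points behind C and ends 2 ahead.
effect = did_effect_size(pre_t=48.0, post_t=56.0, pre_c=50.0, post_c=54.0,
                         pooled_posttest_sd=10.0)
print(f"adjusted effect size={effect:.2f}")
```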

Page 15

Cluster correction

• A “mismatch” problem occurs when random assignment is carried out at the cluster level (e.g., school level) and the analysis is conducted at the individual level (e.g., teacher level), but the correlation among individuals within the same clusters is ignored in computing the standard errors of the impact estimates.

• When this happens, the standard errors of the impact estimates generally will be underestimated, thereby leading to overestimates of statistical significance.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19

Page 16

Cluster correction (cont.)

• The WWC computes clustering-corrected statistical significance estimates.

• The basic approach to the clustering correction is first to compute the t-statistic corresponding to the effect size that ignores clustering and then correct both the t-statistic and the associated degrees of freedom for clustering based on sample sizes, number of clusters, and an estimate of the intra-class correlation (ICC).

• The default ICC value is 0.20 for achievement outcomes and 0.10 for behavioral and attitudinal outcomes.

• The statistical significance estimate corrected for clustering is then obtained from the t-distribution using the corrected t-statistic and degrees of freedom.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
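
The steps above can be sketched in Python. The slide does not print the formulas, so the ones below are an assumption: they follow the correction for equal-sized clusters published by Hedges (2007), which is believed to be the form the WWC uses, and should be verified against the source document. The function name and example values are hypothetical.

```python
import math


def cluster_corrected_t(t, n_total, n_clusters, icc):
    """Return (corrected t-statistic, corrected degrees of freedom).

    t          : t-statistic computed while ignoring clustering
    n_total    : total number of individuals across both groups
    n_clusters : total number of clusters across both groups
    icc        : intra-class correlation (e.g., 0.20 for achievement)
    """
    n = n_total / n_clusters  # average cluster size
    numerator = (n_total - 2) - 2 * (n - 1) * icc
    # Shrink the t-statistic to account for within-cluster correlation.
    t_corrected = t * math.sqrt(
        numerator / ((n_total - 2) * (1 + (n - 1) * icc))
    )
    # Reduce the degrees of freedom for the same reason.
    df_corrected = numerator**2 / (
        (n_total - 2) * (1 - icc) ** 2
        + n * (n_total - 2 * n) * icc**2
        + 2 * (n_total - 2 * n) * icc * (1 - icc)
    )
    return t_corrected, df_corrected


# Hypothetical study: 200 teachers in 10 schools, default achievement ICC.
t_corr, df_corr = cluster_corrected_t(t=2.5, n_total=200, n_clusters=10, icc=0.20)
print(f"corrected t={t_corr:.3f}, corrected df={df_corr:.1f}")
```

The corrected p-value would then be read from a t-distribution with the corrected degrees of freedom, for example with a statistics package such as SciPy; that step is left out so the sketch needs only the standard library.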

Page 17

Multiple comparisons correction

• Repeated tests of highly correlated outcomes will lead to a greater likelihood of mistakenly concluding that the differences in means for outcomes of interest between the T and C groups are significantly different from zero (called Type I error in hypothesis testing).

• The WWC uses the Benjamini-Hochberg (BH) correction to reduce the possibility of making this type of error.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19

Page 18

Multiple comparisons correction (cont.)

• The BH correction is used in three types of situations:
  – studies that estimated effects of the intervention for multiple outcome measures in the same outcome domain using a single comparison group,
  – studies that estimated effects of the intervention for a given outcome measure using multiple comparison groups, and
  – studies that estimated effects of the intervention for multiple outcome measures in the same outcome domain using multiple comparison groups.

Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
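
For readers who want to see the mechanics, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure. The function name, the 0.05 significance level, and the p-values are hypothetical, and any WWC-specific details beyond the standard procedure are not represented.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where a finding stays significant
    after the Benjamini-Hochberg correction."""
    m = len(p_values)
    # Rank the p-values from smallest to largest, remembering positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is at or below rank * alpha / m.
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / m:
            largest_rank = rank
    # Every finding ranked at or below that rank is significant.
    significant = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[index] = True
    return significant


# Hypothetical p-values for four outcomes in the same domain.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Because the procedure is "step-up," a larger p-value that clears its own threshold also rescues every smaller p-value ranked beneath it.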

Page 19

Handling missing data

• RCTs & QEDs: Complete case analysis with no adjustment for missing data (for example, unadjusted means) or with covariate adjustment

• Low-attrition RCTs only:
  – Complete case analysis with nonresponse weights
  – Multiple imputation, separately by condition and using an established method (however, cannot be used to meet the attrition standard)
  – Maximum likelihood, separately by condition and using an established method

Source: http://ies.ed.gov/ncee/wwc/multimedia.aspx?sid=18
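
The simplest option on this slide, complete case analysis with unadjusted means, can be sketched as below. The record layout, the "T"/"C" labels, and the use of None for a missing outcome are all assumptions made for illustration.

```python
from statistics import mean


def complete_case_means(records):
    """Drop records with a missing outcome, then return
    (T mean, C mean, number of complete cases).

    records: list of (group, outcome) pairs; outcome is None when missing.
    """
    complete = [(group, y) for group, y in records if y is not None]
    t_outcomes = [y for group, y in complete if group == "T"]
    c_outcomes = [y for group, y in complete if group == "C"]
    return mean(t_outcomes), mean(c_outcomes), len(complete)


# Hypothetical data: one teacher in each group has no outcome score.
records = [
    ("T", 10.0), ("T", None), ("T", 14.0),
    ("C", 9.0), ("C", 11.0), ("C", None),
]
print(complete_case_means(records))
```

The dropped records are exactly the ones that count toward attrition, so the number of complete cases should be reported alongside the means.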

Page 20

CONSIDERATIONS FOR QUALITATIVE ANALYSES

Dr. La’Tara Osborne-Lampkin

Page 21

(Minichiello et al., 1990, p. 5)

Aspect | Qualitative | Quantitative
Conceptual | Concerned with understanding human behavior from the informant’s perspective | Concerned with discovering facts about human behavior
Conceptual | Assumes a dynamic & negotiated reality | Assumes a fixed & measurable reality
Methodological | Data are collected through participant observation & interviews | Data are collected through measuring things
Methodological | Data are analyzed by themes from descriptions by informants | Data are analyzed through numerical comparisons & statistical inferences
Methodological | Data are reported through narratives and the language of the informant | Data are reported through statistical analyses

Page 22

When analyzing qualitative data

Use an iterative approach guided by prior data collection and analysis

Use multiple, “reliable” researchers for analysis and interpretation

Document and outline steps and decisions made for data analysis (i.e., develop an audit trail)

Document the basis for inferences

Establish structural corroboration or coherence

(Miles & Huberman, 1994; Miles, Huberman & Saldaña, 2014; Patton, 1987)

Page 23

Five key considerations

Triangulate data

Ensure representativeness

Look for competing explanations

Analyze negative cases

Keep methods & data in context

Page 24

Triangulate data

• Use multiple methods to study a program

• Collect multiple types of data on the same question

• Use different interviewers and data collectors to avoid the biases of any one person working alone

• Use multiple perspectives (or theories) to interpret a set of data

Page 25

Practical strategies to triangulate

Data sources
• Compare interview data (focus group and semi-structured interview data) with observational data and documents
• Validate observational data with documents

Participants
• Compare what participants say in public with what they say in private (for example, focus group data vs. individual interview data)
• Check consistency of what participants do and say over time
• Compare the perspectives of individuals within and across stakeholder groups

Page 26

Verify “fit” and “work” “representativeness” in the data…

Page 27

Look for rival and competing explanations

• May employ an inductive or logical process

• Use data to support alternative explanations that are grounded in logic and theory

• Weigh the evidence and look for best fit

Page 28

Negative cases

• “Exceptions that prove the rule”

• Counter evidence

Page 29

Keeping methods and data in context

Limit conclusions to:
• those situations,
• time periods,
• persons, and
• contexts for which the data are applicable.

Keeping things in context is the cardinal principle of qualitative analysis (Patton, 1987).

Page 30

Questions & Answers

Homework: Bring remaining questions to session 5

Page 31

Developing an evaluation of professional development

• Webinar 5: Going Deeper into Interpreting Results & Presenting Findings 1/21/2016, 2:00pm
