Medical research relies heavily on statistical inference for generalizing findings, that is, for assessing the uncertainty in applying these findings to new patients. SPSS and similar packages have made complex statistical calculations possible with little or no understanding of statistical inference. As a consequence, research findings are misunderstood, their presentation is confusing, and their reliability is massively overestimated.
The SPSS-effect on medical research
Jonas Ranstam
Generalization
Medical research studies are typically performed for the benefit of other subjects than the participants.
Treatment effects in the observed patients
x, SD
Treatment effects in new patients
μ, σ
μ̂ (95% CI: μ_ll to μ_ul)
What we do know (have observed)
What we want to know (but never will)
The best estimate and its uncertainty
The uncertainty can in some cases also be presented as a probability value
Medical research and generalization
Treatment effects in the observed patients
x, SD
What we do know (have observed)
p < 0.05 or ns
Some weird stuff that no one understands but is necessary for getting manuscripts accepted
Medical research in practice
Treatment effects in the observed patients
x, SD
What we do know (have observed)
p < 0.05 or ns
Little (if anything) is mentioned about the uncertainty in the generalization of the findings.
Many (if not all) authors severely underestimate the uncertainty of their findings.
Medical research in practice
Statistical significance and insignificance are typically described as properties of the sample, not the population: “there was a significant difference”.
The presented conclusions are usually a summary of what has been observed in the sample.
SD, SEM and 95% CI are all believed to describe the variability of observed data.
This is the SPSS-effect on medical research.
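The difference between these three quantities can be illustrated with a minimal Python sketch. The sample values are hypothetical, and the interval uses the normal approximation (z = 1.96), which is an assumption made for brevity; for a sample this small a t quantile would give a somewhat wider interval.

```python
import math
import statistics


def summarize(sample, z=1.96):
    """Return the mean, SD, SEM and an approximate 95% CI for the mean.

    SD describes the spread of the observed values.
    SEM and the CI describe the uncertainty of the estimated population mean.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD (n - 1 in the denominator)
    sem = sd / math.sqrt(n)         # shrinks as n grows; SD does not
    return {
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci": (mean - z * sem, mean + z * sem),  # normal approximation
    }


# Hypothetical treatment effects observed in ten patients.
observed = [72, 85, 64, 90, 78, 81, 69, 88, 75, 80]
summary = summarize(observed)
print(summary)
```

Only the SD is a statement about the observed patients; the SEM and the confidence interval are statements about how well the sample mean estimates the mean in new patients.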
Statistics is about much more than statistical significance
Important phenomena are neglected
Examples:
- Regression-to-the-mean (RTM)
- Consequences of missing data
The placebo effect and regression to the mean
The Placebo effect is a real phenomenon
In conclusion, we believe that investigating the formation of behavioral and biological changes due to placebos deserves future efforts, as the placebo effect is a “real” neurobiological phenomenon that has important implications for clinical neuroscience research and medical care.
Meissner K. et al. The Placebo Effect: Advances from Different Methodological Approaches. J Neurosci 2011; 31:16117–16124
Problem
The vast majority of reports on placebos have estimated the effect of placebo as the change from baseline in the placebo group of a randomized trial after treatment.
The effect of placebo can thus not be distinguished from the natural course of the disease, regression to the mean, and the effects of other factors.
Systematic review of the placebo effect
114 trials - 8525 patients
We included studies if patients were assigned randomly to a placebo group or an untreated group (often there was also a third group that received active treatment).
Publication bias?
There was significant heterogeneity among the trials with continuous outcomes (P<0.001). The magnitude of the effect of placebo decreased with increasing sample size (P=0.05), indicating a possible bias related to the effects of small trials.
Conclusion
In conclusion, we found little evidence that placebos in general have powerful clinical effects.
Placebos had no significant pooled effect on subjective or objective binary or continuous objective outcomes.
We found significant effects of placebo on continuous subjective outcomes and for the treatment of pain but also bias related to larger effects in small trials.
The use of placebo outside the aegis of a controlled, properly designed clinical trial cannot be recommended.
Regression to the mean (RTM)
When an extreme group is selected from a population based on the measurement of a particular variable, and a second measurement is taken for the same group, the second mean will be closer to the population mean than the first measurement.
RTM
Any measurement taken consists of two components: the ‘true’ value plus a random error component. It is the random error component that contributes to RTM. If the value of the random error component is large, then the magnitude of the corresponding RTM effects are increased.
Hypothetical example: SF-36 PF
Baseline: mean = 80, SD = 17
Follow up: mean = 80, SD = 17
p ≈ 1.0
Hypothetical example: SF-36 PF
Baseline: mean = 48.7, SD = 8.6
Follow up: mean = 59.2, SD = 16.7
p < 0.001
Barnett AG, van der Pols JC, Dobson AJ. Regression to the mean: what it is and how to deal with it. Int J Epidemiol 2005;34:215–220
RTM - Easy to quantify (for Normally distributed endpoints)
Hypothetical example of RTM in SF-36 PF
Mean = 80, SD = 17, cut off = 60
r      RTM
0.0    28.4
0.1    25.5
0.2    22.7
0.3    19.9
0.4    17.0
0.5    14.2
0.6    11.3
0.7    8.5
0.8    5.7
0.9    2.8
1.0    0
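The slide does not state how these values were obtained. A plausible reading, offered here as an assumption, is the expected RTM effect for a Normally distributed endpoint described in the cited Barnett et al. paper: RTM = (1 - r) · SD · φ(z) / (1 - Φ(z)), with z = (mean - cut off) / SD when subjects below the cut off are selected and r the correlation between the two measurements. A short Python sketch under that assumption (the function name `rtm_effect` is illustrative):

```python
import math


def rtm_effect(mean, sd, cutoff, r):
    """Expected RTM effect when subjects with baseline below `cutoff` are selected.

    Assumes a Normally distributed endpoint with the same mean and SD at both
    occasions and a correlation r between the two measurements.
    """
    z = (mean - cutoff) / sd
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)   # phi(z)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))          # 1 - Phi(z)
    # The within-subject variance is (1 - r) * sd**2; dividing by the total SD
    # leaves (1 - r) * sd as the scale factor.
    return (1 - r) * sd * pdf / upper_tail


# SF-36 PF example from the slide: mean = 80, SD = 17, cut off = 60.
for tenth in range(11):
    r = tenth / 10
    print(f"{r:.1f}  {rtm_effect(80, 17, 60, r):.1f}")
```

Under these assumptions the computed values agree closely with the table: the effect is largest when the two measurements are uncorrelated and vanishes when r = 1.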
RTM
Evaluation of a single group’s development over time should be avoided, or should at least include a comparison with the expected RTM effect.
Examples
Diagnostic tests
New treatments
Public health efforts
Health care management
Clinical audits
Hospital comparisons
If one were a policy maker alert to the possibilities of using RTM to ‘prove’ an initiative, one might target hospitals at the bottom of the league table with an initiative, extra resources, for example. RTM, combined with a floor effect, will ensure that such a policy can be ‘proven’ to work.
Morton V, Torgerson DJ. Regression to the mean: treatment effect without the intervention. J Eval Clin Pract 2005;11:59-65.
The consequences of missing values
[Figure: A Randomized Trial. Flow diagram: inclusion/exclusion criteria, randomization to TRT and CTR, baseline measurement in each arm, loss to follow-up (missing data) in each arm, follow-up in TRT and CTR.]
Study populations
Intention-to-treat (ITT)
Patients are analyzed according to randomization outcome irrespective of received treatment or any protocol violation.
Per-protocol (PP)
The subgroup of the ITT population that has been treated according to the study protocol.
Full Analysis Set (FAS)
The ITT population with exclusion of missing data.
Consequence of missing data
Precision
- reduced power
- variability
Validity
- comparability of treatment groups
- the representativeness of the results
Missing data definitions
Missing outcome values
MCAR (missing completely at random)
- missingness is independent of both observed and unobserved variables.
MAR (missing at random)
- missingness depends only on observed variables.
MNAR (missing not at random)
- missingness depends on unobserved variables.
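A small simulation may make the three mechanisms concrete. Everything in it is a hypothetical assumption chosen for illustration: a standard Normal baseline, an outcome correlated with baseline, and the particular dropout probabilities. It compares the mean outcome among complete cases with the mean that would have been seen without any missing data.

```python
import random
import statistics


def complete_case_means(n=20000, seed=1):
    """Simulate one outcome under MCAR, MAR and MNAR dropout (hypothetical setup)."""
    rng = random.Random(seed)
    kept = {"full": [], "mcar": [], "mar": [], "mnar": []}
    for _ in range(n):
        baseline = rng.gauss(0, 1)                   # always observed
        outcome = 0.7 * baseline + rng.gauss(0, 1)   # may be missing
        kept["full"].append(outcome)
        # MCAR: dropout ignores every variable.
        if rng.random() < 0.7:
            kept["mcar"].append(outcome)
        # MAR: dropout depends only on the observed baseline.
        if baseline >= 0 or rng.random() < 0.4:
            kept["mar"].append(outcome)
        # MNAR: dropout depends on the outcome value that goes unobserved.
        if outcome >= 0 or rng.random() < 0.4:
            kept["mnar"].append(outcome)
    return {name: statistics.mean(values) for name, values in kept.items()}


means = complete_case_means()
print(means)
```

In this setup the complete-case mean stays close to the full-data mean only under MCAR. Under MAR the bias could in principle be removed by conditioning on baseline; under MNAR the observed data alone do not allow this.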
Handling of missing data
1. Complete case analysis (violates the ITT principle, not FAS)
2. Single imputation methods, e.g. LOCF, (biased p-values)
3. Multiple imputation, MI, (requires MCAR or MAR)
4. Mixed models, GEE (requires MCAR or MAR)
Sensitivity analysis
- Compare FAS results with Complete Case analysis results.
- Define missing values as failures.
- Worst case scenario analysis: Define missing values as failures in TRT and successes in CTR.
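The last two sensitivity analyses can be sketched for a binary endpoint as follows. The outcome data and the helper names (`success_rate`, `risk_difference`) are hypothetical and serve only as an illustration; missing outcomes are coded as `None`.

```python
def success_rate(outcomes, missing_as=None):
    """Proportion of successes; outcomes are True, False or None (missing).

    With missing_as=None, missing values are dropped (complete case analysis);
    otherwise every missing value is replaced by `missing_as`.
    """
    if missing_as is None:
        values = [o for o in outcomes if o is not None]
    else:
        values = [missing_as if o is None else o for o in outcomes]
    return sum(values) / len(values)


def risk_difference(trt, ctr, trt_missing=None, ctr_missing=None):
    """Difference in success proportions, TRT minus CTR."""
    return success_rate(trt, trt_missing) - success_rate(ctr, ctr_missing)


# Hypothetical trial: 50 patients per arm, 10 lost to follow-up in each arm.
trt = [True] * 30 + [False] * 10 + [None] * 10
ctr = [True] * 20 + [False] * 20 + [None] * 10

complete_case = risk_difference(trt, ctr)                  # missing values dropped
missing_failure = risk_difference(trt, ctr, False, False)  # missing = failure
worst_case = risk_difference(trt, ctr, False, True)        # failure in TRT, success in CTR
print(complete_case, missing_failure, worst_case)
```

In this made-up data set the apparent benefit shrinks when missing values are counted as failures and disappears in the worst case scenario, which is the kind of fragility a sensitivity analysis is meant to reveal.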