1
The Road Map to a Successful Study Design
Lisa Kaltenbach, MS
Biostatistician II
February 6, [email protected]
Office: D2220 MCN
2
3
Can’t go back in time in research!
“To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of.”
-R.A. Fisher, Indian Statistical Congress, Sankhya, ca 1938
4
Outline: The Road to Success
• How to begin clinical research
• Important considerations when designing a study
• Types of study designs
• Examples
5
Components of a Study Protocol:
• The structure of a research project is set out in its protocol.
• Protocols are well known as devices for seeking grant funds, but they also help the investigator organize his/her research in a logical, focused, and efficient way.
Element: Purpose
• Research questions: What questions will the study address?
• Significance (background): Why are these questions important?
• Design: How is the study structured?
– Time frame
– Epidemiologic approach
• Subjects: Who are the subjects and how will they be selected?
– Selection criteria
– Sampling design
• Variables: What measurements will be made/recorded?
– Predictor variables
– Confounding variables
– Outcome variables
• Statistical issues: How large is the study and how will it be analyzed?
– Hypotheses
– Sample size
– Analytic approach
6
What do you wish to learn?
• If only one question could be answered by the project, what would that question be?
• Often people would like to describe how likely a theory or hypothesis is in light of a particular set of data. This is not possible in the commonly used classical/frequentist approach to statistics. Instead, statistics talks about the probability of observing particular sets of data, assuming a theory holds. We are not allowed to say, "Because I've seen these data, there is only a small probability that this theory is true." Instead, we say, "The probability of seeing data like these is very small if the theory is true."
• In order to show an effect exists,
– statistics begins by assuming there is no effect.
– Prior to collecting data, rules are chosen to decide whether the data are consistent with the assumption of no effect.
– If the data are found to be inconsistent with the assumption, the assumption is rejected and we conclude that there is an effect.
• Classical statistics works by comparing study data to what is expected when there is nothing.
• If the data are not typical of what is seen when there is nothing, we conclude there is something. Usually "not typical" means that some summary of the data is so extreme that it is seen less than 5% of the time when there is nothing.
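The reasoning above can be sketched numerically. The following is a minimal illustration only, assuming a made-up study in which 16 of 20 subjects improve and "no effect" means each subject is equally likely to improve or not. The counts and the function name are hypothetical and do not come from the talk.

```python
from math import comb

def two_sided_binomial_p(successes, n, p_null=0.5):
    """Probability, assuming no effect, of an outcome at least as unlikely as the observed one."""
    def pmf(k):
        return comb(n, k) * p_null**k * (1 - p_null) ** (n - k)

    observed = pmf(successes)
    # Add up every outcome that is no more likely than what was observed.
    # The small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))

# Hypothetical data: 16 of 20 subjects improved.
p_value = two_sided_binomial_p(16, 20)

# The decision rule (5%) is chosen before the data are collected.
reject_no_effect = p_value < 0.05
```

Note that the result is the probability of the data given "no effect", not the probability that "no effect" is true.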
7
Judging a project's feasibility
• Can everything that needs to be measured be measured?
• If the study involves some condition, can we define and recognize it?
– What is an unhealthy eating behavior?
– What's the difference between a cold and the flu?
– What do we mean by family income or improved nutritional status?
• How accurate and consistent are the measurements? How accurate do they need to be? What causes them to be inaccurate or inconsistent?
– Calcium intake is easy to measure because there are only a few major sources of calcium. Salt intake is hard because salt is everywhere.
– Will respondents reveal their income?
– Can others get the same value (inter-laboratory, inter-technician variability)?
• How do we choose among different measurement techniques?
– Is a mechanical blood pressure cuff better than using a stethoscope?
– Is there a gold standard? Is it worth paying for?
• Sometimes merely measuring something changes it in unexpected ways.
– Does asking people to keep records of dietary intake cause them to change their intake?
• Resources (time and money)
8
Types of design
• Analytic
– Experimental: Randomized Clinical Trial, Non-randomized Clinical Trial
– Non-experimental: Cohort, Cross-sectional, Case-control, Other
• Descriptive
– Community Survey
9
Considerations when choosing a Study Design
• No one approach is always better than the others.
• Each research question requires a judgment about which design is the most efficient way to get a satisfactory answer.
• A common sequence for studying a topic:
– Descriptive studies
• How common is estrogen treatment in women after menopause?
– Analytic studies to evaluate associations and discover cause-and-effect relationships
• Is taking estrogen after menopause associated with lower risk of CHD?
– Clinical trial to establish the effects of an intervention
• Does hormone treatment alter the incidence of CHD?
10
Examples of common clinical research designs used to study whether hormone therapy after menopause prevents coronary heart disease
Experimental Design
• Randomized blinded trial
– Key feature: Two groups created by a random process, and a blinded intervention
– Example: The investigator randomly assigns women to receive hormone or identical placebo, then follows both treatment groups for several years to observe the incidence of heart attacks.
Observational Designs
• Cohort study
– Key feature: A group followed over time
– Example: The investigator examines a cohort of women yearly for several years, observing the incidence of heart attacks in hormone users and non-users.
• Case-control study
– Key feature: Two groups, based on the outcome
– Example: The investigator examines a group of women with heart attacks (the “cases”) and compares them with a group of healthy women (the “controls”), asking about hormone use.
• Cross-sectional study
– Key feature: A group examined at one point in time
– Example: The investigator examines the group of women once, observing the prevalence of a history of heart attacks in hormone users and non-users.
11
Statistical Issues
3 Step Process:
1. Define specific aim
– Hypothesis: Women who receive estrogen treatment after menopause will have fewer heart attacks than those who do not.
2. Calculate the sample size, the number of subjects needed to observe the expected difference in outcome between study groups with a reasonable degree of probability, or power.
3. Select statistical methods needed to produce an acceptable level of precision when confidence intervals are calculated for the means, proportions, or other descriptive statistics.
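Step 2 can be illustrated with a rough calculation. This is a minimal sketch, assuming a two-sided comparison of two proportions with the usual normal approximation. The heart attack rates (10% without treatment, 5% with treatment), the 5% significance level, and the 80% power are hypothetical planning values, not figures from the talk.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of subjects per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # cutoff for the significance level
    z_beta = NormalDist().inv_cdf(power)           # cutoff for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical planning values: 10% event rate untreated, 5% treated.
n_per_group = sample_size_two_proportions(0.10, 0.05)
```

Smaller expected differences or higher power both increase the required sample size, which is why this step belongs before data collection.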
12
Randomized Clinical Trials
• In simplest implementation:
– Subject enrolls in study
– Randomly assigned to one of ≥ 2 treatments
– Followed up until end of study or outcome measure is obtained
• Outcome comparisons are made among treatment groups
• Treatment groups should be comparable on measured and unmeasured covariates due to randomization
• Strongest design to establish causal relationships
• May be beneficial to blind subjects/investigators to treatment groups
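The random assignment step can be sketched in a few lines. This is only an illustration, assuming permuted-block randomization, which is one of several possible schemes and is not named in the talk. The treatment labels, block size, and seed are hypothetical.

```python
import random

def block_randomize(n_subjects, treatments=("hormone", "placebo"), block_size=4, seed=2024):
    """Assign subjects to treatments in shuffled blocks so group sizes stay balanced."""
    if block_size % len(treatments) != 0:
        raise ValueError("block_size must be a multiple of the number of treatments")
    rng = random.Random(seed)  # fixed seed so the allocation list can be reproduced
    assignments = []
    while len(assignments) < n_subjects:
        # Each block contains every treatment equally often, in random order.
        block = list(treatments) * (block_size // len(treatments))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_subjects]

allocation = block_randomize(20)
```

In a blinded trial the allocation list would be held by someone who is not involved in enrolling or assessing subjects.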
13
Randomized Clinical Trial Example
14
Cohort Studies
• Exposure not randomly assigned, but assessed
– Sample selection and analysis can minimize confounding
• Need sufficient # of subjects/events
• Prospective Cohort
– Outcomes are future events
• Retrospective Cohort
– Outcomes have already occurred
• Can study multiple outcomes
• Limitations: no control over risk factors; may yield insufficient numbers of subjects or events
15
Cohort Study Example
Example with dichotomous risk factor and outcome
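With a dichotomous risk factor and a dichotomous outcome, a cohort study is commonly summarized by the relative risk. The sketch below assumes that summary measure. The counts are invented for illustration and are not data from the talk.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Ratio of the outcome risk in the exposed group to the risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 heart attacks among 1000 hormone users,
# 60 heart attacks among 1000 non-users.
rr = relative_risk(30, 1000, 60, 1000)
```

A relative risk below 1 means the outcome was less frequent in the exposed group. Because exposure was not randomly assigned, it does not by itself show that the exposure caused the difference.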
16
Cross-sectional Studies
• All variables are measured at same time
• Valuable for providing descriptive information about prevalence
• But weaker evidence for causality as predictor is not shown to precede outcome
17
Cross-Sectional Study Example
18
Case-Control
1. Subjects are identified as cases based on outcome status
2. Identify comparable controls (challenging)
3. Retrospectively determine prior exposure
*Big challenge to account for all differences between cases & controls that could explain relationship between exposure & case status
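Because a case-control study fixes the number of cases and controls, risks cannot be estimated directly, and the association is usually summarized with an odds ratio. The sketch below assumes that summary measure and uses invented counts.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of prior exposure among cases divided by odds of prior exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls

# Hypothetical study: of 100 women with heart attacks, 20 had used hormones;
# of 100 healthy controls, 40 had used hormones.
or_estimate = odds_ratio(20, 80, 40, 60)
```

An odds ratio below 1 means prior exposure was less common among cases than among controls.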
19
Case-Control Example
20
Summary of how research works
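[Figure: research question → study plan → actual study]
• Research question (truth in the universe)
– Target population: women aged 50-69
– Phenomena of interest: the proportion who take estrogen
• Study plan (truth in the study)
– Intended sample: all women aged 50-69 seen in UCSF primary care clinic in one year
– Intended variables: self-reported estrogen treatment
• Actual study (findings in the study)
– Actual subjects
– Actual measurements
• Errors can enter at the design step (research question to study plan) and at the implementation step (study plan to actual study), and they limit what can be inferred from the findings.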
21
Sampling Errors: Threaten inferences from study subjects to population of interest
• Random error is a wrong result due to chance – unknown sources of variation that are equally likely to distort the sample in either direction.
– If the true prevalence of estrogen treatment in 50-to-69-year-old women is 20%, a well-designed sample of 100 patients from that population might contain exactly 20 patients taking estrogen. More likely, however, the sample would contain a nearby number such as 18, 19, 21, or 22. Occasionally, chance would produce a substantially different number, such as 12 or 28.
– Reduce the influence of random error by increasing the sample size. The use of a larger sample diminishes the likelihood of a wrong result by increasing the precision of the estimate, that is, the degree to which the observed prevalence approximates 20% each time a sample is drawn.
• Systematic error is a wrong result due to bias (sources of variation that distort the study findings in one direction).
– Example: a sample drawn from patients who come to the primary care clinic, who might be more likely than average to adopt medical treatments, would tend to overestimate estrogen use. Increasing the sample size has no effect on systematic error. The only way to improve the accuracy of the estimate is to design the study in a way that either reduces the size of the various biases or gives some information about them. An example would be to draw a second sample of women from a setting that may be less likely to bias the proportion of women treated with estrogen (e.g., employees in a corporation), and to compare the observed prevalence in the two samples.
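The contrast between the two kinds of error can be sketched as follows. This is a minimal illustration: the 20% true prevalence comes from the example above, while the 30% prevalence among clinic attenders, the sample sizes, and the seed are hypothetical values chosen only to show the behavior.

```python
import random
from math import sqrt

def standard_error(prevalence, n):
    """Typical size of the random error in an observed proportion from a sample of n."""
    return sqrt(prevalence * (1 - prevalence) / n)

def observed_prevalence(source_prevalence, n, rng):
    """Simulate drawing n subjects from a source with the given prevalence."""
    return sum(rng.random() < source_prevalence for _ in range(n)) / n

# Random error shrinks as the sample grows.
se_small = standard_error(0.20, 100)      # about 4 percentage points
se_large = standard_error(0.20, 10_000)   # about 0.4 percentage points

# Systematic error does not: a very large sample from a biased source
# (hypothetically 30% among clinic attenders) settles near 30%, not 20%.
rng = random.Random(0)
clinic_estimate = observed_prevalence(0.30, 100_000, rng)
```

A larger sample makes the estimate more precise around whatever the sampled source contains, which is why a biased source has to be addressed in the design.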
22
Summary
• Plan ahead!
• We all want to do research that produces valid results, is worthy of publication, and meets with the approval of our peers. This begins with a carefully crafted research question and an appropriate study design.
23
References
• Dallal website
• Hulley, SB, et al. 2001, 2nd ed. Designing Clinical Research, Lippincott Williams & Wilkins; Philadelphia, PA.
• Wikipedia