Study Design in Molecular Epidemiology of Cancer
Epi243
Zuo-Feng Zhang, MD, PhD
Objectives of Molecular Epidemiology
To gain knowledge about the distribution and determinants of disease occurrence and outcome that may be applied to reduce the frequency and impact of disease in human populations.
Epidemiological Study Design and Analysis
• Transitional studies provide a bridge between the use of biomarkers in laboratory experiments and their use in cancer epidemiological studies.
• They are used to characterize biomarkers
• They address problems in the use of biomarkers
• They serve as preliminary results rather than end results about cancer etiology and prevention
Epidemiological Study Design and Analysis
Transitional studies:
• Measure intra- and inter-subject variability
• Explore the feasibility of marker use under field conditions
• Identify potential confounding and effect-modifying factors for the marker
• Study mechanisms reflected by the biomarker
Transitional Studies
Transitional studies can be divided into three functional categories:
• Developmental
• Characterization
• Applied studies
Transitional Studies: Developmental Studies
Developmental studies involve determining:
• the biological relevance of the marker
• its pharmacokinetics
• the reproducibility of measurement of the marker
• the optimal conditions for collecting, processing, and storing biological specimens in which the marker is to be measured
Transitional Studies: Characterization
Assessing inter-individual variation and the genetic and acquired factors that influence the variation of biomarkers in populations
Transitional Studies: Characterization
• Assessing frequency or level of a marker in populations
• Identifying factors that are potential confounders or effect modifiers
Transitional Studies: Characterization
• Establishing the components of variance in biomarker measurement: laboratory variability, intra-individual variation, and inter-individual variation. The ratio of intra-individual variation to inter-individual variation has important implications for study size and power.
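As a rough illustration of why this ratio matters, the sketch below computes the reliability (intraclass correlation, ICC) of a marker from assumed variance components, and the approximate factor by which a study would need to grow to offset measurement variability. The variance values are hypothetical, and the 1/ICC inflation is a common approximation for a continuous marker with non-differential error, not a result taken from these slides.

```python
def reliability(var_between: float, var_within: float, n_repeats: int = 1) -> float:
    """ICC of the mean of n_repeats measurements per subject.

    var_between: inter-individual variance.
    var_within:  intra-individual plus laboratory variance (lumped together here).
    """
    return var_between / (var_between + var_within / n_repeats)


def sample_size_inflation(var_between: float, var_within: float, n_repeats: int = 1) -> float:
    """Approximate multiplier on study size relative to an error-free marker (assumes 1/ICC)."""
    return 1.0 / reliability(var_between, var_within, n_repeats)


# Hypothetical variance components: within-subject variance equal to between-subject variance.
print(reliability(1.0, 1.0))               # single measurement per subject
print(sample_size_inflation(1.0, 1.0))     # approximate inflation with one measurement
print(reliability(1.0, 1.0, n_repeats=4))  # averaging four repeats improves reliability
```

Under these assumptions, a larger intra- to inter-individual variance ratio lowers the ICC and raises the required study size, while repeated measurements per subject partly recover it.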
Transitional Studies: Applied Studies
• The applied studies assess the relationship between a marker and the event that it marks, including exposure, pre-clinical effects, disease, and susceptibility
• The design is usually cross-sectional or short-term longitudinal and is not intended to establish or refute a causal relationship between a given exposure and disease.
Transitional Studies: Ethical Issues
• The objectives of the research generally are not to identify health risks, but to identify characteristics of the biomarker or the distribution of the marker by population subtypes.
• The meaning of the biomarker results is usually unknown.
• There is a need to anticipate the impact of transitional studies on study subjects and plan to address their concerns.
Cohort or Case-Control Studies
• In clinic-based cohort studies of treated patients or screened populations, the inclusion of biological measures of exposure and susceptibility is both methodologically sound and logistically feasible
Cohort or Case-Control Studies
• In population-based studies, the collection of biological material for such markers is feasible but logistically more complex.
• For early biological markers, collection of materials (e.g., pre-cancerous lesions) is logistically feasible in a hospital setting, but becomes more difficult in a population setting
Prospective Studies: Strengths
• Exposure is measured before the outcome
• The source population is defined
• The participation rate is high if specimens are available for all subjects and follow-up is complete
Prospective Studies: Weaknesses
• The number of cases is usually small for each of the many types of cancer
• Insufficient specimen if the biomarker requires large amounts of specimen or unusual specimens
• Degradation of the biomarkers during long-term storage
• The lack of details on other potentially confounding or interacting exposures
Prospective Studies
• The major concern in cohort studies of short duration (as in case-control studies) is the possibility that the disease process has influenced the biomarker level among cases diagnosed within 1 to 2 years of specimen collection.
Prospective Studies: Misclassification
• In prospective studies of longer duration, there may be considerable misclassification of the etiologically relevant exposures if the specimens have been collected only at baseline.
• This misclassification occurs when an individual's exposure level changes systematically over time or when there is intra-individual variation in biomarker level.
Prospective Studies: Intra-Individual Variation
• The intra-individual misclassification may be reduced by taking multiple samples, but this will generally increase expenses of sample collection and storage and the burden on study subjects
• Similar approaches apply to taking samples at several points in time in an attempt to estimate time-integrated exposures or exposure change.
Prospective Studies
• An alternative approach is to estimate the extent of intra-individual variation, and the misclassification involved in taking single specimens, by taking multiple specimens in a sample of the cohort.
• This information can be used to correct for the bias toward the null that non-differential misclassification introduces, and thereby de-attenuate the observed relative risks
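One common way to apply such a correction is the regression-dilution approach sketched below. It assumes a continuous biomarker with classical, non-differential measurement error, a log-linear dose-response, and an ICC estimated from the repeat-specimen subsample; the function name and the numbers are hypothetical and are not taken from these slides.

```python
import math


def deattenuate_rr(observed_rr: float, icc: float) -> float:
    """De-attenuate a per-unit relative risk by dividing its log by the reliability (ICC)."""
    if not 0.0 < icc <= 1.0:
        raise ValueError("icc must lie in (0, 1]")
    return math.exp(math.log(observed_rr) / icc)


# Hypothetical inputs: observed RR of 1.5 per unit of marker, ICC of 0.5 from the subsample.
print(deattenuate_rr(1.5, 0.5))
```

With perfect reliability (ICC of 1) the observed and corrected relative risks coincide; the lower the ICC, the further the corrected estimate moves away from the null.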
Prospective Studies: Ethical Issues
• Repeated contact of subjects
• Informing cohort members of their biomarker level is problematic if the biomarker is not considered sufficiently predictive of disease and if there are no preventive steps cohort members can take to reduce their risk of the disease
Nested Case-Control Study
• The biomarker can be measured in specimens matched on storage duration
• The case-control set can be analyzed in the same laboratory batch, reducing the potential for bias introduced by sample degradation and laboratory drift
Case-Cohort Study Design
• Collecting specimens at baseline for the entire cohort and then collecting specimens from cases as they occur.
• Measuring the biomarker using the newly collected specimens and using the baseline cohort specimens as controls.
• Because the specimens for cases and controls are taken at different times, bias will be introduced if sample degradation or lab drift occurs over time
Case-Control Study Design
• For genetic susceptibility markers, case-control study design is highly appropriate
• Clinic-based case-control studies are particularly suitable for studies of intermediate endpoints, as these endpoints can be systematically measured.
• Clinic-based case-control studies are excellent for studying etiology of precancerous lesions (e.g., CIN)
Case-Control Study Design
• Biomarkers of internal dose (e.g., carrier status for infectious agents, such as HBsAg) or effective dose (PAH DNA adducts) are appropriate when they are stable over a long period of time or when the exposures have been constant over the exposure period. However, it is essential that they are not affected by the disease process, diagnosis, or treatment.
The Case-Case Design: Applications in Tumor Markers and Genetic Polymorphisms Studies
Case-Case Study Design
• To identify etiological heterogeneity
• To evaluate gene-environment interaction
Case-Case Study Design
• Case-only, Case-series, etc.
• Studies of cases only, without controls
• Can be employed to evaluate the etiological heterogeneity when studying tumor markers and exposure
• May be used to assess the statistical gene-environment or gene-gene interactions
Interaction Assessment using Case-Control Study
Genotype abnormal: OR1 = OR11/OR10
Genotype normal: OR2 = OR01
Interaction measure: OR1/OR2
OR Interaction = OR11/(OR10 x OR01)
Comparison of Case-Control and Case-Case Study designs
Parameter          Case-control                   Case-Case
Beta(01)           OR01                           Not measured
Beta(10)           OR10                           Not measured
Beta interaction   ORint = OR11/(OR01 x OR10)     Measured
Beta(11)           OR11 = OR01 x OR10 x ORint     Not measured
Assumptions for Case-Case Study Design
• Exposure and genotype occur independently in the population
• The risk of disease is small (or the disease is rare) at all levels of the study variables
From Rothman & Greenland, p.615
Smoking and TGF-alpha Polymorphism
Smoking   TGF-alpha   Cases      Controls    Adjusted OR
Never     Normal      36 (A00)   167 (B00)   1.0 (OR00)
Never     Positive     7 (A01)    34 (B01)   1.0 (OR01)
Yes       Normal      13 (A10)    69 (B10)   0.9 (OR10)
Yes       Positive    13 (A11)    11 (B11)   5.5 (OR11)
OR int = OR11/(OR01 x OR10) = 5.5/(1.0 x 0.9) = 6.1
OR11 = (A11 x B00)/(A00 x B11)
OR CA (exposure-genotype OR among cases) = (A11 x A00)/(A10 x A01) = (13 x 36)/(13 x 7) = 5.1
OR CO = exposure-genotype OR among controls
OR int = OR CA/OR CO = OR11/(OR01 x OR10)
OR CA = [OR11/(OR01 x OR10)] x OR CO
Assumption: OR CO = 1, so OR int = OR CA
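The identities above can be checked numerically from the cell counts in the table. The sketch below uses crude (unadjusted) odds ratios, so its values will differ somewhat from the adjusted ORs shown in the table; the variable names are illustrative.

```python
# Cell counts from the table, keyed by (smoking, genotype): 0 = never/normal, 1 = yes/positive.
cases = {(0, 0): 36, (0, 1): 7, (1, 0): 13, (1, 1): 13}
controls = {(0, 0): 167, (0, 1): 34, (1, 0): 69, (1, 1): 11}


def crude_or(cell):
    """Crude OR for a cell relative to the doubly unexposed reference cell (0, 0)."""
    return (cases[cell] * controls[(0, 0)]) / (cases[(0, 0)] * controls[cell])


def exposure_genotype_or(counts):
    """Cross-product OR between smoking and genotype within one group (cases or controls)."""
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])


or01, or10, or11 = crude_or((0, 1)), crude_or((1, 0)), crude_or((1, 1))
or_int = or11 / (or01 * or10)          # case-control interaction OR
or_ca = exposure_genotype_or(cases)    # case-only OR
or_co = exposure_genotype_or(controls) # same quantity among controls

print(f"OR11={or11:.2f} ORint={or_int:.2f} ORca={or_ca:.2f} ORco={or_co:.2f}")
print(f"ORca/ORco={or_ca / or_co:.2f}")  # equals the crude ORint
```

In these counts OR CO is not exactly 1, which is why the case-only OR and the case-control interaction OR do not coincide; the case-only estimate equals the interaction OR only to the extent that the independence assumption holds.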
Sample Size
Design         Main effect (RR = 2.0)     Interaction (RR = 2.0)
Case-control   150 cases, 150 controls    600 cases, 600 controls
Case-Case      -                          300 cases
Strengths of Case-Case Study Design
• Case-Case study design offers greater precision for estimating gene-environment interaction than case-control study design
• The power for detecting gene-environment interactions in a case-case study is comparable to the power for assessing a main effect in a classic case-control study, which leads to a reduced sample size for interaction assessment.
Strengths of Case-Case Study Design
• Only cases are needed, thus avoiding the difficult and often unsatisfactory selection of appropriate controls (avoiding selection bias for controls)
Limitations of Case-Case Study Design
• The main effects of susceptible genotype (G) and environment effect (E) cannot be estimated
• The case-case design evaluates interaction only as a departure from multiplicative effects; it will miss gene-environment models defined by departures from additivity.
Intervention Studies
• In studies of smoking cessation intervention, we can measure either serum cotinine or protein or DNA adducts (exposure) or p53 mutation, dysplasia and cell proliferation (intermediate markers for disease)
• Biomarkers can also measure compliance with the intervention, for example by assaying serum β-carotene in a randomized trial of β-carotene.
Intervention Studies
Susceptibility markers (GSTM1) can also be used to determine whether the randomization is successful (comparable intervention and control arms)
Family Studies
• Does familial aggregation exist for a specific disease or characteristic?
• Is the aggregation due to genetic factors or environmental factors, or both?
• If a genetic component exists, how many genes are involved and what is their mode of inheritance?
• What is the physical location of these genes and what is their function?
Issues in Study Design and Analysis
• Relating a particular disease (or marker of early effect) to a particular exposure, while minimizing bias, controlling for confounding, assessing and minimizing random error, and assessing interactions
Sample Size and Power Consideration
EPI243: Molecular Epidemiology of Cancer
Sample Size and Power
• False positive (alpha-level, or Type I error). The alpha-levels traditionally used and accepted are 0.01 and 0.05. The smaller the alpha-level, the larger the required sample size.
Sample Size and Power
• False negative (beta-level, or Type II error). (1-beta) is called the power of the study. Investigators like to have a power of around 0.80 or 0.95 when planning a study, which means that there is an 80% or 95% chance of finding a statistically significant difference between study and control groups if a difference of the assumed size truly exists.
Sample Size and Power
• The difference between study and control groups (delta). Two factors need to be considered here: one is what difference is clinically important, and the other is what difference has been reported by previous studies.
Sample Size and Power
• Variability. The greater the variability of the data, the larger the required sample size.
Power or Sample Size Estimate for Case-Control Studies
• Alpha-level (false positive): 0.05
• Beta-level (false negative level; 1-beta=power): 0.20
• Delta-level: Proportion of exposure in controls and exposure in cases or expected odds ratio
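To show how these three inputs combine, here is a minimal sketch of a standard normal-approximation formula for an unmatched case-control study with one control per case. The unmatched design, the function name, and the example values are assumptions for illustration; a matched design or dedicated power software would give different numbers.

```python
import math
from statistics import NormalDist


def cases_needed(p0: float, odds_ratio: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of cases (with an equal number of controls), two-sided test.

    p0: proportion of controls exposed; odds_ratio: odds ratio to be detected.
    """
    # Exposure proportion among cases implied by p0 and the odds ratio.
    p1 = p0 * odds_ratio / (1.0 + p0 * (odds_ratio - 1.0))
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2.0
    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


# Hypothetical scenario: 20% of controls exposed, alpha = 0.05, power = 0.80.
for target_or in (1.5, 2.0, 2.5):
    print(target_or, cases_needed(0.20, target_or))
```

The required number of cases falls as the odds ratio to be detected grows, and rises when alpha is tightened or the desired power is increased.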
Power Estimate
Sample Size Estimate
[Figure: N (y-axis, 0 to 2500) vs P0 (x-axis, 0.0 to 0.5), one curve per OR (1.50, 2.00, 2.50); Phi=0.20, M=1, Alpha=0.05, Power=0.80, M.C.C. Test]
Estimate Minimum Detectable Odds Ratios
[Figure: OR (y-axis, 1 to 9) vs P0 (x-axis, 0.0 to 1.0), one curve per N (200, 400, 600); Phi=0.20, M=1, Alpha=0.05, Power=0.80, M.C.C. Test]
Gene-Environment (Gene-Gene) Interaction
EPI242: Molecular Epidemiology
Zuo-Feng Zhang, MD, PhD
Definition for Interaction
• Interaction (effect modification) occurs when the estimate of the effect of exposure depends on the level of another factor in the study base.
• Interaction is distinct from confounding (and from selection or information bias): it is not a bias but a real difference in the effect of exposure across subgroups, which may be of considerable interest.
Interaction Assessment
                     Factor A
                     Absent   Present
Factor B   Absent    RR00     RR01
           Present   RR10     RR11
Interaction Assessment
• RR00, relative risk when both factors absent
• RR01, relative risk when factor A present only
• RR10, relative risk when factor B present only
• RR11, relative risk when both factors A & B present
Interaction Assessment
• Combined RR = RR11
RR11 > RR01 x RR10 indicates more than multiplicative interaction
or RR11/RR10 > or < RR01/RR00
or RR11/(RR01 x RR10) > or < 1
• Interaction RR = RR11/(RR01 x RR10)
Odds Ratios for Two Factors: Interaction?
                     Factor B
                     Absent   Present
Factor A   Absent    1.0      2.5
           Present   4.0      10.0
No more than multiplicative interaction
• ORs for factor B: 2.5 when factor A absent; 2.5 (10.0/4.0) when factor A present
• ORs for factor A: 4.0 when factor B absent; 4.0 (10.0/2.5) when factor B present
Odds Ratios for Two Factors: Interaction?
                     Factor B
                     Absent   Present
Factor A   Absent    1.0      2.5
           Present   4.0      20.0
More than Multiplicative Interaction, Positive Quantitative Interaction
• ORs for factor B: 2.5 when factor A absent; 5.0 (20.0/4.0) when factor A present
• ORs for factor A: 4.0 when factor B absent; 8.0 (20.0/2.5) when factor B present
Odds Ratios for Two Factors: Interaction?
                     Factor B
                     Absent   Present
Factor A   Absent    1.0      2.5
           Present   4.0      5.0
Less than Multiplicative Interaction, Negative Quantitative Interaction
• Both factors increase the risk regardless of the value of the other factor, but the combined effect (5.0) is less than the product of the two (10.0), although greater than that of either factor alone, giving a negative quantitative interaction.
Odds Ratios for Two Factors: Interaction?
                     Factor B
                     Absent   Present
Factor A   Absent    1.0      2.5
           Present   4.0      4.0
Less than Multiplicative Interaction, Negative Quantitative Interaction
• Both factors increase the risk
• When A is present, there is no additional effect of factor B
• Adding factor A to factor B only increases the risk to the degree found for factor A alone (4.0), leading to a negative quantitative interaction.
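The four example tables above can be summarized with the interaction OR, OR11/(OR01 x OR10). The sketch below applies it to the joint ORs from the examples (10.0, 20.0, 5.0, 4.0) with single-factor ORs of 4.0 and 2.5; the function names and labels are illustrative.

```python
def interaction_or(or_a_only: float, or_b_only: float, or_both: float) -> float:
    """Ratio of the joint OR to the product of the two single-factor ORs."""
    return or_both / (or_a_only * or_b_only)


def describe(ratio: float, tol: float = 1e-9) -> str:
    """Label the departure from a multiplicative joint effect."""
    if abs(ratio - 1.0) <= tol:
        return "multiplicative (no interaction on this scale)"
    return "more than multiplicative" if ratio > 1.0 else "less than multiplicative"


# Single-factor ORs shared by all four examples: A only = 4.0, B only = 2.5.
for or_both in (10.0, 20.0, 5.0, 4.0):
    ratio = interaction_or(4.0, 2.5, or_both)
    print(f"joint OR {or_both:>4}: interaction OR = {ratio:.1f} -> {describe(ratio)}")
```

A ratio above 1 corresponds to a positive interaction on the multiplicative scale and a ratio below 1 to a negative one.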
Sample Size Consideration for Interaction Assessment
• Evaluation of interaction requires a substantial increase in study size. For example, in a case-control study it involves comparing the sizes of the odds ratios (relating exposure and disease) in different strata of the effect modifier, rather than merely testing whether the overall odds ratio differs from the null value of 1.0.
Sample Size Consideration
• The power to test interaction depends on the number of cases and controls in each stratum (of the effect modifier) rather than the overall numbers of cases and controls.
• When considering possible interactions, the size of the study needs to be at least four times larger than when interaction is not considered (Smith and Day)
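One way to see where a factor of about four can come from is to compare the variance of a single log odds ratio with the variance of the difference between two stratum-specific log odds ratios. The sketch below assumes two equally sized strata with identical cell proportions and uses Woolf's variance formula; the counts are hypothetical, and this is an illustration rather than the derivation given by Smith and Day.

```python
def var_log_or(a: float, b: float, c: float, d: float) -> float:
    """Woolf's approximate variance of the log odds ratio for a 2x2 table."""
    return 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def var_log_interaction(stratum1, stratum2) -> float:
    """Variance of log(OR1/OR2) for two independent strata."""
    return var_log_or(*stratum1) + var_log_or(*stratum2)


# Hypothetical overall 2x2 table (exposed cases, unexposed cases, exposed controls, unexposed controls).
whole = (100, 100, 100, 100)
# Split the same subjects evenly across two strata of the effect modifier.
half = tuple(n / 2 for n in whole)

ratio = var_log_interaction(half, half) / var_log_or(*whole)
print(ratio)  # how much larger the variance of the interaction estimate is
```

Under these assumptions, the study would need to be about four times larger for the interaction estimate to be as precise as the overall odds ratio; unequal strata make the requirement larger still.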