Case control and cohort studies


Epidemiological Study Designs: Case-Control & Cohort Studies

Dr. Deepa Gamage, Consultant Epidemiologist

Epidemiology
•“Study of the distribution and determinants of health-related states or events in a specified population, and the application of this study to the control of health problems”
•Measures
▫Disease frequency
▫Disease distribution
▫Determinants of disease

DEFINITION
Descriptive Epidemiology
•Distribution of disease
•Who is affected? / Time / Geographic location (place)
•Can provide clues for formulating epidemiological hypotheses

Analytic Epidemiology
•Determinants of a disease
•Tests hypotheses formulated from descriptive studies
•Hypothesis: a particular exposure causes (or prevents) disease occurrence

A research/study design always answers the following questions:

What is the study about?
Why is the study being done?
Where will the study be carried out?
What type of data is required?
Where can the required data be found?
What periods of time will the study include?
What will the sample design be?
How will the data be analyzed?

Objectives of clinical research
1. To know the extent of disease problems
2. To identify the cause/aetiology of the disease of interest
3. To identify the mode(s) of transmission
4. To know the natural history of the disease
5. To formulate preventive and therapeutic measures
6. To provide findings for use by health educators, policy makers and others

Prospective: Data are collected forwards in time from the start of the study.

Retrospective: Data refer to past events and may be acquired from existing sources, e.g., hospital records/patient’s notes (BHT), or by interview.

Longitudinal: Investigates changes over time, possibly in relation to an intervention, e.g., clinical trials (CTs), where we are interested in the effect of a treatment (Rx) commencing at one time point on the outcome at a later time.

Cross-sectional: Subjects are observed only once. Most surveys are cross-sectional.

Overview of clinical research design

Clinical research is classified by whether or not the investigator assigns the exposure:
•Experimental (the investigator assigns the exposure): randomized or non-randomized
•Observational (the investigator does not assign the exposure)

TYPES OF EPIDEMIOLOGICAL STUDIES

1. OBSERVATIONAL STUDIES

A. DESCRIPTIVE STUDY

DESCRIBE DISEASE BY

TIME

PLACE

PERSON

B. ANALYTICAL STUDIES

ECOLOGICAL STUDY

CROSS SECTIONAL STUDY

CASE-CONTROL STUDY

COHORT STUDY

2. EXPERIMENTAL STUDIES

RANDOMIZED CONTROLLED TRIAL (RCT)

FIELD TRIAL

COMMUNITY TRIAL

•Observational studies are prone to confounding variables: Variables that mask or distort the association between measured variables in a study

•In an experiment, use random assignments of treatments to avoid confounding variables

Observational: The researcher collects information on the attributes or measurements of interest, but does not influence the events.

Experimental: The researcher deliberately influences events and investigates the effects of the intervention, an investigator-controlled maneuver (such as a drug, a procedure, or a treatment).

DESCRIPTIVE STUDIES
•Populations: correlational studies
•Individuals: case report, case series, cross-sectional (baseline/prevalence study)

ANALYTICAL STUDIES
•Observational studies: case-control, cohort (prospective/retrospective)
•Experimental studies (trials)

Each descriptive and analytic study design has its unique strengths and limitations

Descriptive & Analytical studies
•Descriptive studies: describe the disease by time, place and person; they create hypotheses
•Analytical studies: cohort and case-control studies; they test hypotheses

Classification of types of clinical research

Does the investigator assign/allocate the exposure?
•Yes: Experimental study. Random allocation?
▫Yes: Randomized CT
▫No: Non-randomized CT
•No: Observational study. Comparison group?
▫Yes: Analytical study (cohort, CCS, cross-sectional)
▫No: Descriptive study

COHORT STUDIES

Goals
•Describe the cohort study design.
•Describe the case-control study design.
•Compare situations in which cohort and case-control study designs are used.

WHAT IS A COHORT
•[Latin] Ancient Roman military unit; a band of warriors.
•Persons banded together.
•Group of persons with a common statistical characteristic, e.g. age, birth date.

What is a Cohort?
•A “cohort” is a group of people who have something in common.
•Can represent the source population: the population from which cases of disease arise.
•Examples of cohorts:
▫All employees in an office building
▫Everyone who attended a football game
▫All the residents of a neighborhood

•A cohort study is undertaken to support the existence of an association between a suspected cause and a disease.
•A major limitation of cross-sectional surveys and case-control studies is the difficulty of determining whether the exposure or risk factor preceded the disease or outcome.
•Cohort study, key point:
▫Presence or absence of the risk factor is determined before the outcome occurs

INDICATION OF A COHORT STUDY
•When there is good evidence of an association between exposure and disease
•When exposure is rare but incidence of disease is higher among the exposed
•When follow-up is easy and the cohort is stable
•When ample funds are available

Design and structure
•The main purpose is usually to assess the possible effects of various exposure factors on the risk of disease
•A cohort study is one that follows a defined group (cohort) over a given period
•The usual approach is to start with healthy or unaffected subjects

Design of a cohort study
If a positive association exists between the exposure and the disease, we would expect the proportion of the exposed group in whom the disease develops (incidence in the exposed group) to be greater than the proportion of the non-exposed group in whom the disease develops (incidence in the non-exposed group).

Selection of study populations
•There are two basic ways to generate such groups:
▫Create a study population by selecting groups for inclusion in the study on the basis of whether or not they were exposed
▫Select a defined population before any of its members become exposed or before their exposures are identified

Design

This type of study design is called a prospective cohort study

Design

This type of study design is called a retrospective cohort or historical cohort study

Design
•In a prospective cohort design, exposure and non-exposure are ascertained as they occur during the study; the groups are then followed up for several years into the future and incidence is measured.
•In a retrospective cohort design, exposure is ascertained from past records and outcome (development or non-development of disease) is ascertained at the time the study is begun.

Potential biases in cohort studies
•A number of potential biases must be either avoided or taken into account in conducting cohort studies. The major biases include the following:
▫Bias in assessment of the outcome
▫Information bias
▫Biases from nonresponse and losses to follow-up
▫Analytic bias

Effect measures in COHORT studies
•All measures of effect are based on estimates of disease risk and/or rates.
•The precision of an effect estimate is expressed by a confidence interval, usually at the 95% level: over repeated samples, intervals constructed this way cover the unknown effect measure 95% of the time.

Effect measures in COHORT studies
•The data can be displayed in a 2 × 2 contingency table.

                 Exposure
Disease      Non-exposed   Exposed   Total
Healthy      a             b         a + b
Affected     c             d         c + d
Total        a + c         b + d     n

Effect measures in COHORT studies
•Among the (b + d) exposed there are d affected subjects. This gives a risk of Re = d/(b + d).
•Among the (a + c) non-exposed, the risk equals R0 = c/(a + c).
•The ratio between these two risks expresses the risk of the exposed relative to the non-exposed. This ratio, called the relative risk, is given as:

$$RR = \frac{R_e}{R_0} = \frac{d/(b+d)}{c/(a+c)}$$
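The relative risk calculation above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `relative_risk`, the example counts, and the large-sample log-scale confidence interval are all assumptions chosen for illustration. The cell labels follow the 2 × 2 table (a, b = healthy non-exposed and exposed; c, d = affected non-exposed and exposed).

```python
import math

def relative_risk(a, b, c, d, z=1.96):
    """Relative risk with an approximate 95% CI.

    a, b: healthy non-exposed / healthy exposed
    c, d: affected non-exposed / affected exposed
    """
    risk_exposed = d / (b + d)        # Re = d/(b+d)
    risk_unexposed = c / (a + c)      # R0 = c/(a+c)
    rr = risk_exposed / risk_unexposed
    # Large-sample standard error of ln(RR); an assumed, commonly used method
    se_log_rr = math.sqrt(1 / d - 1 / (b + d) + 1 / c - 1 / (a + c))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical cohort: 100 exposed (30 affected), 100 non-exposed (10 affected)
rr, lower, upper = relative_risk(a=90, b=70, c=10, d=30)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these hypothetical counts the risks are 0.30 and 0.10, so the exposed group has three times the risk of the non-exposed group.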

Advantages
•Includes the dimension of time
•Easy to understand
•Prospective cohort studies are assumed to give a more valid and less biased result than the other study designs
•Both incidence rate and incidence rate ratio can be estimated
•Suitable for addressing risk factors that are stable over time and for assessing diseases that are relatively frequent

Disadvantages
•Diseases with a low incidence rate are not amenable to drawing causal conclusions
•Are relatively costly
•Usually require long follow-up
•The registration of exposure variables can be inflexible

Case – Control Studies

Learning/Performance Objectives

•To develop an understanding of:
▫What case-control studies are
▫The value of such studies
▫The basic methodology
▫Advantages and disadvantages of such studies

DESIGN OF A CASE-CONTROL STUDY

To examine the possible relation of an exposure to a certain disease, we identify a group of individuals with that disease (called cases) and, for purposes of comparison, a group of people without that disease (called controls).

DESIGN OF A CASE-CONTROL STUDY
•We determine what proportion of the cases were exposed and what proportion were not.
•We also determine what proportion of the controls were exposed and what proportion were not.

FEATURES OF CASE-CONTROL STUDIES

1. DIRECTIONALITY: Outcome to exposure

2. TIMING: Retrospective for exposure, but case-ascertainment can be either retrospective or concurrent.

3. SAMPLING: Almost always on outcome, with matching of controls to cases

Case Control Design

[Figure: from the population, cases with the disease and controls without the disease are selected; the direction of inquiry runs backwards in time, and each group is divided into exposed and not exposed.]

Definition
•The case-control study is an analytic epidemiologic research design in which the study population consists of groups who either have (cases) or do not have (controls) a particular health problem or outcome.
•The investigator looks back in time to measure the exposure of the study subjects. The exposure is then compared among cases and controls to determine whether the exposure could account for the health condition of the cases.

Characteristics
•Observational / Non-experimental
•Explanatory / Analytical
•Retrospective
•Effect to Cause
•Both Exposure & Disease have already occurred
•Uses Comparison Group

Decision to conduct case-control research
•The characteristics of the exposure and disease
•The current state of knowledge: Relationship
•The immediate goals of the study
•The research setting
•The resources available

TWO CHARACTERISTICS OF CASES

1. REPRESENTATIVENESS: Ideally, cases are a random sample of all cases of interest in the source population (e.g. from vital data, registry data). More commonly they are a selection of available cases from a medical care facility (e.g. from hospitals, clinics).

2. METHOD OF SELECTION: Selection may be from incident or prevalent cases:
• Incident cases are those derived from ongoing ascertainment of cases over time.
• Prevalent cases are derived from a cross-sectional survey.

SELECTION OF CASES

•In a case-control study, cases can be selected from a variety of sources, including hospital patients, or clinic patients

•Several problems must be kept in mind in selecting cases for a case-control study.

Identification of cases: Issues
•The goal is to ensure that all true cases have an equal probability of entering the study and that no false cases enter
•Example: conceptual definition of HIV
▫Factors affecting the decision to test/access the test, and the sensitivity (Sn) and specificity (Sp) of the test, will decide who eventually becomes a case under the operational definition
•Selection bias?

Biases
•Selection bias
▫Unequal chance of getting into the study
•Berkson’s bias
▫Variable rate of hospitalization affecting case selection
•Neyman fallacy
▫Incident cases vs prevalent cases
•Detection bias
▫Due to closer medical attention, detection of endometrial cancer was more frequent in a group using estrogen

CHARACTERISTICS OF CONTROLS
•Who is the best control?
•Where should controls come from?
•If cases are a random sample of all cases in the population, then controls should be a random sample of all non-cases in the population sampled at the same time (i.e. from the same study base)
•But if study cases are not a random sample of the universe of all cases, it is not likely that a random sample of the population of non-cases will constitute a good control population.

THREE QUALITIES NEEDED IN CONTROLS

• Key concept: Comparability is more important than representativeness in the selection of controls
• The control must be at risk of getting the disease.
• The control should resemble the case in all respects except for the presence of disease

SELECTION OF CONTROLS

•Controls may be selected from nonhospitalized persons living in the community or from hospitalized patients admitted for diseases other than that for which the cases were admitted.

SELECTION OF CONTROLS
•Hospitalized Persons as Controls
▫Hospital inpatients are often selected as controls because they are a "captive population" and are clearly identified; it should therefore be relatively more economical to carry out a study using such controls.

Selection of Controls
•The controls should come from the population at risk of the disease
▫Men cannot be controls for a gynecological condition
▫The controls should be “eligible for the exposure”
•The controls should have the same exposure rate as that of the population from which the cases are drawn

Types of Controls
•Hospital or clinic control
•Dead control
•Controls with similar diseases
•Peer or case-nominated (friend/neighbor) control
•Population controls

Hospital controls
•Readily available, hence commonly used
•Main reasons to use hospital controls are
▫To select controls whose referral pattern is similar to that of cases
▫To obtain similar quality of examination
▫For convenience
•May not be representative of the population

Dead controls
•Might use dead controls for dead cases
•In some situations, this might lead to the use of a surrogate informant
•The problem is that the dead control is not representative of the living population
•McLaughlin compared dead controls with living controls and noticed that the dead controls smoked more cigarettes and consumed more alcohol than living controls
•Appropriateness depends on the exposure being studied

Controls with similar diseases
•Reasons
▫To minimize recall bias
▫To minimize interviewer bias
▫To examine the specificity of an exposure for a particular type of cancer
▫For “practical” but unspecified reasons

Peer or case-nominated (friend/neighbor) control
•The term "neighborhood controls" is used in two ways:
▫To refer to community or population controls
▫To refer to controls selected from a finite number of close neighbors: the search starts from the house of the case, and a door-to-door search is conducted for eligible controls in a standardized pattern
•A friend or neighbor control is a surrogate for matching on age, sex, education, etc.
▫A quick way to find a control
▫Bias is introduced if determinants of friendship are associated with disease or exposure; friends share many risk behaviors

Population controls
•Randomly drawn from the population
•Truly representative of the population
•Ideal way of selecting controls
•Practically, very difficult to carry out

Where to select controls from?
•Weigh the pros and cons
•Analyze the situation for bias being introduced
•If possible,
▫Select different sources of controls and compare them with each other
▫Compare the inferences drawn

Ratio of controls to cases
•Statistical consideration
▫When the number of subjects available in one group (cases) is limited, an increase in the other group increases the study power
▫Power continues to increase up to a ratio of about 4:1 (controls to cases)
▫Thereafter, the gain is not substantial but cost increases
▫When the power of the study with equal allocation is already as high as 0.9 or as low as 0.1, additional controls fail to increase the power
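The diminishing return from extra controls can be illustrated with a commonly cited approximation that is not stated in the slides: with r controls per case, the statistical efficiency relative to an unlimited number of controls is roughly r/(r + 1). The sketch below is a rough illustration under that assumption; the function name `relative_efficiency` is hypothetical.

```python
def relative_efficiency(r):
    """Approximate efficiency of r controls per case, relative to
    unlimited controls (assumed rule of thumb: r / (r + 1))."""
    return r / (r + 1)

# Efficiency climbs quickly up to about 4:1 and flattens afterwards
for r in range(1, 9):
    print(f"{r} control(s) per case: {relative_efficiency(r):.2f}")
```

Under this approximation, moving from one to four controls per case raises efficiency from 0.50 to 0.80, while each further control adds only a few percentage points.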

Ratio of controls to cases
•Validity of inferences
▫Even when there is no statistical need, more than one control may be recruited per case
▫Enrolling two or more types of controls is a way of checking for biases introduced by the choice of control group
▫If the measure of effect is similar when comparing cases with each control group: probably no bias (but no certainty)
▫If the measure of effect differs: bias is present, and the researcher can investigate it

MATCHING
•Matching is defined as the process of selecting the controls so that they are similar to the cases in certain characteristics, such as age, race, sex, socioeconomic status, and occupation.
•Matching may be of two types:
▫Group matching
▫Individual matching

MATCHING
•Group Matching
▫Group matching (or frequency matching) consists of selecting the controls in such a manner that the proportion of controls with a certain characteristic is identical to the proportion of cases with the same characteristic.
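Group (frequency) matching as defined above amounts to giving each stratum of the matching variable the same share of controls as it has of cases. The following is a minimal sketch; the function name `control_quota`, the age bands, and the counts are hypothetical, and simple rounding is assumed, so the quotas may not sum exactly to the requested total when the shares are not whole numbers.

```python
from collections import Counter

def control_quota(case_strata, n_controls):
    """Number of controls to recruit per stratum so that the control
    proportions mirror the case proportions (frequency matching)."""
    counts = Counter(case_strata)
    n_cases = len(case_strata)
    # Each stratum gets the same share of controls as it has of cases
    return {s: round(n_controls * k / n_cases) for s, k in counts.items()}

# Hypothetical cases: 15% under 20, 50% aged 20-39, 35% aged 40 or over
cases = ["<20"] * 3 + ["20-39"] * 10 + ["40+"] * 7
print(control_quota(cases, n_controls=40))
```

Here 15% of the 20 cases are under 20, so 15% of the 40 controls (6 people) are recruited from that age band.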

MATCHING
•Individual Matching
▫In this approach, for each case selected for the study, a control is selected who is similar to the case in terms of the specific variable or variables of concern.

MATCHING
•The problems with matching are of two types:
▫Practical
▫Conceptual

MATCHING
•Practical Problems with Matching:
▫If an attempt is made to match according to too many characteristics, it may prove difficult or impossible to identify an appropriate control.

MATCHING
•Conceptual Problems with Matching:
▫Once we have matched controls to cases according to a given characteristic, we cannot study that characteristic.
▫We do not want to match on any variable that we may wish to explore in our study.

SIX ISSUES IN MATCHING CONTROLS IN CASE-CONTROL STUDIES 

1. Identify the pool from which controls may come. This pool is likely to reflect the way controls were ascertained (hospital, screening test, telephone survey).

2. Control selection is usually through matching. Matching variables (e.g. age), and matching criteria (e.g. control must be within the same 5 year age group) must be set up in advance.

3. Controls can be individually matched or frequency matched

INDIVIDUAL MATCHING: search for one (or more) controls who meet the required MATCHING CRITERIA. PAIRED or TRIPLET MATCHING is when one or two controls, respectively, are individually matched to each case.

FREQUENCY MATCHING: select a population of controls such that the overall characteristics of the group match the overall characteristics of the cases. e.g. if 15% of cases are under age 20, 15% of the controls are also.

4. AVOID OVER-MATCHING. Match only on factors known to be causes of the disease.

5. Obtain POWER by matching MORE THAN ONE CONTROL PER CASE. In general, the number of controls per case should be no more than 4, because there is no further gain of power above four controls per case.

6. Obtain GENERALIZABILITY by matching more than ONE TYPE OF CONTROL

PROBLEMS OF RECALL
•A major problem in case-control studies is that of recall.
•Recall problems are of two types:
▫Limitations in recall
▫Recall bias

PROBLEMS OF RECALL
•Limitations in Recall
▫Virtually all human beings are limited to varying degrees in their ability to recall information, so limitations in recall are an important issue in such studies.
▫If a limitation of recall regarding exposure affects all subjects in a study to the same extent, regardless of whether they are cases or controls, a misclassification of exposure status may result.

PROBLEMS OF RECALL
•Recall Bias
▫A more serious potential problem in case-control studies is that of recall bias.
▫The small number of examples available could reflect infrequent occurrence of such bias, but the possibility of such bias must always be kept in mind.

DESIGN OF A CASE-CONTROL STUDY

Case-Control Studies: Methodology
First select the cases and controls, then measure exposure status.

                      Cases (with disease)   Controls (without disease)
Exposed               A                      B
Not exposed           C                      D
Total                 A+C                    B+D
Proportion exposed    A/(A+C)                B/(B+D)

Tests of significance
•Unmatched study
•Matched study

Effect measures in CASE-CONTROL studies
•The odds ratio (OR) is used as an effect measure for association.
•The precision of an effect estimate is expressed by a confidence interval, usually at the 95% level: over repeated samples, intervals constructed this way cover the unknown effect measure 95% of the time.

Effect measures in CASE-CONTROL studies
•The data can be displayed in a 2 × 2 contingency table.

Effect measures in CASE-CONTROL studies

Among the (b + d) exposed subjects, d are classified as ill or affected and b as healthy. The odds of being ill are therefore d/(b + d) divided by b/(b + d), or d/b, which is the probability of being ill divided by the probability of not being ill.

In the group of the (a + c) non-exposed, c are counted as ill or affected and a as healthy. The odds of being ill versus healthy are c/a.

Effect measures in CASE-CONTROL studies
•The odds ratio compares these two groups by dividing the odds in the exposed by the odds in the non-exposed, giving:

$$OR = \frac{d/b}{c/a} = \frac{ad}{bc}$$

The odds ratio is an effect measure that tells us how much larger the odds of being ill are for the exposed than for the non-exposed.
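The odds ratio above can be sketched in Python. This is an illustrative sketch, not part of the original slides: the function name `odds_ratio`, the example counts, and the large-sample log-scale (Woolf) confidence interval are assumptions. The cell labels follow the text (a, b = healthy non-exposed and exposed; c, d = ill non-exposed and exposed).

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with an approximate 95% CI.

    a, b: healthy non-exposed / healthy exposed
    c, d: ill non-exposed / ill exposed
    """
    odds_exposed = d / b              # odds of being ill among the exposed
    odds_unexposed = c / a            # odds of being ill among the non-exposed
    or_ = odds_exposed / odds_unexposed
    # Standard error of ln(OR), assumed large-sample (Woolf) method
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Hypothetical counts: 30 ill and 70 healthy exposed; 10 ill and 90 healthy non-exposed
or_, lower, upper = odds_ratio(a=90, b=70, c=10, d=30)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these hypothetical counts the odds are 30/70 and 10/90, so OR = (90 × 30)/(70 × 10) ≈ 3.86.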

MORE POINTS ABOUT CASE-CONTROL ANALYSIS

• The odds ratio is a good estimate of the relative risk when the disease is rare (prevalence < 20%).
• Can be extended to N > 1 controls.
• Statistical testing is by simple chi-square (unmatched analysis) or by McNemar’s chi-square (matched-pairs analysis).
• Can be extended to multiple strata (Mantel-Haenszel chi-square)
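For the unmatched analysis mentioned above, the simple chi-square statistic for a 2 × 2 table can be computed directly from the four cell counts. The sketch below assumes the Pearson form without continuity correction; the function name `chi_square_2x2` and the counts are hypothetical. The statistic is compared with 3.84, the critical value for 1 degree of freedom at the 5% level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for a 2 x 2 table
    with cells a, b in the first row and c, d in the second row."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical unmatched data
chi2 = chi_square_2x2(a=90, b=70, c=10, d=30)
print(f"chi-square = {chi2:.2f}; significant at 5%: {chi2 > 3.84}")
```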

Matched data (example: exposure to fumes and headache)

                 Headache present   Headache absent   Total
Factor present   A                  B                 A+B
Factor absent    C                  D                 C+D
Total            A+C                B+D               A+B+C+D

•A, B, C, D are numbers of pairs
•$OR = B/C$
•$SE(\ln OR) = \sqrt{1/B + 1/C}$
•$CI = OR \times \exp\left(\pm Z_{1-\alpha/2} \sqrt{1/B + 1/C}\right)$
•Association by McNemar’s test: $\chi^2 = (B - C)^2 / (B + C)$
•Regression: conditional logistic regression
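The matched-pairs formulas above depend only on B and C, the counts of discordant pairs. A minimal sketch follows; the function name `matched_pairs_analysis` and the pair counts are hypothetical, and z = 1.96 is assumed for a 95% interval.

```python
import math

def matched_pairs_analysis(B, C, z=1.96):
    """Matched-pairs odds ratio, approximate 95% CI and McNemar chi-square,
    computed from the two discordant-pair counts B and C."""
    or_ = B / C
    se_log_or = math.sqrt(1 / B + 1 / C)    # standard error of ln(OR)
    lower = or_ * math.exp(-z * se_log_or)
    upper = or_ * math.exp(z * se_log_or)
    mcnemar = (B - C) ** 2 / (B + C)        # no continuity correction
    return or_, lower, upper, mcnemar

# Hypothetical discordant pairs
or_, lower, upper, mcnemar = matched_pairs_analysis(B=25, C=10)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}, McNemar = {mcnemar:.2f}")
```

The concordant pairs (A and D) do not enter any of these quantities, which is why only B and C are passed to the function.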

INTERPRETATION
OR = 1, OR < 1, OR > 1

OR Range     Interpretation
0.0 - 0.3    Strong Benefit
0.4 - 0.5    Moderate Benefit
0.6 - 0.8    Weak Benefit
0.9 - 1.1    No Effect
1.2 - 1.6    Weak Hazard
1.7 - 2.5    Moderate Hazard
> 2.6        Strong Hazard

ADVANTAGES
• Easy to carry out
• Rapid (less time consuming)
• Less expensive
• Useful for rare diseases
• Useful for diseases with a long latent interval
• No risk to subjects
• Multiple exposures can be studied
• No attrition problem

DISADVANTAGES
• Susceptible to biases
• Selection of controls is difficult
• Incidence (and thereby RR) cannot be calculated
• If the disease is relatively common (> 5 to 10%), the OR may not be a reliable estimate of the RR
• Other possible effects of the exposure cannot be studied

APPLICATIONS

• Evaluating Vaccine Effectiveness
• Evaluations of Treatment & Program Efficacy
• Evaluation of Screening
• Outbreak Investigations
• Indirect Estimation in Demography
• Genetic Epidemiology
• Occupational Health Research
• Predictive Modeling

SOME IMPORTANT DISCOVERIES MADE IN CASE CONTROL STUDIES

1950's
•Cigarette smoking and lung cancer

1970's
•Diethylstilbestrol and vaginal adenocarcinoma
•Post-menopausal estrogens and endometrial cancer

1980's
•Aspirin and Reye's syndrome
•Tampon use and toxic shock syndrome
•L-tryptophan and eosinophilia-myalgia syndrome
•AIDS and sexual practices

1990's
•Vaccine effectiveness
•Diet and cancer

Principal differences between case-control studies and cohort studies

              Case-Control Studies            Cohort Studies
Population    Start with affected subjects    Start with healthy subjects
Incidence     No                              Yes
Prevalence    No                              No
Association   Odds ratio                      Relative risk
