Causal Directed Acyclic Graphs (DAGs) (Causal Diagrams)
2013
Eyal Shahar, MD, MPH, Professor
2
What is a causal diagram?
Components: variables; unidirectional arrows
[Figure: example causal diagram with nodes A, B, C, D, and E]
3
Rules: displaying variables
Called “nodes” or “vertices”
Should be clearly understood by others
Variables, not values of variables: “smoking status” is okay; “smoking” is not
Displayed along the time axis (left to right), but sometimes we ignore this rule
4
Rules: drawing arrows
An arrow: from a postulated cause to its postulated effect
No bidirectional arrows
An arrow with a question mark: the research question at hand
An arrow without a question mark: background theory or an axiom
[Figure: example arrows between A and B, and among A, B, and C, one marked with a question mark]
5
Rules: drawing arrows
Directed acyclic graph
Circularity does not exist: a future effect cannot be a cause of its cause in the past
So-called “circularity”: a directed acyclic graph with time-indexed variables
[Figure: variables A, B, C; time-indexed variables A(t=1), B(t=2), A(t=3), B(t=4)]
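The acyclicity rule can be checked mechanically. The sketch below is an illustration, not part of the original slides. It assumes a diagram is stored as a list of (cause, effect) pairs and applies a standard topological-sort test (Kahn's algorithm). The function name `is_acyclic`, the two example diagrams, and the arrow orientation of the time-indexed chain are assumptions made for the example.

```python
from collections import deque


def is_acyclic(edges):
    """Return True if the directed graph given as (cause, effect) pairs has no cycle."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1

    # Repeatedly remove variables that have no remaining causes.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # If a cycle exists, the variables on it can never be removed.
    return removed == len(nodes)


# Hypothetical "circular" drawing without time indices: A causes B, B causes A.
circular = [("A", "B"), ("B", "A")]

# The same story with time-indexed variables (assumed orientation).
time_indexed = [("A_t1", "B_t2"), ("B_t2", "A_t3"), ("A_t3", "B_t4")]

print(is_acyclic(circular))
print(is_acyclic(time_indexed))
```

Time-indexing turns each variable into a sequence of distinct nodes, so every arrow points forward in time and no cycle can form.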
6
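Example: a causal diagram for gastroesophageal reflux and esophageal disease
[Figure: causal diagram with time-indexed nodes R1, R2, S1, S2, D1, D2, D1dx, D2dx, I1, I2, and T; one arrow carries a question mark]
R = reflux
S = symptoms
T = treatment
I = imaging
D = esophagus status
Ddx = diagnosed esophagus status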
7
How does a causal diagram help in research?
Decodes causal assertions: all of science is about causation!
Clarifies our wordy or vague causal thoughts about the research topic
Connects “association” with “causation”
Helps us decide which covariates should enter the statistical model—and which should not
Unifies our understanding of confounding bias, colliding bias, information bias (and three other, less well-known biases)
Can depict and explain all types of bias
8
PubMed search (through 2012)
“Causal diagrams”: 83 titles
“Directed acyclic graph”: 137 titles (some irrelevant)
Still not widely known
Rarely used
9
Some references
Pearl J. Causality: models, reasoning, and inference. Cambridge University Press, 2000 (second edition, 2009)
Greenland S et al. Causal diagrams for epidemiologic research. Epidemiology 1999;10:37-48
Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology 2001;12:313-320
Hernan MA et al. A structural approach to selection bias. Epidemiology 2004;15:615-625
Shahar E, Shahar DJ. Causal diagrams, information bias, and thought bias. Pragmatic and Observational Research 2010;1:33-47
Shahar E, Shahar DJ. Causal diagrams and three pairs of biases. In: Lunet N, editor. Epidemiology – Current Perspectives on Research and Practice. www.intechopen.com/books/epidemiology-current-perspectives-on-research-and-practice, 2012:31-62 (reading material for this module)
10
A natural path between two variables
Formally: a sequence of arrows, regardless of their direction, that connects two variables (and does not pass more than once through each variable)
Informally: “can walk from A to Z, or from Z to A, on bridges”
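[Figure: four example natural paths between A and Z: through B, C, D, and E; through B, C, and D (two arrow configurations); and a single arrow between A and Z]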
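The formal definition translates directly into a search that ignores arrow direction and never revisits a variable. The sketch below is an illustration under stated assumptions: the diagram is a list of (cause, effect) pairs, and the function name `natural_paths` and the example diagram are hypothetical, not taken from the slides.

```python
def natural_paths(edges, start, end):
    """List every path from start to end that ignores arrow direction
    and passes through each variable at most once."""
    neighbours = {}
    for cause, effect in edges:
        neighbours.setdefault(cause, set()).add(effect)
        neighbours.setdefault(effect, set()).add(cause)

    paths = []

    def walk(node, visited):
        if node == end:
            paths.append(list(visited))
            return
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in visited:
                visited.append(nxt)
                walk(nxt, visited)
                visited.pop()

    walk(start, [start])
    return paths


# Hypothetical diagram: C -> A, C -> Z, A -> K, Z -> K, A -> Z.
example_edges = [("C", "A"), ("C", "Z"), ("A", "K"), ("Z", "K"), ("A", "Z")]
paths = natural_paths(example_edges, "A", "Z")
for path in paths:
    print(" - ".join(path))
```

In this assumed diagram there are three natural paths between A and Z: one through C, one through K, and the single arrow from A to Z.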
11
Types of natural paths between two variables
Causal paths
Confounding paths
Colliding paths
12
A causal path between two variables (also called “directed path”)
A natural path between A and Z, in which all the arrows point in the same direction (hence, “directed path”)
“A is a cause of Z” or “Z is a cause of A”
[Figure: example causal paths between A and Z: a single arrow, and paths through B, through B and C, and through C and D]
13
“Direct” versus “indirect” causal path
“Direct” is often (maybe always) an oversimplification: is it really direct? Does no intermediary exist?
Better terminology: “causal paths in which no intermediary variables are known or displayed”
Overall (total) effect: by all directed paths (combined)
[Figure: A, B, and Z with a “direct” causal path (A→Z) and an indirect causal path (A→B→Z)]
14
A confounding path between two variables
A natural path between A and Z that contains a shared cause of A and Z on this path (a confounder)
[Figure: two confounding paths between A and Z, one through C and one through C and X, each shown in two alternative displays]
15
A colliding path between two variables
A natural path between A and Z that contains at least two arrowheads that “collide” at some variable along this path (a collider on the path)
[Figure: two colliding paths between A and Z, one through C and one through K, L, and M, each shown in two alternative displays]
16
Side point: collider (and confounder) are path-specific terms
A variable called a collider (or a confounder) on one path need not be a collider (or a confounder) on another path
[Figure: diagram with nodes A, B, C, D, and Z]
C is a collider on one path (ABCDZ) and a confounder on another path (ACZ)
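The three path types can be told apart by reading the arrow directions along a path. The sketch below is illustrative only. The function name `classify_path` is hypothetical, and the arrow orientations in `edges` are an assumption chosen so that C is a collider on A-B-C-D-Z and a shared cause on A-C-Z; the original figure may orient the other arrows differently.

```python
def classify_path(edges, path):
    """Classify a natural path as causal, confounding, or colliding.

    Returns the path type and the variable(s) that make it so:
    the colliders, or the shared cause.
    """
    arrows = set(edges)
    directions = []
    for left, right in zip(path, path[1:]):
        if (left, right) in arrows:
            directions.append("forward")
        elif (right, left) in arrows:
            directions.append("backward")
        else:
            raise ValueError(f"no arrow between {left} and {right}")

    # A collider is a variable entered by a forward arrow and left by a backward one.
    colliders = [
        path[i + 1]
        for i in range(len(directions) - 1)
        if directions[i] == "forward" and directions[i + 1] == "backward"
    ]
    if colliders:
        return "colliding", colliders
    if len(set(directions)) == 1:
        return "causal", []
    # No collider and mixed directions: arrows run backward, then forward,
    # so exactly one variable on the path is a shared cause.
    shared_cause = path[directions.index("forward")]
    return "confounding", [shared_cause]


# Assumed orientation: B -> A, B -> C, D -> C, D -> Z, C -> A, C -> Z.
edges = [("B", "A"), ("B", "C"), ("D", "C"), ("D", "Z"), ("C", "A"), ("C", "Z")]

print(classify_path(edges, ["A", "B", "C", "D", "Z"]))
print(classify_path(edges, ["A", "C", "Z"]))
```

The same variable C is reported as the collider on the first path and as the shared cause on the second, which is the sense in which these terms are path-specific.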
17
Identify and name each natural path between A and Z
[Figure: exercise diagram with nodes A, Z, K, L, M, P, Q, R, and S]
18
A bridge to “association”
What is “association”? A mathematical phenomenon: the ability to guess the value of one variable based on the value of another variable
Are there “spurious associations”? A mathematical relation between variables is never “spurious”; it is a poor word choice
“The association of A with Z is spurious.” What does the writer have in mind, though?
What creates associations? A causal structure
19
A bridge between natural paths and associations
Which natural paths between A and Z contribute to the marginal (crude) association between A and Z? Open paths:
Causal paths
Confounding paths
Which natural paths between A and Z do not contribute to an association between A and Z? Blocked paths:
Colliding paths
20
Identify open paths and blocked paths (between A and Z) in this diagram
[Figure: exercise diagrams on nodes A, B, C, D, and Z, with the natural paths between A and Z sorted into open paths and blocked paths]
21
When does an association between A and Z reflect the effect of A on Z?
When only causal paths contribute to the association between A and Z
When confounding paths do not exist, or are somehow blocked (almost true: not a sufficient condition)
22
How do we block a confounding path?
By conditioning on some variable along the path
What is “conditioning” on a variable?
Restricting the variable to one of its values
Various forms of “adjustment”: standardization; stratification and a weighted average (Mantel-Haenszel); adding an independent variable to a regression model
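As one concrete form of conditioning, the sketch below computes a Mantel-Haenszel summary odds ratio across the strata of a variable C and compares it with the crude odds ratio. It uses the standard Mantel-Haenszel estimator; the counts are hypothetical and were chosen so that A and Z are unassociated within each stratum of C.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts, one 2x2 table per stratum of C.
strata = [
    (80, 20, 40, 10),  # C = 1: risk of Z is 0.8 in exposed and unexposed
    (10, 40, 20, 80),  # C = 0: risk of Z is 0.2 in exposed and unexposed
]

# Collapsing over C gives the crude (marginal) table.
crude_table = tuple(sum(column) for column in zip(*strata))

crude_or = odds_ratio(*crude_table)
summary_or = mantel_haenszel_or(strata)
print(crude_or, summary_or)
```

In these made-up data, C is related to both exposure and outcome, so the crude odds ratio departs from 1 even though each stratum shows no association; the stratified summary does not.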
23
Conditioning on a variable…
Dissociates a variable from its causes and its effects
Turns an open natural path into a blocked path
[Figure: variable V shown with A, B, C and X, Y, Z; the path between A and Z through V, before and after conditioning on V]
24
Deconfounding = blocking a confounding path
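[Figure: three diagrams on A, Z, C, and X showing confounding paths between A and Z; the last adds arrows marked with question marks]
But what if?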
25
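Induced paths
Conditioning on a collider creates (or contributes to) the association between the colliding variables
[Figure: A and Z colliding at C, and a longer path through K, L, and M, with induced paths drawn after conditioning on the collider]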
Why? Later…
26
Induced paths
An induced path may contain: only dashed lines; dashed lines and arrows; colliders
An induced path may be blocked or open
An induced path is blocked if there is at least one collider on the path
An induced path is open if there are no colliders on the path
27
Blocked induced paths
[Figure: two diagrams on nodes A, B, C, D, E, and Z, each shown twice: with a blocked natural path, and with the corresponding blocked induced path]
28
Open induced paths
[Figure: two diagrams (nodes A, B, C, Z and nodes A, B, C, D, E, Z), each shown twice: with a blocked natural path, and with the corresponding open induced path]
29
Confounding bias and colliding bias
A confounding path contributes to the (marginal) association between A and Z. This unwanted contribution is called confounding bias
An open induced path contributes to the (conditional) association between A and Z. This unwanted contribution is called colliding bias
30
Can we block an open induced path? Yes
[Figure: open induced paths in two diagrams (nodes A, B, C, Z and nodes A, B, C, D, E, Z)]
We can eliminate these paths by conditioning on C
31
Key questions
Why does a collider block a path? Why don’t we observe an association between colliding variables?
Why does conditioning on a collider create an association between the colliding variables?
[Figure: A and Z colliding at C. Left: blocked path. Right: open induced path after conditioning on C]
32
Intuitive explanation
A sample of N patients
Variables: M: meningitis status (yes, no); S: stroke status (yes, no); V: vital status (alive, dead)
Assume: causal reality is fully described in the diagram
[Figure: M → V ← S]
33
Is there a marginal (crude) association between meningitis status and stroke status?
No, we cannot guess stroke status from meningitis status (or vice versa)
Intuition: a common effect (vital status) cannot induce an association between its (past) causes
There is no transfer of guesses across a collider
A colliding path is a blocked path
34
Suppose we condition on V (vital status)…
Stratum 1 (V=alive): alive patients
Pt | Stroke status | Vital status | Meningitis status
1 | No | Alive | ?
My guess: “No”
Stratum 2 (V=dead): dead patients
Pt | Stroke status | Vital status | Meningitis status
2 | No | Dead | ?
My guess: “Yes”
We can make some guesses after conditioning
M (meningitis status) and S (stroke status) are associated within the strata of V (the collider)
35
Before and after conditioning…
[Figure: M → V ← S. Before conditioning: blocked path. After conditioning on V: open induced path between M and S]
36
Theorem and implications
Theorem: colliding variables will be associated within at least one stratum of their collider
Implications:
A Mantel-Haenszel summary measure of association will differ from the crude, if we summarize across a collider
A regression coefficient will change if we “adjust” for a collider
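The theorem can be illustrated with an exact calculation on the meningitis and stroke example. The sketch below is an assumption-laden illustration: all probabilities are hypothetical, M and S are made independent by construction, and survival is chosen to be multiplicative in M and S so that the association appears in one stratum of V only.

```python
from itertools import product

P_M = 0.10  # hypothetical P(meningitis = yes)
P_S = 0.10  # hypothetical P(stroke = yes)


def p_alive(m, s):
    """Hypothetical survival probability; each disease halves it."""
    return 0.95 * (0.5 ** m) * (0.5 ** s)


# Joint distribution of (M, S, V), with M and S independent causes of V.
joint = {}
for m, s in product((0, 1), repeat=2):
    p_ms = (P_M if m else 1 - P_M) * (P_S if s else 1 - P_S)
    joint[(m, s, "alive")] = p_ms * p_alive(m, s)
    joint[(m, s, "dead")] = p_ms * (1 - p_alive(m, s))


def odds_ratio(cells):
    """Odds ratio between M and S from a dict mapping (m, s) to a probability."""
    return cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])


def stratum(v):
    return {(m, s): joint[(m, s, v)] for m, s in product((0, 1), repeat=2)}


crude = {
    (m, s): joint[(m, s, "alive")] + joint[(m, s, "dead")]
    for m, s in product((0, 1), repeat=2)
}

or_crude = odds_ratio(crude)
or_alive = odds_ratio(stratum("alive"))
or_dead = odds_ratio(stratum("dead"))
print(or_crude, or_alive, or_dead)
```

With these made-up numbers the crude odds ratio is 1, the odds ratio among the living is also 1, and the odds ratio among the dead is well below 1: among dead patients, knowing that stroke is absent makes meningitis a better guess. This matches the “at least one stratum” wording of the theorem.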
37
Goal: estimate a measure of effect (causation) by a measure of association
Association estimates causation (A→Z) when the association between A and Z is due only to A→Z (direct and indirect paths combined)
Methods:
Display variables and causal assumptions in a causal diagram
Block all confounding paths between A and Z
Do not create open induced paths between A and Z (or eliminate them, if created)
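The blocking rules can be collected into one check of whether a given path is open under a chosen conditioning set. The sketch below is an illustration, not the author's procedure. It also treats conditioning on an effect of a collider as opening the path, which is the standard d-separation rule rather than something stated on these slides. The function names and the example diagram are hypothetical.

```python
def descendants(edges, node):
    """All variables reachable from node by following arrows forward."""
    children = {}
    for cause, effect in edges:
        children.setdefault(cause, []).append(effect)
    found, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def path_is_open(edges, path, conditioned=()):
    """Return True if the path transmits association given the conditioned variables."""
    arrows, conditioned = set(edges), set(conditioned)
    for left, middle, right in zip(path, path[1:], path[2:]):
        is_collider = (left, middle) in arrows and (right, middle) in arrows
        if is_collider:
            # A collider blocks the path unless it, or one of its effects,
            # has been conditioned on.
            if not ({middle} | descendants(edges, middle)) & conditioned:
                return False
        elif middle in conditioned:
            # Conditioning on any non-collider along the path blocks it.
            return False
    return True


# Hypothetical diagram: C is a shared cause of A and Z; K is a shared effect.
diagram = [("C", "A"), ("C", "Z"), ("A", "K"), ("Z", "K")]

print(path_is_open(diagram, ["A", "C", "Z"]))         # confounding path, nothing conditioned
print(path_is_open(diagram, ["A", "C", "Z"], {"C"}))  # after conditioning on C
print(path_is_open(diagram, ["A", "K", "Z"]))         # colliding path, nothing conditioned
print(path_is_open(diagram, ["A", "K", "Z"], {"K"}))  # after conditioning on K
```

In this assumed diagram, conditioning on C alone leaves no open non-causal path between A and Z, whereas adding K to the conditioning set would open one.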
38
Confounding bias (again)
The most widely known bias
Historical definitions and identification methods: “Lack of exchangeability”; “Mixed effects”; “Non-collapsibility”; “Change-in-estimate”
A fair amount of confusion
The basic causal structure
[Figure: A ← C → Z, with A →? Z]
39
So what is a confounder?
A confounder is a common cause of the exposure (A) and the disease (Z)
[Figure: a confounding path between A and Z through B, C, and D, with one variable labeled “Confounder”, shown in two displays]
Note: we can block the path by conditioning on B or C or D.
40
Endless complexity
[Figure: time-indexed exposure nodes E−3 to E1, disease nodes D−2 to D2, and nodes Q−3 to Q0]
Exposure: E0 (baseline exposure)
Disease: D2 (follow-up)
Question: Which is the confounder?
41
Colliding bias
Formerly known as “selection bias”
Confusing names and types: “No representativeness”; “Biased sample”; “Convenient sampling”; “Control-selection bias”; “Survival bias”; “Informative censoring”
The basic causal structure
[Figure: A → C ← Z, with A →? Z]
42
But there are many more versions
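[Figure: four further versions of the colliding-bias structure, on nodes A, Z, C, X, and Y, with a question mark on the arrow between A and Z]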
43
Confounder versus collider
[Figure: A and Z shown with a confounder (a common cause of A and Z) and a collider (a common effect of A and Z)]
44
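Confounding bias and colliding bias: an antithetical pair
[Figure: four diagrams on A, Z, and C, with A →? Z. C as a confounder: bias before conditioning, no bias after conditioning. C as a collider: no bias before conditioning, bias after conditioning]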
45
Even more impressive in text…
Attribute | Confounder | Collider
Main attribute | common cause | common effect
Association | contributes to the association between its effects | does not contribute to the association between its causes
Type of path | open path | blocked path
Effect of conditioning | blocked path | open path
Bias before conditioning? | Yes, confounding bias | No
Bias after conditioning? | No | Yes, colliding bias
46
What is selection bias?
A type of colliding bias
Should be called “sampling colliding bias”
47
Types of colliding bias
Sampling colliding bias
Every study is restricted to selected people
Inevitable conditioning on “selection status” (S)
Sometimes, this unavoidable conditioning creates colliding bias
Analytical colliding bias
Restricted analysis: computing association for one stratum of a collider
Stratified analysis: computing association for each stratum of a collider
Adjustment by analysis: computing a weighted average across the collider; adding the collider to a regression model, as a covariate
48
Sampling colliding bias: a wrong sampling decision
What happens if we estimate the effect of marital status (A) on dementia status (Z) in a sample of nursing home residents?
Restricting recruitment to nursing home residents
Assumptions: no effect of A on Z; both variables affect “place of residence” (P) (nursing home or elsewhere)
49
Causal diagrams
[Figure: diagrams with A (marital status) → P ← Z (dementia status) and S (selection status)]
50
Sampling colliding bias: a wrong sampling decision
What happens if we estimate the effect of coughing status (A) on abdominal pain status (Z) in a sample of hospitalized patients?
Restricting recruitment to hospitalized patients
Assumptions: displayed in the diagram (next slide); H is hospitalization status
51
Causal diagram
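[Figure: causal diagram with pneumonia status, ulcer status, A (coughing status), Z (abdominal pain status), H (hospitalization status), and S; the arrow between A and Z carries a question mark]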
52
Basic causal diagrams for every case-control study
The key feature of a case-control study: disease status affects selection into the case-control sample
Diseased people are much more likely to be selected than disease-free people
[Figure: A →? Z with Z → S (selection status), shown twice]
No bias, unless we mistakenly create an open path between A and S!
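The claim can be checked with simple arithmetic on a 2x2 table. The sketch below is an illustration with hypothetical counts and selection probabilities: it compares the odds ratio in a source population with the odds ratio in a sample when selection depends on disease status alone, and again when selection of controls also depends on exposure. The function and variable names are assumptions made for the example.

```python
import math


def odds_ratio(cells):
    """Odds ratio from a dict mapping (a, z) to a count (1 = exposed / diseased)."""
    return cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])


def expected_sample(population, p_select):
    """Expected cell counts after sampling each cell with probability p_select(a, z)."""
    return {cell: count * p_select(*cell) for cell, count in population.items()}


# Hypothetical source population.
population = {(1, 1): 200, (1, 0): 9800, (0, 1): 100, (0, 0): 9900}


def by_disease_only(a, z):
    # Z -> S only: cases are far more likely to be sampled than non-cases.
    return 0.9 if z else 0.01


def by_disease_and_exposure(a, z):
    # Assumed extra path between A and S: exposed controls are twice as likely
    # to be sampled (for example, controls drawn from exposure-related patients).
    base = 0.9 if z else 0.01
    return base * (2.0 if (a and not z) else 1.0)


or_population = odds_ratio(population)
or_disease_only = odds_ratio(expected_sample(population, by_disease_only))
or_biased = odds_ratio(expected_sample(population, by_disease_and_exposure))
print(or_population, or_disease_only, or_biased)
```

When selection depends only on Z, the sampling fractions cancel in the odds ratio. Once exposure also affects selection of controls, they no longer cancel, and in this made-up case the sample odds ratio is half the population value.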
53
Sampling colliding bias: a wrong sampling decision
Research question: What is the effect of smoking status (A) on cancer status (Z)?
Design: Hospital-based case-control study
Controls: patients with cardiovascular disease (CVD)
54
Causal diagram: smoking and cancer
[Figure: A (smoking status) →? Z (cancer status); Z → S (always exists in a case-control study); CVD status → S (sampling decision for controls); A → CVD status (background knowledge)]
Note: CVD and Z collide at S
55
Colliding bias (AKA control selection bias)
[Figure: A (smoking status) →? Z (cancer status), with S and CVD status]
56
Sampling colliding bias: willingness to participate in a case-control study
[Figure: A (smoking status) →? Z (cancer status) and S (willing to participate?); one arrow is labeled “Background knowledge”]
57
Control (or case) selection bias
Two main mechanisms
[Figure: two diagrams on A, Z, S, and B, with A →? Z; each is labeled “Sampling/participation of controls (or cases)”]
Remember: Z→S always exists. We always condition on S.
59
Analytical colliding bias: restricted analysis
Research question: what is the effect of dietary fibers on colon polyps?
Design: a cross-sectional study
Analysis: restricted to people who have not yet developed colon cancer
60
Causal diagram
[Figure: A (dietary fibers) →? Z (colon polyp status); A → colon cancer status ← Z, both arrows labeled “Assumed knowledge”]
Note: A and Z collide at colon cancer status
61
Analytical colliding bias
[Figure: A (dietary fibers) →? Z (colon polyp status), with colon cancer status as a collider]
Despite “intuition”, we should not restrict the sample to cancer-free people
62
Analytical colliding bias: adjustment
“We adjusted for everything but the kitchen sink”
Traditional steps:
Add a laundry list of covariates to the regression model
See what happens to the exposure coefficient
Use the “change-in-estimate” method: change in the coefficient = evidence for confounding
Report the “adjusted” coefficient as a better (less confounded) measure of effect
Prone to colliding bias
63
Analytical colliding bias
Research question: what is the overall effect of gender on blood pressure?
Design: a cross-sectional study
Analysis: crude mean difference in systolic blood pressure; “adjusted” mean difference (conditioned on waist circumference)
64
Results
Analysis | Mean SBP men (mmHg) | Mean SBP women (mmHg) | Mean difference (mmHg)
Crude | 123.8 | 122.1 | 1.7
“Adjusted” for waist circumference | | | -3.1
Why do the estimates differ?
Which estimate should be reported?
Is the adjusted estimate less biased?
65
Is abdominal fat (measured by waist circumference) a confounder?
[Figure: C (abdominal fat), A (gender), and Z (blood pressure)]
No!
66
Revised diagram
[Figure: C (abdominal fat), A (gender), and Z (blood pressure)]
No need to “adjust for” abdominal fat
“Adjustment” could have: blocked a causal path; created colliding bias
67
Could have blocked a causal path…
[Figure: C (abdominal fat) on a causal path from A (gender) to Z (blood pressure)]
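A small exact calculation shows how conditioning on an intermediary can change, and even reverse, an estimate. The sketch below is illustrative only: it assumes the structure A → C → Z plus A → Z with no other variables, a binary C, and made-up coefficients. None of the numbers come from the study reported in the slides.

```python
import math

# Hypothetical structure: A (gender, 1 = men) -> C (large waist, 0/1) -> Z (SBP), plus A -> Z.
P_C_GIVEN_A = {1: 0.6, 0: 0.3}  # assumed P(C = 1 | A)
BASE = 118.0                    # assumed mean SBP when A = 0 and C = 0 (mmHg)
B_A = -3.0                      # assumed effect of A on Z not through C (mmHg)
B_C = 15.0                      # assumed effect of C on Z (mmHg)


def mean_z(a, c):
    """Mean blood pressure for given values of A and C."""
    return BASE + B_A * a + B_C * c


def crude_mean(a):
    """Mean blood pressure for a given A, averaging over C as it occurs in that group."""
    p = P_C_GIVEN_A[a]
    return p * mean_z(a, 1) + (1 - p) * mean_z(a, 0)


# Crude difference: carries both causal paths, so it equals the overall effect here.
crude_difference = crude_mean(1) - crude_mean(0)

# Difference within each stratum of C: the path through C is blocked.
stratum_differences = [mean_z(1, c) - mean_z(0, c) for c in (0, 1)]

print(crude_difference, stratum_differences)
```

Under these assumptions the crude difference is positive and the stratum-specific differences are negative. Neither is wrong arithmetic; they answer different questions, and only the crude one corresponds to the overall effect of A on Z.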
68
Could have created colliding bias…
[Figure: a variable U added to the diagram of C (abdominal fat), A (gender), and Z (blood pressure)]
69
Advice on multivariable regression
Do not adjust for an effect of the exposure
Do not adjust for an effect of the outcome
Select covariates according to theory (causal diagram), not mechanistically (change in estimate, stepwise regression)
“Every variable is adjusted for all others” is almost always false
Confounding is not a reciprocal property
70
Key points
The essence of epidemiology (and all of science) is causal theories
Your theories (about causation) are not “A is associated with Z”
“Possessing a cigarette lighter is associated with lung cancer” is true, but who cares? That’s not causal knowledge
Your theories about bias are not “intuition” about bias; they are causal theories, too.
Almost every theory in science is about causation, which means an arrow between variables
71
Key points
Magnitude of bias is more important than merely its presence
Small bias may be ignored
Magnitude of bias may be difficult to estimate
The bias-variance tradeoff