Introduction to DAGs: Directed Acyclic Graphs
Metalund and SIMSAM EarlyLife
Seminar, 22 March 2013
Jonas Björk
E-mail: [email protected]
(Fleischer & Diez Roux 2008)
Introduction to DAGs
• Basic terminology and principles
• Classification of bias in the DAG framework
• Concerns and limitations
• Examples from analysis of
neighbourhood health effects
DAG – a logical system for causal relationships
Development of Western science is based on two great achievements:
1. the invention of the formal logical system by the Greek philosophers
2. the discovery of the possibility to find out causal relationships by systematic experiment during the Renaissance
(Albert Einstein, 1953; adopted from Pearl 2009)
Directed Acyclic Graphs (DAGs)
• Causal diagrams - visualize causal (structural)
relationships between variables
• Based on mathematical theory and reasoning
• Used in
– Epidemiology, social science
– Computer science, artificial Intelligence
– Economics, Business administration
– Cognitive science
– ...
• Minimize bias - identify appropriate (and
inappropriate) analytical strategies
DAGs - nodes and arrows
• Nodes represent variables
– Measured and unmeasured
– Observable and unobservable
(Compare: Structural Equation Models, SEMs)
• Directed arrows (single-headed)
show direct causal effects
(Hernan, Epidemiology 2004)
DAG Language
• Direct effect (only one)
– E affects D directly if there is an arrow from E to D
E → D
• Indirect effect (can be more than one)
– E affects D indirectly if there is a sequence of directed arrows starting in E and ending in D
E → M → D
• Children
– Variables directly affected by E
– Descendants: variables directly or indirectly affected by E
• Parents
– Variables that directly affect E
– Ancestors: all variables that affect E directly or indirectly
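The terminology above can be made concrete with a small sketch. The DAG below (nodes C, E, M, D and its arrows) is a hypothetical example, not one of the graphs from the slides, and the helper function names are illustrative only.

```python
# Hypothetical example DAG, stored as {node: set of children}.
# E -> D (direct effect), E -> M -> D (indirect effect), C -> E and C -> D.
dag = {
    "C": {"E", "D"},
    "E": {"M", "D"},
    "M": {"D"},
    "D": set(),
}

def children(g, node):
    """Variables directly affected by node."""
    return set(g[node])

def parents(g, node):
    """Variables that directly affect node."""
    return {p for p, kids in g.items() if node in kids}

def descendants(g, node):
    """Variables directly or indirectly affected by node."""
    found, stack = set(), list(g[node])
    while stack:
        n = stack.pop()
        if n not in found:
            found.add(n)
            stack.extend(g[n])
    return found

def ancestors(g, node):
    """Variables that affect node directly or indirectly."""
    return {a for a in g if node in descendants(g, a)}

print(sorted(children(dag, "E")))     # ['D', 'M']
print(sorted(parents(dag, "D")))      # ['C', 'E', 'M']
print(sorted(descendants(dag, "C")))  # ['D', 'E', 'M']
print(sorted(ancestors(dag, "M")))    # ['C', 'E']
```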
DAG Language - Example
• Direct effect
• Indirect effect
• Children, Descendants
• Parents, Ancestors
(Fleischer & Diez Roux 2008)
Acyclic graphs
Loops not allowed (e.g. E → D → E)
Temporal associations can be depicted in the following way:
[Figure: time-indexed nodes E0, E1 and D]
Time moves from left to right in the graph
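Whether a graph is acyclic can be checked mechanically. The sketch below repeatedly removes nodes that have no incoming arrows (Kahn's algorithm); if some nodes can never be removed, the graph contains a loop. The two example graphs are assumptions: `loop` is a two-node feedback loop, and `unrolled` is one possible reading of the time-indexed figure (E0 → D → E1).

```python
def is_acyclic(g):
    """Return True if the directed graph {node: set of children} has no directed cycle."""
    indegree = {n: 0 for n in g}
    for kids in g.values():
        for k in kids:
            indegree[k] += 1
    ready = [n for n, deg in indegree.items() if deg == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for k in g[n]:
            indegree[k] -= 1
            if indegree[k] == 0:
                ready.append(k)
    # Every node can be removed only if no cycle exists.
    return removed == len(g)

# Hypothetical graphs: a feedback loop, and the same relation unrolled in time.
loop = {"E": {"D"}, "D": {"E"}}
unrolled = {"E0": {"D"}, "D": {"E1"}, "E1": set()}

print(is_acyclic(loop))      # False
print(is_acyclic(unrolled))  # True
```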
Paths (E – Z – D)
• Path
– Sequence of arrows connecting two variables, regardless of the direction of the arrows
– E → Z ← D
– E → Z → D
– E ← Z → D
– E ← Z ← D
• Collider (common effect within a path)
– Variable Z in a path that has two arrows pointing into it
– E → Z ← D
– Blocks (breaks) the information chain between E and D
• Unblocked backdoor path from E to D
– Begins with arrow pointing into E
– Ends with arrow pointing into D
– Does not contain a collider
– This is the origin of confounding
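The definitions of path, collider and unblocked backdoor path can be sketched in a few lines. The edge set below is a hypothetical DAG (C is a common cause, Z a common effect of E and D), and the function names are illustrative. No conditioning is considered here, so "unblocked" simply means "contains no collider".

```python
# Hypothetical DAG as a set of directed edges (cause, effect).
edges = {("C", "E"), ("C", "D"), ("E", "Z"), ("D", "Z"), ("E", "D")}

def simple_paths(edges, start, end):
    """All paths from start to end, ignoring arrow direction, visiting no node twice."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            paths.append(path)
            continue
        for n in neighbours.get(path[-1], ()):
            if n not in path:
                stack.append(path + [n])
    return paths

def colliders(edges, path):
    """Interior nodes of the path that have two arrows pointing into them."""
    return [m for a, m, b in zip(path, path[1:], path[2:])
            if (a, m) in edges and (b, m) in edges]

def is_unblocked_backdoor(edges, path):
    """Arrow into the first node, arrow into the last node, and no collider."""
    into_start = (path[1], path[0]) in edges
    into_end = (path[-2], path[-1]) in edges
    return into_start and into_end and not colliders(edges, path)

for p in sorted(simple_paths(edges, "E", "D")):
    print(p, "colliders:", colliders(edges, p),
          "unblocked backdoor:", is_unblocked_backdoor(edges, p))
```

In this example only E ← C → D qualifies as an unblocked backdoor path; E → Z ← D is blocked by the collider Z, and E → D is the causal path itself.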
DAG Language - Example
• Path
• Collider
• Unblocked backdoor path
(Fleischer & Diez Roux 2008)
Common cause
(Hernan, Epidemiology 2004)
If we want to illustrate the E-D association,
all common causes must be included,
otherwise the DAG is not considered causal
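A small simulation may help show why common causes matter. The sketch below assumes a hypothetical binary common cause C that raises the probability of both E and D, with no arrow from E to D. All probabilities, the sample size and the seed are arbitrary choices for illustration.

```python
import random

random.seed(2013)  # arbitrary seed, for reproducibility only
n = 50_000
rows = []
for _ in range(n):
    c = int(random.random() < 0.5)              # common cause C
    e = int(random.random() < 0.2 + 0.6 * c)    # C -> E
    d = int(random.random() < 0.2 + 0.6 * c)    # C -> D, no arrow from E to D
    rows.append((c, e, d))

def risk_difference(pairs):
    """P(D=1 | E=1) - P(D=1 | E=0) among the given (e, d) pairs."""
    exposed = [d for e, d in pairs if e == 1]
    unexposed = [d for e, d in pairs if e == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

crude = risk_difference([(e, d) for c, e, d in rows])
within_c0 = risk_difference([(e, d) for c, e, d in rows if c == 0])
within_c1 = risk_difference([(e, d) for c, e, d in rows if c == 1])

print(f"crude risk difference:      {crude:.3f}")
print(f"risk difference given C=0:  {within_c0:.3f}")
print(f"risk difference given C=1:  {within_c1:.3f}")
```

Under these assumptions the crude E-D association is clearly positive even though E has no effect on D, while the association within each stratum of C is close to zero.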
Common effect (common consequence)
Collider on the path between E and D
• Conditioning (“knowing the value of”)
– Restriction
– Stratification
– Matching
– Adjustment
(Hernan, Epidemiology 2004)
Conditioning on a collider creates an association between E and D
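The effect of conditioning on a common effect can be illustrated by simulation. The sketch below assumes that E and D are independent binary variables and that both raise the probability of a hypothetical collider Z. The probabilities, sample size and seed are arbitrary.

```python
import random

random.seed(2013)  # arbitrary seed, for reproducibility only
n = 50_000
rows = []
for _ in range(n):
    e = int(random.random() < 0.5)                        # exposure, independent of D
    d = int(random.random() < 0.5)                        # outcome, independent of E
    z = int(random.random() < 0.1 + 0.4 * e + 0.4 * d)    # collider: E -> Z <- D
    rows.append((e, d, z))

def risk_difference(pairs):
    """P(D=1 | E=1) - P(D=1 | E=0) among the given (e, d) pairs."""
    exposed = [d for e, d in pairs if e == 1]
    unexposed = [d for e, d in pairs if e == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

crude = risk_difference([(e, d) for e, d, z in rows])
within_z1 = risk_difference([(e, d) for e, d, z in rows if z == 1])
within_z0 = risk_difference([(e, d) for e, d, z in rows if z == 0])

print(f"crude risk difference:      {crude:.3f}")
print(f"risk difference given Z=1:  {within_z1:.3f}")
print(f"risk difference given Z=0:  {within_z0:.3f}")
```

Under these assumptions the crude association is close to zero, whereas restricting or stratifying on Z produces a clearly negative E-D association in both strata.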
DAG Language - Example
• Common cause
• Common effect
• Conditioning
(Fleischer & Diez Roux 2008)
Bias
• Structural association between exposure (E)
and outcome (D) that is not causal (from E to D)
– Reversed causality (Information bias?)
– Confounding
– Selection bias
Thus, under the causal null hypothesis,
exposure and outcome will still be associated
Association vs. causation
E - D associations can have three different structural origins according to DAG theory:
1. Cause and effect (watch out for reversed causality)
2. Common cause (confounding)
3. Conditioning on a common effect (selection bias)
(Hernan et al. 2004)
Chance is not a structural source of association!
Appropriate design and analytical strategy
1. Design that avoids reversed causality
2. Control confounding by blocking backdoor paths from E to D (conditioning)
3. ... identify selection bias introduced by conditioning
Small Group Discussion: DAGs
Which variables should we adjust for in order to estimate
1) the total (direct + indirect) effect
2) the direct effect
of neighbourhood violence on CVD? Justify your answers!
(Fleischer & Diez Roux 2008)
• Control confounding by conditioning
• Identify selection bias from conditioning
Confounding controls in DAGs
There exist formal methods (and software) to
1. Determine the set S of covariates that is necessary to control for confounding
2. Determine whether the set S of covariates is minimally sufficient to control for confounding
Have we discovered all unblocked backdoor paths?
Is there redundancy in the set of blocking variables?
(Fleischer & Diez Roux 2008)
Minimally sufficient?
1. Delete all arrows starting at E (Neighbourhood violence)
2. Connect all variables that share a child or descendant in S
3. Are there any unblocked backdoor paths from E to D (CVD) that do not pass through S?
Suppose we think that S = {Income, PA} is sufficient to control for confounding when estimating the direct effect. Is it?
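The three-step check can be sketched in code. The version below uses the moral-graph formulation of the same idea, which is an assumption about how to operationalise the steps: after deleting the arrows out of E it keeps only ancestors of E, D and S, connects parents that share a child, drops the arrowheads, and asks whether E can still reach D without passing through S. It is restricted to the total effect, so any S containing a descendant of E is rejected. The DAG (nodes C, U1, U2, Z) is hypothetical and is not the neighbourhood-violence graph from the slides.

```python
from itertools import combinations

# Hypothetical DAG: C is a common cause of E and D; Z is a collider on E <- U1 -> Z <- U2 -> D.
edges = {("C", "E"), ("C", "D"), ("E", "D"),
         ("U1", "E"), ("U1", "Z"), ("U2", "Z"), ("U2", "D")}

def ancestors_of(edges, nodes):
    """The given nodes together with all of their ancestors."""
    result, stack = set(nodes), list(nodes)
    while stack:
        n = stack.pop()
        for a, b in edges:
            if b == n and a not in result:
                result.add(a)
                stack.append(a)
    return result

def is_sufficient(edges, e, d, s):
    """True if S blocks every backdoor path from E to D (total effect only)."""
    s = set(s)
    if any(e in ancestors_of(edges, {z}) for z in s):
        return False  # S must not contain E or a descendant of E
    # Step 1: delete all arrows starting at E.
    g = {(a, b) for a, b in edges if a != e}
    # Keep only E, D, S and their ancestors (standard step, not on the slide).
    keep = ancestors_of(g, {e, d} | s)
    g = {(a, b) for a, b in g if a in keep and b in keep}
    # Step 2: connect variables that share a child, then drop arrowheads.
    links = {frozenset(edge) for edge in g}
    for child in keep:
        ps = [a for a, b in g if b == child]
        links |= {frozenset(pair) for pair in combinations(ps, 2)}
    # Step 3: can E still reach D without passing through S?
    seen, stack = {e}, [e]
    while stack:
        n = stack.pop()
        for link in links:
            if n in link:
                (other,) = link - {n}
                if other not in s and other not in seen:
                    seen.add(other)
                    stack.append(other)
    return d not in seen

def is_minimally_sufficient(edges, e, d, s):
    """Sufficient, and no proper subset of S is sufficient."""
    s = set(s)
    return is_sufficient(edges, e, d, s) and not any(
        is_sufficient(edges, e, d, set(sub))
        for r in range(len(s)) for sub in combinations(s, r))

print(is_sufficient(edges, "E", "D", set()))                       # False
print(is_sufficient(edges, "E", "D", {"C"}))                       # True
print(is_sufficient(edges, "E", "D", {"C", "Z"}))                  # False
print(is_minimally_sufficient(edges, "E", "D", {"C"}))             # True
print(is_minimally_sufficient(edges, "E", "D", {"C", "Z", "U1"}))  # False
```

In this hypothetical graph, adding the collider Z to S = {C} opens a new path through U1 and U2, which is why {C, Z} fails even though {C} alone is sufficient.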
If you still think you can rely on your intuition...
Which variables should we adjust for in order to estimate the effect of E on D? Justify your answer!
[Figure: DAG with nodes Z1, Z2, Z3, Z4, Z5, Z6, E and D]
(Adapted from Pearl 2009, p. 80)
DAGs – concerns and limitations
• How much should be included?
– All common causes must be included
– A complete DAG for several exposures and outcomes can be quite messy
• Binary nature
– Effect / no effect
– Effect size, dose-response, magnitude of interaction etc. cannot be depicted
• Assumes a “perfect” study setting
– Correctly specified model, no measurement errors, continuous monitoring of outcome in longitudinal settings etc.
– Limited guidance in the choice of analytical strategy in less perfect settings (e.g. trade-off confounding vs. selection)
DAGs – How much should be included?
(de Jong et al. 2012)
DAGs in longitudinal survey settings
Different types of effects
1. Trigger effect
2. Effect of long-term exposure
3. Effect with long-term impact on the outcome
4. Delayed effect
[Figure: DAG over time points t - 1 and t with nodes Et-1, Dt-1, Et and Dt; annotations: common cause, common consequence (collider)]
Additional Reading
• Pearl J. Causality: models, reasoning and inference. 2nd ed. Cambridge University Press 2009
• Fleischer NL & Diez Roux AV. Using directed acyclic graphs to guide analyses of neighbourhood health effects: an introduction. J Epidemiol Community Health 2008;62:842-846
• Hernan et al. A structural approach to selection bias. Epidemiology 2004;15:615-625