Statistics for clinical trials

Introduction to DAGs (Directed Acyclic Graphs)

Metalund and SIMSAM EarlyLife

Seminar, 22 March 2013

Jonas Björk

E-mail: [email protected]

(Fleischer & Diez Roux 2008)

Introduction to DAGs

• Basic terminology and principles

• Classification of bias in the DAG framework

• Concerns and limitations

• Examples from analysis of neighbourhood health effects

DAG – a logical system for causal relationships

Development of Western science is based on two great achievements:

1. the invention of the formal logical system by the Greek philosophers

2. the discovery of the possibility to find out causal relationships by systematic experiment during the Renaissance

(Albert Einstein, 1953; adapted from Pearl 2009)

Directed Acyclic Graphs (DAGs)

• Causal diagrams - visualize causal (structural) relationships between variables

• Based on mathematical theory and reasoning

• Used in

– Epidemiology, social science

– Computer science, artificial intelligence

– Economics, business administration

– Cognitive science

– ...

• Minimize bias - identify appropriate (and inappropriate) analytical strategies

DAGs - nodes and arrows

• Nodes represent variables

– Measured and unmeasured

– Observable and unobservable (compare Structural Equation Models, SEMs)

• Directed arrows (single-headed) show direct causal effects

(Hernan, Epidemiology 2004)

DAG Language

• Direct effect (only one)

– E affects D directly if there is an arrow from E to D

E → D

• Indirect effect (can be more than one)

– E affects D indirectly if there is a sequence of directed arrows starting in E and ending in D

E → M → D

• Children

– Variables directly affected by E

– Descendants: variables directly or indirectly affected by E

• Parents

– Variables that directly affect E

– Ancestors: all variables that affect E directly or indirectly
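The vocabulary above maps directly onto a small graph data structure. The following is a minimal sketch, assuming a hypothetical DAG (Z → E, Z → D, E → M, M → D, E → D) that is not taken from the slides; the function names are illustrative only.

```python
# Hypothetical example DAG (not from the slides).
# Each key maps a node to its children, i.e. the nodes it affects directly.
dag = {
    "Z": ["E", "D"],
    "E": ["M", "D"],
    "M": ["D"],
    "D": [],
}

def children(graph, node):
    """Variables directly affected by `node`."""
    return set(graph[node])

def parents(graph, node):
    """Variables that directly affect `node`."""
    return {p for p, kids in graph.items() if node in kids}

def descendants(graph, node):
    """Variables directly or indirectly affected by `node`."""
    found = set()
    stack = list(graph[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph[current])
    return found

def ancestors(graph, node):
    """Variables that affect `node` directly or indirectly."""
    found = set()
    stack = list(parents(graph, node))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents(graph, current))
    return found

print("Children of E:   ", sorted(children(dag, "E")))
print("Descendants of Z:", sorted(descendants(dag, "Z")))
print("Parents of D:    ", sorted(parents(dag, "D")))
print("Ancestors of M:  ", sorted(ancestors(dag, "M")))
```

In this example E has both a direct effect on D (the arrow E → D) and an indirect effect through M.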

DAG Language - Example

• Direct effect

• Indirect effect

• Children, Descendants

• Parents, Ancestors


Acyclic graphs

Loops not allowed:

[Figure: nodes E and D]

Temporal associations can be depicted in the following way:

[Figure: nodes E0, D and E1]

Time moves from left to right in the graph
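Whether a graph is acyclic can be checked mechanically. The sketch below uses Kahn's algorithm (repeatedly removing nodes that have no incoming arrows). Because the arrows of the original figures were lost, the two example graphs are assumptions: a loop E → D → E, and a time-unrolled version E0 → D → E1.

```python
def is_acyclic(graph):
    """Return True if the directed graph (node -> list of children) has no loop."""
    # Count incoming arrows for every node.
    indegree = {node: 0 for node in graph}
    for kids in graph.values():
        for kid in kids:
            indegree[kid] += 1

    # Repeatedly remove nodes without incoming arrows (Kahn's algorithm).
    queue = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for kid in graph[node]:
            indegree[kid] -= 1
            if indegree[kid] == 0:
                queue.append(kid)

    # If a loop exists, the nodes on it are never removed.
    return removed == len(graph)

# Assumed examples (the arrows in the original figures are not recoverable).
loop = {"E": ["D"], "D": ["E"]}
unrolled = {"E0": ["D"], "D": ["E1"], "E1": []}

print("Loop is acyclic:    ", is_acyclic(loop))
print("Unrolled is acyclic:", is_acyclic(unrolled))
```

Splitting the exposure into time-indexed nodes is what turns the feedback loop into a valid DAG.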

Paths (E – Z – D)

• Path

– Sequence of arrows connecting two variables, regardless of the direction of the arrows

– E → Z ← D

– E → Z → D

– E ← Z → D

– E ← Z ← D

• Collider (common effect within a path)

– Variable Z in a path that has two arrows pointing into it

– E → Z ← D

– Blocks (breaks) the information chain between E and D

• Unblocked backdoor path from E to D

– Begins with arrow pointing into E

– Ends with arrow pointing into D

– Does not contain a collider

This is the origin of confounding
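The definitions of path, collider and unblocked backdoor path can be sketched in code. This is an illustration only, assuming a hypothetical DAG with a common cause Z and a common effect C of E and D, and assuming that no variable has been conditioned on; the helper names are not from the slides.

```python
# Hypothetical DAG as a list of arrows (cause, effect).
arrows = [("Z", "E"), ("Z", "D"), ("E", "D"), ("E", "C"), ("D", "C")]

def all_paths(arrows, start, end):
    """All paths between start and end, regardless of arrow direction."""
    neighbours = {}
    for a, b in arrows:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    paths = []

    def walk(path):
        if path[-1] == end:
            paths.append(path)
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:  # never visit a node twice
                walk(path + [nxt])

    walk([start])
    return paths

def colliders(arrows, path):
    """Interior nodes of the path that have two arrows pointing into them."""
    arrow_set = set(arrows)
    return [
        mid
        for left, mid, right in zip(path, path[1:], path[2:])
        if (left, mid) in arrow_set and (right, mid) in arrow_set
    ]

def is_unblocked_backdoor(arrows, path):
    """Begins with an arrow into E, ends with an arrow into D, no collider."""
    arrow_set = set(arrows)
    begins_into_e = (path[1], path[0]) in arrow_set
    ends_into_d = (path[-2], path[-1]) in arrow_set
    return begins_into_e and ends_into_d and not colliders(arrows, path)

for path in all_paths(arrows, "E", "D"):
    print(path, "colliders:", colliders(arrows, path),
          "unblocked backdoor:", is_unblocked_backdoor(arrows, path))
```

In this example only the path through Z is an unblocked backdoor path; the path through C is blocked because C is a collider.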

DAG Language - Example

• Path

• Collider

• Unblocked backdoor path


Common cause


If we want to illustrate the E-D association, all common causes must be included, otherwise the DAG is not considered causal
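A small worked calculation shows why an omitted common cause matters. The sketch below assumes a hypothetical binary common cause Z of E and D, no arrow from E to D, and made-up probabilities (0.2 and 0.8). It computes the exact risk of D by exposure, first overall and then within one stratum of Z.

```python
from fractions import Fraction
from itertools import product

# Assumed values: P(E=1 | Z=z) and P(D=1 | Z=z) are both 1/5 for z=0, 4/5 for z=1.
p_given_z = {0: Fraction(1, 5), 1: Fraction(4, 5)}
p_z = Fraction(1, 2)  # P(Z=1) = P(Z=0) = 1/2

def bernoulli(p, value):
    """Probability that a binary variable with P(1)=p takes `value`."""
    return p if value == 1 else 1 - p

# Joint distribution P(Z, E, D) when Z -> E and Z -> D but E does not affect D.
joint = {
    (z, e, d): p_z * bernoulli(p_given_z[z], e) * bernoulli(p_given_z[z], d)
    for z, e, d in product((0, 1), repeat=3)
}

def risk(e, z=None):
    """P(D=1 | E=e), optionally restricted to the stratum Z=z."""
    keep = {k: p for k, p in joint.items()
            if k[1] == e and (z is None or k[0] == z)}
    return sum(p for k, p in keep.items() if k[2] == 1) / sum(keep.values())

print("Crude:      P(D=1|E=1) =", risk(1), "  P(D=1|E=0) =", risk(0))
print("Within Z=1: P(D=1|E=1) =", risk(1, z=1), "  P(D=1|E=0) =", risk(0, z=1))
```

The crude risks differ although E has no effect on D, whereas within a stratum of Z they coincide. Leaving Z out of the DAG would hide the source of the crude association.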

Common effect (common consequence)

Collider on the path between E and D

• Conditioning (“knowing the value of”)

– Restriction

– Stratification

– Matching

– Adjustment

Conditioning on a common effect creates an association between E and D
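The effect of conditioning on a collider can be verified with a small exact calculation. The sketch below assumes a hypothetical setting in which E and D are independent fair coin flips and the common effect Z equals 1 whenever at least one of them is 1. These assumptions are for illustration only.

```python
from fractions import Fraction
from itertools import product

# E and D are independent with P(1) = 1/2 each; Z = 1 if E = 1 or D = 1.
# Keys of the joint distribution are (e, d, z).
joint = {
    (e, d, int(e or d)): Fraction(1, 4)
    for e, d in product((0, 1), repeat=2)
}

def risk(e, z=None):
    """P(D=1 | E=e), optionally restricted to the stratum Z=z."""
    keep = {k: p for k, p in joint.items()
            if k[0] == e and (z is None or k[2] == z)}
    return sum(p for k, p in keep.items() if k[1] == 1) / sum(keep.values())

print("Unconditional: P(D=1|E=1) =", risk(1), "  P(D=1|E=0) =", risk(0))
print("Given Z=1:     P(D=1|E=1) =", risk(1, z=1), "  P(D=1|E=0) =", risk(0, z=1))
```

Without conditioning the two risks are equal, so E and D are unassociated. Restricting to Z = 1 makes them differ: among those with Z = 1, an unexposed person must have D = 1.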

DAG Language - Example

• Common cause

• Common effect

• Conditioning


Bias

• Structural association between exposure (E) and outcome (D) that is not causal (from E to D)

– Reversed causality (Information bias?)

– Confounding

– Selection bias

Thus, under the causal null hypothesis, exposure and outcome will still be associated

Association vs. causation

E - D associations can have three different structural origins according to DAG theory:

1. Cause and effect (watch out for reversed causality)

2. Common cause (confounding)

3. Conditioning on a common effect (selection bias)

(Hernan et al. 2004)

Chance is not a structural source of association!

Appropriate design and analytical strategy

1. Design that avoids reversed causality

2. Control confounding by blocking backdoor paths from E to D (conditioning)

3. ... identify selection bias introduced by conditioning

Small Group Discussion DAGs

Which variables should we adjust for in order to estimate

1) the total (direct + indirect) effect

2) the direct effect

of neighbourhood violence on CVD? Justify your answers!

• Control confounding by conditioning

• Identify selection bias from conditioning

Confounding controls in DAGs

There exist formal methods (and software) to

1. Determine the set S of covariates that is necessary to control for confounding

2. Determine whether the set S of covariates is minimally sufficient to control for confounding

Have we discovered all unblocked backdoor paths?

Is there redundancy in the set of blocking variables?


Minimally sufficient?

1. Delete all arrows starting at E (Neighbourhood violence)

2. Connect all variables that share a child or descendant in S

3. Are there any unblocked backdoor paths from E to D (CVD) that do not pass through S?
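A check of this kind can be automated. The sketch below does not reproduce the three steps literally. It swaps in the standard moralized ancestral graph test for d-separation, applied after deleting the arrows out of the exposure, which answers the same question for the total effect. The example graph (E ← A → Z ← B → D plus E → D) and all function names are hypothetical.

```python
from itertools import combinations

def satisfies_backdoor(arrows, exposure, outcome, adjust):
    """True if `adjust` blocks every backdoor path from exposure to outcome."""
    adjust = set(adjust)
    children = {}
    for a, b in arrows:
        children.setdefault(a, set()).add(b)

    # The adjustment set must not contain descendants of the exposure.
    stack, desc = [exposure], set()
    while stack:
        for kid in children.get(stack.pop(), ()):
            if kid not in desc:
                desc.add(kid)
                stack.append(kid)
    if adjust & desc:
        return False

    # Delete all arrows starting at the exposure.
    parents = {}
    for a, b in arrows:
        if a != exposure:
            parents.setdefault(b, set()).add(a)

    # Keep the exposure, the outcome, the adjustment set and their ancestors.
    relevant, stack = set(), [exposure, outcome, *adjust]
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))

    # Moralize: link parents to children, link parents sharing a child,
    # and ignore arrow directions from here on.
    links = {node: set() for node in relevant}
    for child in relevant:
        pars = parents.get(child, set())
        for p in pars:
            links[p].add(child)
            links[child].add(p)
            links[p].update(pars - {p})

    # Look for a connection from exposure to outcome that avoids `adjust`.
    seen, stack = {exposure}, [exposure]
    while stack:
        node = stack.pop()
        if node == outcome:
            return False
        for nxt in links[node] - seen - adjust:
            seen.add(nxt)
            stack.append(nxt)
    return True

def is_minimally_sufficient(arrows, exposure, outcome, adjust):
    """Sufficient, and no proper subset of `adjust` is sufficient."""
    adjust = set(adjust)
    if not satisfies_backdoor(arrows, exposure, outcome, adjust):
        return False
    return not any(
        satisfies_backdoor(arrows, exposure, outcome, subset)
        for size in range(len(adjust))
        for subset in combinations(sorted(adjust), size)
    )

# Hypothetical example: Z is a collider on the only backdoor path.
m_graph = [("A", "E"), ("A", "Z"), ("B", "Z"), ("B", "D"), ("E", "D")]
for candidate in [set(), {"Z"}, {"Z", "A"}]:
    print(sorted(candidate),
          "sufficient:", satisfies_backdoor(m_graph, "E", "D", candidate),
          "minimal:", is_minimally_sufficient(m_graph, "E", "D", candidate))
```

In this example adjusting for nothing is sufficient, adjusting for the collider Z alone opens the backdoor path, and adding A closes it again at the cost of redundancy.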

Suppose we think that S = {Income, PA} is sufficient to control for confounding when estimating the direct effect.

If you still think you can rely on your intuition...

Which variables should we adjust for in order to estimate the effect of E on D? Justify your answer!

[Figure: DAG with nodes Z1, Z2, Z3, Z4, Z5, Z6, E and D]

(Adapted from Pearl 2009, p. 80)

DAGs – concerns and limitations

• How much should be included?

– All common causes must be included

– A complete DAG for several exposures and outcomes can be quite messy

• Binary nature

– Effect / no effect

– Effect size, dose-response, magnitude of interaction etc. cannot be depicted

• Assumes a “perfect” study setting

– Correctly specified model, no measurement errors, continuous monitoring of outcome in longitudinal settings etc.

– Limited guidance in the choice of analytical strategy in less perfect settings (e.g. trade-off confounding vs. selection)

DAGs – How much should be included?

(de Jong et al. 2012)

DAGs in longitudinal survey settings

Different types of effects

1. Trigger effect

2. Effect of long-term exposure

3. Effect with long-term impact on the outcome

4. Delayed effect

[Figure: nodes E(t-1), D(t-1), E(t) and D(t) along a time axis from t-1 to t, with a common cause and a common consequence (collider)]

Additional Reading

• Pearl J. Causality: models, reasoning and inference. Second edition. Cambridge University Press 2009

• Fleischer NL & Diez Roux AV. Using directed acyclic graphs to guide analyses of neighbourhood health effects: an introduction. J Epidemiol Community Health 2008;62:842-846

• Hernan et al. A structural approach to selection bias. Epidemiology 2004;15:615-625
