29 March 08


The Logic of Biases via Causal Diagrams

Sander Greenland

Epidemiology and Statistics

University of California


Two definitions of bias:

• Epidemiology: Nonrandom difference between an estimate and the true value of the target parameter; systematic error; invalidity.

• Statistics: Any difference between the average value of an estimator and the true value of the target parameter (e.g., a relative risk).

There are subtle differences between the two; the second definition subsumes other important problems.


Types of bias

Epi categories (overlapping):

• Confounding (nonrandom exposure)

• Selection bias (nonrandom sampling)

• Bias from measurement error

Further statistical categories (often important but overlooked in epidemiology):

• Bias from use of a wrong model form (model-form mis-specification)

• Method invalidity (e.g., stepwise selection)

• Method failure (e.g., sparse-data bias)


• There are many finer divisions of epi bias, but they obscure the underlying deductive logic of the biases.

• Logic is about conclusions that could be drawn regardless of the content.

• Logical deduction concerns what must follow from what is assumed.

• Deductions can only be hypotheticals of the form “If we assume this, we can deduce that…” (some would say this is all science can offer beyond data).


The easiest way to remember the logic of epidemiologic biases: Causal diagrams

• Causal diagrams are schematics for causal explanations (e.g., “Process P may have caused bias B”) of possible associations.

• Diagramming a study can reveal many avenues for bias that are otherwise overlooked.


Directed acyclic graphs (DAGs) and causal diagrams

• A directed acyclic graph shows the factors in the problem linked by arrows only, with no feedback loops.

• A graph is a causal diagram if the arrows are interpreted as links in causal chains.

• Causal effects of one variable on another are transmitted by causal sequences, which are directed (head-tail) paths:

X → Y → Z means X can affect Z
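The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the dictionary `dag`, the extra variable `W`, and the helper names `can_affect` and `is_acyclic` are all assumptions made for the example.

```python
from collections import deque

# Hypothetical graph: each key lists the variables it directly affects.
# X -> Y -> Z is the chain from the slide; W is an extra isolated variable.
dag = {"X": ["Y"], "Y": ["Z"], "Z": [], "W": []}

def can_affect(graph, cause, effect):
    """True if a directed (head-to-tail) path runs from cause to effect."""
    queue = deque(graph[cause])
    seen = set()
    while queue:
        node = queue.popleft()
        if node == effect:
            return True
        if node not in seen:
            seen.add(node)
            queue.extend(graph[node])
    return False

def is_acyclic(graph):
    """No feedback loops: no variable can affect itself."""
    return not any(can_affect(graph, v, v) for v in graph)

print(can_affect(dag, "X", "Z"))  # X can affect Z through Y
print(can_affect(dag, "Z", "X"))  # effects do not run against the arrows
print(is_acyclic(dag))
```

A breadth-first search is used only because it is short; any reachability routine would serve.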


Example DAG: A and B can affect any variable except each other

Figure: DAG with nodes A, B, C, F, E, D


Colliders vs. noncolliders on a path

• Paths are closed at colliders: Associations cannot be transmitted across a collider ( → C ← ) on a path unless we stratify (condition) on it or something it affects (such as F in C → F).

• Paths are open (unblocked) at noncolliders: Associations can be transmitted across a noncollider ( → C → or ← C → ) on a path unless we completely stratify on it.


Think of associations as signals flowing through the graph

• A variable can transmit associations along some open (unblocked) directions but not along closed (blocked) directions.

• The open and closed directions are switched by conditioning (stratifying) on the variable (and may be partially switched by partially or indirectly conditioning).


Spot the open and closed directions for C:

Figure: DAG with nodes A, B, C, F, E, D


Colliders vs. noncolliders on a path

• Associations may be transmitted across a collider ( → C ← ) on a path if we stratify (condition) on it or something it affects (such as F in C → F).

• Associations may be transmitted across a noncollider ( → C → or ← C → ) on a path if we do not completely stratify on it.

“(C)” = C unobserved, “[C]” = C conditioned


Spot the open and closed directions for C given C:

Figure: DAG with nodes A, B, [C], F, E, D


Spot the open and closed directions for C given F:

Figure: DAG with nodes A, B, C, [F], E, D


Closed and open paths

• Closed (blocked) path: Closed at some variable within the path, hence cannot transmit associations.

• Open (unblocked) path: Open at all variables within the path, hence can transmit associations.

Conditioning may open some closed paths and close some open paths

(C) = C unobserved, [C] = C conditioned
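The collider and noncollider rules can be turned into a small path checker. The sketch below is illustrative only. The arrow set `EDGES` for the six-variable example DAG is an assumption, inferred from the path lists that appear later in the slides rather than read off the original drawing, and the names `descendants` and `is_open` are hypothetical.

```python
# Assumed arrows for the six-variable example DAG (inferred, not original).
EDGES = {("A", "E"), ("A", "C"), ("B", "C"), ("B", "D"),
         ("C", "E"), ("C", "D"), ("C", "F"), ("E", "D")}

def descendants(node):
    """All variables affected by node through directed paths."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in EDGES:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found

def is_open(path, conditioned=()):
    """True if the path is open at every interior variable."""
    conditioned = set(conditioned)
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in EDGES and (nxt, mid) in EDGES
        if collider:
            # Closed unless we condition on the collider or something it affects.
            if not ({mid} | descendants(mid)) & conditioned:
                return False
        elif mid in conditioned:
            # A noncollider is closed by complete conditioning on it.
            return False
    return True

print(is_open("ACB"))        # A -> C <- B with nothing conditioned
print(is_open("ACB", "C"))   # same path given [C]
print(is_open("ACB", "F"))   # same path given [F], which C affects
print(is_open("ECD", "C"))   # E <- C -> D given [C]
```

Paths are written as strings of single-letter variable names, matching the slides' notation (ACB, ECD). The checker treats conditioning as complete; partial conditioning, which the slides note only partially switches directions, is not modeled.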


Spot the open and closed paths, and rank the signal strengths:

Figure: DAG with nodes A, B, C, F, E, D


Size of associations

• The more steps along a given path, the more attenuated the signal (the weaker the transmitted association, a.e.), but

• Distances along distinct paths are not comparable unless all steps (arrows) in both paths are assigned a size!
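Attenuation along a path can be illustrated with a simulated linear chain X → Y → Z. This is a sketch under strong assumptions that are not in the slides: standardized normal variables, linear effects, and a hypothetical step size `b = 0.6` on both arrows. In that special case the correlation transmitted over two steps is roughly the product of the one-step correlations.

```python
import random

def corr(xs, ys):
    """Pearson correlation computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)
n = 50_000
b = 0.6                      # hypothetical size of each arrow
noise_sd = (1 - b * b) ** 0.5  # keeps every variable at unit variance

x = [random.gauss(0, 1) for _ in range(n)]
y = [b * xi + random.gauss(0, noise_sd) for xi in x]   # X -> Y
z = [b * yi + random.gauss(0, noise_sd) for yi in y]   # Y -> Z

r_xy = corr(x, y)   # one step
r_xz = corr(x, z)   # two steps, expected near b * b
print(round(r_xy, 2), round(r_xz, 2))
```

The comparison only works because both arrows were given a size. Without sizes, a two-step path cannot be ranked against a different one-step path, which is the slide's second point.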


EAC larger than EACD (a.e.), but can’t say relative to ECD

Figure: DAG with nodes A, B, C, F, E, D


Spot the open and closed paths given C:

Figure: DAG with nodes A, B, [C], F, E, D


Spot the open and closed paths given F:

Figure: DAG with nodes A, B, C, [F], E, D


“Control” of bias

• Target path: A path that transmits some of the effect we want to estimate; must be a directed path from cause to effect.

• Biasing path: Any other open path between the cause and effect variables.

• By judicious conditioning, we must close all biasing paths without closing target paths or opening new biasing paths.

This isn’t always possible with available data


Confounding

There are many definitions, none universally accepted. My definition:

• Noncausal association transmitted via effects on the outcome

This definition appears to correspond best to the intuitive definitions given since the 19th century: Confounding is a mixing of the effect of interest with other effects on the outcome (Mill, 1843).


Biasing paths I: Confounding paths and confounders

• Confounding path: Any path capable of transmitting confounding

• Confounder: Any variable within a confounding path

• Without conditioning, all biasing paths in a DAG are confounding paths, HOWEVER,

• Upon conditioning, other kinds of bias arise


Confounding paths from E to D: EACD, ECBD, ECD

Figure: DAG with nodes A, B, C, F, E, D


Confounding paths from E to D after conditioning on C: EACBD

Figure: DAG with nodes A, B, [C], F, E, D


Confounding paths from E to D: EACD, ECBD, ECD, EACBD

Figure: DAG with nodes A, B, C, [F], E, D


Confounding paths from E to D: ECD

Figure: DAG with nodes [A], [B], C, [F], E, D


Confounding paths from E to D: None!

Figure: DAG with nodes A, [B], [C], F, E, D
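The path lists on the preceding slides can be reproduced mechanically by enumerating every path between E and D and keeping the open ones that are not directed from E to D. The following is a sketch, not the author's procedure. The arrow set `EDGES` is an assumption inferred from those path lists, since the drawn arrows did not survive extraction, and all function names are hypothetical.

```python
# Assumed arrows for the six-variable example DAG (inferred, not original).
EDGES = {("A", "E"), ("A", "C"), ("B", "C"), ("B", "D"),
         ("C", "E"), ("C", "D"), ("C", "F"), ("E", "D")}

def neighbors(node):
    """Variables joined to node by an arrow in either direction."""
    return sorted({b for a, b in EDGES if a == node} |
                  {a for a, b in EDGES if b == node})

def descendants(node):
    """All variables affected by node through directed paths."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in EDGES:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found

def simple_paths(start, end, path=None):
    """Yield every path from start to end that visits no variable twice."""
    path = path or [start]
    if start == end:
        yield "".join(path)
        return
    for nxt in neighbors(start):
        if nxt not in path:
            yield from simple_paths(nxt, end, path + [nxt])

def is_open(path, conditioned):
    """Apply the collider and noncollider rules at each interior variable."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        if (prev, mid) in EDGES and (nxt, mid) in EDGES:  # collider
            if not ({mid} | descendants(mid)) & conditioned:
                return False
        elif mid in conditioned:                           # noncollider
            return False
    return True

def biasing_paths(cause, effect, conditioned=()):
    """Open paths between cause and effect other than directed target paths."""
    conditioned = set(conditioned)
    found = []
    for path in simple_paths(cause, effect):
        directed = all((a, b) in EDGES for a, b in zip(path, path[1:]))
        if not directed and is_open(path, conditioned):
            found.append(path)
    return sorted(found)

for given in ["", "C", "F", "ABF", "BC"]:
    print(f"given [{given}]:", biasing_paths("E", "D", given) or "none")
```

Under the assumed arrows, the five conditioning sets correspond to the five diagrams above: nothing conditioned, [C], [F], [A][B][F], and [B][C].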


Biasing path from A to B: ACB, which is not a confounding path!

Figure: DAG with nodes A, B, [C], F, E, D


Selection Bias

There are many definitions, none universally accepted. My definition:

• Noncausal association created by nonrandom selection.

This definition appears to correspond best to the intuitive definitions given in epi texts since the mid-20th century.

• Confounding and selection bias overlap, but one is not always the other. (Using graphs, the distinction is not important.)


Confounding that is not selection bias: ECD

Figure: DAG with nodes C, F, E, D


Selection bias that is not confounding: Berksonian bias

Figure: DAG with nodes E, D, [S], T

Uncontrollable biasing path: ESD
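A small exact calculation can illustrate how the path ESD produces an association once selection S is conditioned on. All numbers are hypothetical and not from the slides: D has prevalence 0.2 independently of E, and both E and D raise the probability of selection, so S is a collider ( E → S ← D ).

```python
P_D = 0.2  # hypothetical prevalence of D, independent of E in the source population

def p_select(e, d):
    """Hypothetical selection probability: E and D each raise the chance of S=1."""
    return 0.05 + 0.40 * e + 0.40 * d

def risk_of_d_among_selected(e):
    """P(D=1 | E=e, S=1) by Bayes' rule."""
    selected_cases = P_D * p_select(e, 1)
    selected_noncases = (1 - P_D) * p_select(e, 0)
    return selected_cases / (selected_cases + selected_noncases)

# In the source population E and D are independent, so the risk ratio is 1.
rr_population = P_D / P_D

# Among the selected, the collider S has been conditioned on.
rr_selected = risk_of_d_among_selected(1) / risk_of_d_among_selected(0)

print(rr_population, round(rr_selected, 3))
```

With these inputs the risk ratio among the selected falls well below 1 even though E has no effect on D. The direction and size depend entirely on the assumed selection probabilities.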


Case-control matching is intentional selection bias

We must control the matching factor M to block the bias induced by matching

Figure: DAG with nodes M, E, [S], [D]


M-bias that is both confounding & selection bias (via EACBD)

Figure: DAG with nodes A, B, C, [S], E, D


Collider bias: Selection bias and confounding induced by conditioning

Many variations:

• Berksonian bias

• M-bias

• Confounding produced by control of intermediates to estimate direct effects, or by selection affected by intermediates


E has no direct effect on D, but control of C or F can make it appear so (via ECBD)

Figure: DAG with nodes E, B, [C], F, D


Instrumental variables: ED = AED/AE or ED = FAED/FAE

Figure: DAG with nodes A, (B), E, F, D
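One reading of ED = AED/AE is that the effect of E on D equals the A-D association transmitted through E, divided by the A-E association. In a linear model this is the familiar ratio estimator. The simulation below is a sketch under assumptions that are not in the slides: linear effects, normal errors, and hypothetical coefficients, with B acting as an unobserved common cause of E and D.

```python
import random

def cov(xs, ys):
    """Sample covariance computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

random.seed(1)
n = 50_000
true_effect = 0.5   # hypothetical effect of E on D

a = [random.gauss(0, 1) for _ in range(n)]   # instrument A
b = [random.gauss(0, 1) for _ in range(n)]   # (B): unobserved, affects E and D
e = [0.8 * ai + bi + random.gauss(0, 1) for ai, bi in zip(a, b)]
d = [true_effect * ei + bi + random.gauss(0, 1) for ei, bi in zip(e, b)]

# Crude regression slope of D on E, confounded through the unobserved B.
crude = cov(e, d) / cov(e, e)

# Ratio of the A-D association to the A-E association.
iv_estimate = cov(a, d) / cov(a, e)

print(round(crude, 2), round(iv_estimate, 2))
```

The ratio recovers a value near `true_effect` only because A affects D solely through E and shares no cause with D in this simulation. Those conditions are built into the data-generating code, not checked by it.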


Differential measurement error: Can’t tell direction of bias without further info

Figure: DAG with nodes A, B, (E), E*, D


Independent nondifferential error: bias toward the null in typical cases

Figure: DAG with nodes A, C, (E), E*, D
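A worked example of one such typical case: a binary exposure E is misclassified as E* with the same sensitivity and specificity regardless of D. Every number below is hypothetical and chosen only for illustration. The calculation is exact (expected proportions), not a simulation.

```python
# Hypothetical inputs, none taken from the slides.
P_E = 0.3                    # prevalence of true exposure E
RISK = {1: 0.20, 0: 0.10}    # risk of D by true exposure; true RR = 2
SENS, SPEC = 0.8, 0.9        # error in E* does not depend on D

def p_measured(e_star, e):
    """P(E* = e_star | E = e), the same for cases and noncases."""
    p_positive = SENS if e == 1 else 1 - SPEC
    return p_positive if e_star == 1 else 1 - p_positive

def risk_given_measured(e_star):
    """P(D=1 | E* = e_star), mixing truly exposed and truly unexposed."""
    joint_case = sum(p_e * p_measured(e_star, e) * RISK[e]
                     for e, p_e in ((1, P_E), (0, 1 - P_E)))
    marginal = sum(p_e * p_measured(e_star, e)
                   for e, p_e in ((1, P_E), (0, 1 - P_E)))
    return joint_case / marginal

rr_true = RISK[1] / RISK[0]
rr_observed = risk_given_measured(1) / risk_given_measured(0)
print(rr_true, round(rr_observed, 3))
```

The observed risk ratio lands between 1 and the true value here. The slide's qualifier "in typical cases" matters: this sketch covers one binary-exposure setting and does not show that the result holds in general.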


Effect-Measure Modification (Heterogeneity)

Sander Greenland

Epidemiology and Statistics

University of California


The term “interaction” gets used for several distinct phenomena:

• Biologic interaction (synergy, antagonism, coaction): One factor changes the physical mechanism of action of another.

• “Statistical interaction”: Change in a measure (of effect or association) upon change in a third factor.

In the 1970s, few researchers understood the difference. Many still don’t today (e.g., in genetics).


Solution: Invent new term for “statistical interaction”…

Effect Modification (Miettinen, 1974)

Unfortunately, the term still suggests biologic interaction, so Rothman and Greenland (1998) call it

Effect-Measure Modification (EMM)

Greenland prefers heterogeneity (of effect or association), already in use in statistics


Consider the simple case with no confounding:

Figure: DAG with nodes C, E, D


Presence & direction of EMM depends on measure! (Berkson, 1958)

              C=1                    C=0
         E=1       E=0          E=1       E=0
D=1      32        20           10        4
N        10^5      10^5         10^5      10^5
RD:      32-20 = 12 per 10^5    10-4 = 6 per 10^5
RR:      32/20 = 1.6            10/4 = 2.5

NOTE: NO CONFOUNDING PRESENT!
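The table's arithmetic can be checked with a few lines of Python. The case counts come from the slide. Reading the extracted denominator "105" as 10^5 is an assumption, and the names `CASES`, `risk_difference`, and `risk_ratio` are hypothetical.

```python
N = 10**5  # assumed denominator per cell (the extracted text shows "105")

# Cases of D by stratum of C and level of E, as given on the slide.
CASES = {
    "C=1": {"E=1": 32, "E=0": 20},
    "C=0": {"E=1": 10, "E=0": 4},
}

def risk_difference(stratum):
    """Risk difference for E within a stratum, expressed per N persons."""
    cell = CASES[stratum]
    return (cell["E=1"] / N - cell["E=0"] / N) * N

def risk_ratio(stratum):
    """Risk ratio for E within a stratum."""
    cell = CASES[stratum]
    return (cell["E=1"] / N) / (cell["E=0"] / N)

for stratum in CASES:
    print(stratum,
          "RD per N:", round(risk_difference(stratum), 1),
          "RR:", round(risk_ratio(stratum), 2))
```

The RD is larger where C=1 while the RR is larger where C=0, so the direction of modification reverses with the choice of measure.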


Only the RD has a simple relation to biologic interaction:

• If RD changes across strata and there is no bias, this implies biologic interaction must be present (known in bioassay since the 1920s).

• Unfortunately, nearly all epi studies present RRs only, so confusion remains.

• EMM has no bearing on confounding: Both require C to be a risk factor given E, but one can be present with the other absent!


Most epi studies have little power to detect EMM, hence:

A literature distortion is created:

• Studies examine only the RR

• All of them fail to detect RR modification

• Hence reviewers conclude there is no RR modification (that the RR is homogeneous)

BUT, if they had examined only RD instead,

• All of them would fail to detect RD modification and reviewers would infer that the RD is homogeneous!