
Causes and coincidences

Tom Griffiths

Cognitive and Linguistic Sciences

Brown University

“It could be that, collectively, the people in New York caused those lottery numbers to come up 9-1-1… If enough people all are thinking the same thing, at the same time, they can cause events to happen… It's called psychokinesis.”

[Figure (Halley, 1752): recorded comet appearances separated by intervals of 76 years and 75 years]

The paradox of coincidences

How can coincidences simultaneously lead us to irrational conclusions and significant discoveries?

Outline

1. A Bayesian approach to causal induction

2. Coincidences

i. what makes a coincidence?

ii. rationality and irrationality

iii. the paradox of coincidences

3. Explaining inductive leaps

Outline

1. A Bayesian approach to causal induction

2. Coincidences

i. what makes a coincidence?

ii. rationality and irrationality

iii. the paradox of coincidences

3. Explaining inductive leaps

Causal induction

• Inferring causal structure from data

• A task we perform every day…
  – does caffeine increase productivity?

• …and throughout science
  – three comets or one?


Reverend Thomas Bayes

Bayes’ theorem

p(h | d) = p(d | h) p(h) / Σ_{h′ ∈ H} p(d | h′) p(h′)

posterior probability = likelihood × prior probability, normalized by a sum over the space of hypotheses

h: hypothesis
d: data
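As a concrete illustration (my addition, not from the slides), here is a minimal Python sketch of Bayes' theorem over a discrete hypothesis space; the two coin hypotheses and the numbers are invented.

# Minimal sketch of Bayes' theorem over a discrete hypothesis space.
# The hypotheses and probabilities below are illustrative, not from the talk.

def posterior(likelihoods, priors):
    """Return p(h | d) for every hypothesis h.

    likelihoods: dict mapping hypothesis -> p(d | h)
    priors:      dict mapping hypothesis -> p(h)
    """
    # Numerator of Bayes' rule for each hypothesis: p(d | h) p(h)
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Denominator: sum over the space of hypotheses H
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}

# Illustrative example: two hypotheses about a coin that produced 10 heads
likelihoods = {"fair": 0.5 ** 10,   # p(d | fair coin)
               "two-headed": 1.0}   # p(d | two-headed coin)
priors = {"fair": 0.99, "two-headed": 0.01}
print(posterior(likelihoods, priors))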

Bayesian causal induction

Hypotheses:   causal structures
Likelihoods:
Priors:
Data:

Causal graphical models (Pearl, 2000; Spirtes et al., 1993)

• Variables
• Structure
• Conditional probabilities

X       Y
  \   /
    Z

p(x), p(y), p(z | x, y)

Defines a probability distribution over the variables (for both observation and intervention)
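To make the three ingredients concrete, here is a small Python sketch (my own illustration) of the common-effect graph above; the conditional probability tables are invented.

# Sketch of a causal graphical model for the common-effect structure X -> Z <- Y.
# Variables are binary; the conditional probability tables below are invented
# for illustration.

import itertools

p_x = {1: 0.3, 0: 0.7}                      # p(x)
p_y = {1: 0.4, 0: 0.6}                      # p(y)
p_z_given_xy = {                            # p(z = 1 | x, y)
    (0, 0): 0.05, (0, 1): 0.6,
    (1, 0): 0.7,  (1, 1): 0.9,
}

def joint(x, y, z):
    """p(x, y, z) = p(x) p(y) p(z | x, y), as the graph structure dictates."""
    pz1 = p_z_given_xy[(x, y)]
    pz = pz1 if z == 1 else 1 - pz1
    return p_x[x] * p_y[y] * pz

# The joint distribution sums to 1 over all assignments of the variables.
total = sum(joint(x, y, z) for x, y, z in itertools.product([0, 1], repeat=3))
print(total)  # 1.0 (up to floating point)

Intervening on Z would simply overwrite p(z | x, y) while leaving p(x) and p(y) untouched, which is why the same graph supports reasoning about both observation and intervention.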

Bayesian causal induction

Hypotheses:   causal structures
Likelihoods:  probability distribution over variables
Priors:       a priori plausibility of structures
Data:         observations of variables

Causal induction from contingencies

Buehner & Cheng (1997)

“Does C cause E?” (rate on a scale from 0 to 100)

                   E present (e+)   E absent (e-)
C present (c+)           a                b
C absent (c-)            c                d

“Does the chemical cause gene expression?” (rate on a scale from 0 to 100)

                         Gene expressed (e+)   Gene not expressed (e-)
Chemical present (c+)            6                       2
Chemical absent (c-)             4                       4

[Plot: people’s mean causal ratings (Buehner & Cheng, 1997)]

Examined human judgments for all values of P(e+|c+) and P(e+|c-) in increments of 0.25

How can we explain these judgments?

Bayesian causal induction

Hypotheses:   “cause”: B → E ← C              “chance”: B → E (C unconnected)
Likelihoods:  each cause has an independent opportunity to produce the effect
Priors:       p (cause)                       1 - p (chance)
Data:         frequency of cause-effect co-occurrence

p(cause | d) = p(d | cause) p(cause) / [ p(d | cause) p(cause) + p(d | chance) p(chance) ]

Equivalently, in odds form:

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]

where the likelihood ratio p(d | cause) / p(d | chance) is the evidence for a causal relationship.
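To make the evidence term concrete, here is a sketch (my own, not from the slides) of the likelihood ratio for these two graphs, using noisy-OR likelihoods ("each cause has an independent opportunity to produce the effect"), an always-present background cause B, uniform priors on the unknown causal strengths, and crude grid integration; the parameter names and grid size are my choices.

# Sketch: likelihood ratio for the cause vs. chance graphs, with noisy-OR
# likelihoods and uniform priors on the causal strengths w0 (background B,
# assumed always present) and w1 (candidate cause C). Grid integration is a
# crude but adequate approximation for illustration.

import numpy as np

def marginal_likelihoods(a, b, c, d, grid=200):
    """a, b, c, d: counts for (c+,e+), (c+,e-), (c-,e+), (c-,e-)."""
    w = (np.arange(grid) + 0.5) / grid          # grid over [0, 1]
    w0, w1 = np.meshgrid(w, w, indexing="ij")

    # Chance graph: only the background cause B can produce E, so
    # P(e+ | c+) = P(e+ | c-) = w0.
    like_chance = w0[:, 0] ** (a + c) * (1 - w0[:, 0]) ** (b + d)
    p_d_chance = like_chance.mean()             # integrate over uniform w0

    # Cause graph (noisy-OR): P(e+ | c+) = 1 - (1 - w0)(1 - w1), P(e+ | c-) = w0.
    p_e_cplus = 1 - (1 - w0) * (1 - w1)
    like_cause = (p_e_cplus ** a * (1 - p_e_cplus) ** b
                  * w0 ** c * (1 - w0) ** d)
    p_d_cause = like_cause.mean()               # integrate over uniform w0, w1

    return p_d_cause, p_d_chance

# Buehner & Cheng-style contingencies from the slide: 6/8 effects with the
# chemical, 4/8 without.
p_cause, p_chance = marginal_likelihoods(6, 2, 4, 4)
print("likelihood ratio:", p_cause / p_chance)

The log of this ratio corresponds to what Griffiths & Tenenbaum call causal support; the slides do not specify the exact model behind the fit reported next, so this should be read as an illustration of the computation rather than a reconstruction of it.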

[Plots: fits to the Buehner and Cheng (1997) judgments: People vs. Bayes (r = 0.97), ΔP (r = 0.89), Power (r = 0.88)]
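For reference (my addition), the two comparison models named in the plot are simple functions of the contingency table: ΔP = P(e+|c+) - P(e+|c-), and causal power (Cheng, 1997) = ΔP / (1 - P(e+|c-)).

# Delta-P and causal power, the two comparison models in the plot above,
# computed from a contingency table (a, b, c, d) = (c+e+, c+e-, c-e+, c-e-).

def delta_p(a, b, c, d):
    """Delta-P = P(e+ | c+) - P(e+ | c-)."""
    return a / (a + b) - c / (c + d)

def causal_power(a, b, c, d):
    """Causal power (Cheng, 1997) = Delta-P / (1 - P(e+ | c-))."""
    p_e_given_no_c = c / (c + d)
    return delta_p(a, b, c, d) / (1 - p_e_given_no_c)

# The chemical/gene example from the earlier slide:
print(delta_p(6, 2, 4, 4))       # 0.25
print(causal_power(6, 2, 4, 4))  # 0.5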

Other predictions

• Causal induction from contingency data
  – sample size effects
  – judgments for incomplete contingency tables
  (Griffiths & Tenenbaum, in press)

• More complex cases
  – detectors (Tenenbaum & Griffiths, 2003)
  – explosions (Griffiths, Baraff, & Tenenbaum, 2004)
  – simple mechanical devices

[Figure: the stick-ball machine, balls A and B (Kushnir, Schulz, Gopnik, & Danks, 2003)]

Outline

1. A Bayesian approach to causal induction

2. Coincidences

i. what makes a coincidence?

ii. rationality and irrationality

iii. the paradox of coincidences

3. Explaining inductive leaps

What makes a coincidence?

A common definition: Coincidences are unlikely events

“an event which seems so unlikely that it is worth telling a story about”

“we sense that it is too unlikely to have been the result of luck or mere chance”

Coincidences are not just unlikely...

HHHHHHHHHH  vs.  HHTHTHTTHT

Bayesian causal induction

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]
                                 (likelihood ratio, the evidence)      (prior odds)

                          Likelihood ratio (evidence)
                              low             high
Prior odds    high             ?              cause
              low            chance        coincidence

What makes a coincidence?

A coincidence is an event that provides evidence for causal structure, but not enough evidence to make us believe that structure exists

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]

posterior odds are middling = (likelihood ratio is high) × (prior odds are low)

HHHHHHHHHH  vs.  HHTHTHTTHT

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]

For HHHHHHHHHH: the likelihood ratio is high, the prior odds are low, so the posterior odds are middling

Bayesian causal induction

Hypotheses:   “cause”: C → E                  “chance”: C and E unconnected
Likelihoods:  0 < p(E) < 1                    p(E) = 0.5
Priors:       p (small)                       1 - p
Data:         frequency of effect in presence of cause
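A small sketch (mine, not from the slides) of this comparison for the two sequences: under "chance" every flip has p(E) = 0.5, while under "cause" the unknown p(E) is integrated out under a uniform prior, giving the Beta-Bernoulli marginal likelihood.

# Likelihood ratio for the coin-flip version of cause vs. chance.
# chance: p(heads) = 0.5 for every flip.
# cause:  p(heads) unknown, 0 < p(E) < 1, integrated out under a uniform prior,
#         which gives the marginal likelihood k!(n-k)!/(n+1)! = 1/((n+1) C(n,k)).

from math import comb

def likelihood_ratio(sequence):
    n = len(sequence)
    k = sequence.count("H")
    p_d_chance = 0.5 ** n
    p_d_cause = 1.0 / ((n + 1) * comb(n, k))   # = Beta(k+1, n-k+1)
    return p_d_cause / p_d_chance

print(likelihood_ratio("HHHHHHHHHH"))  # high: ~93
print(likelihood_ratio("HHTHTHTTHT"))  # low:  ~0.4

With a small prior probability p for "cause", the posterior odds for HHHHHHHHHH end up middling rather than decisive, which is what makes it feel like a coincidence.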

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]

HHHHHHHHHH:  likelihood ratio is high, prior odds are low → posterior odds are middling → coincidence

HHTHTHTTHT:  likelihood ratio is low, prior odds are low → posterior odds are low → chance

HHHH:                 likelihood ratio is middling, prior odds are low → posterior odds are low → mere coincidence

HHHHHHHHHH:           likelihood ratio is high, prior odds are low → posterior odds are middling → suspicious coincidence

HHHHHHHHHHHHHHHHHH:   likelihood ratio is very high, prior odds are low → posterior odds are high → cause

Mere and suspicious coincidences

mere coincidence → suspicious coincidence
(increasing evidence for a causal relation, p(cause | d) / p(chance | d))

• Transition produced by
  – increase in likelihood ratio (e.g., coin flipping; see the sketch below)
  – increase in prior odds (e.g., genetics vs. ESP)
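Extending the earlier coin-flip sketch (again my own illustration), the posterior odds show the mere → suspicious → cause progression once a small prior probability of "cause" is assumed; the value p(cause) = 0.01 is an arbitrary choice.

# Posterior odds for increasingly long runs of heads, under the same
# Beta-Bernoulli "cause" model as before and an (arbitrary) small prior
# p(cause) = 0.01.

from math import comb

def posterior_odds(sequence, prior_cause=0.01):
    n, k = len(sequence), sequence.count("H")
    p_d_chance = 0.5 ** n
    p_d_cause = 1.0 / ((n + 1) * comb(n, k))
    likelihood_ratio = p_d_cause / p_d_chance
    prior_odds = prior_cause / (1 - prior_cause)
    return likelihood_ratio * prior_odds

for seq in ["HHHH", "HHHHHHHHHH", "HHHHHHHHHHHHHHHHHH"]:
    print(seq, round(posterior_odds(seq), 2))
# roughly 0.03 (low), 0.94 (middling), ~139 (high): mere coincidence,
# suspicious coincidence, cause.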

Testing the definition

• Provide participants with data from experiments

• Manipulate:
  – cover story: genetic engineering vs. ESP (prior)
  – data: number of males/heads (likelihood)
  – task: “coincidence or evidence?” vs. “how likely?”

• Predictions:
  – coincidences affected by prior and likelihood
  – relationship between coincidence and posterior

[Plots: proportion of participants saying “coincidence” and posterior probability, as a function of the number of heads/males (47, 51, 55, 59, 63, 70, 87, 99); r = -0.98]

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]
                                 (likelihood ratio, the evidence)      (prior odds)

                          Likelihood ratio (evidence)
                              low             high
Prior odds    high             ?              cause
              low            chance        coincidence

Rationality and irrationality

The bombing of London (Gilovich, 1991)

[Plot: people’s judgments for changes in the spread, location, ratio, and number of points, relative to uniform]

Bayesian causal induction

Hypotheses:   “cause”: a target T attracts the bombs X        “chance”: no target
Likelihoods:  uniform + regularity                            uniform
Priors:       p                                               1 - p
Data:         bomb locations
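One simple way to instantiate "uniform" vs. "uniform + regularity" (my own sketch, not the model behind the fit reported next): under chance the bomb locations are uniform on the unit square; under cause some fraction of them cluster around an unknown target whose location is integrated out. The mixture weight, cluster width, and grid are arbitrary choices.

# Sketch of the likelihood ratio for spatial data: uniform ("chance") vs.
# uniform plus a cluster around an unknown target ("uniform + regularity").
# Points live in the unit square; the mixture weight lam, cluster width sigma,
# and the integration grid are arbitrary choices.

import numpy as np

def log_likelihood_ratio(points, lam=0.5, sigma=0.1, grid=50):
    pts = np.asarray(points)                       # shape (n, 2)
    # chance: uniform density 1 on the unit square, so log p(d | chance) = 0.
    log_p_chance = 0.0

    # cause: each point comes from the cluster with probability lam, else is
    # uniform; average the likelihood over a grid of possible target locations.
    axis = (np.arange(grid) + 0.5) / grid
    tx, ty = np.meshgrid(axis, axis)
    targets = np.stack([tx.ravel(), ty.ravel()], axis=1)    # (grid^2, 2)

    sq_dist = ((pts[None, :, :] - targets[:, None, :]) ** 2).sum(-1)
    gauss = np.exp(-sq_dist / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    per_point = lam * gauss + (1 - lam) * 1.0               # mixture density
    log_p_cause_given_target = np.log(per_point).sum(axis=1)
    log_p_cause = np.log(np.mean(np.exp(log_p_cause_given_target)))

    return log_p_cause - log_p_chance

rng = np.random.default_rng(0)
scattered = rng.uniform(size=(20, 2))                       # spread out
clustered = 0.5 + 0.05 * rng.standard_normal((20, 2))       # tightly clustered
print(log_likelihood_ratio(scattered))   # near or below 0: little evidence
print(log_likelihood_ratio(clustered))   # large and positive: strong evidence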

[Plot: people vs. Bayes for changes in the spread, location, ratio, and number of points, relative to uniform; r = 0.98]

Coincidences in date

May 14, July 8, August 21, December 25
vs.
August 3, August 3, August 3, August 3

[Plot: people’s judgments]

Bayesian causal induction

Hypotheses:   “cause”: some process selects the birthdays     “chance”: no such process
Likelihoods:  uniform + regularity                            uniform
Priors:       p                                               1 - p
Data:         birthdays of those present

[Diagram: people (P) whose birthdays (B) cluster in August; plot: people vs. Bayes]

Rationality and irrationality

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]

• People’s sense of the strength of coincidences gives a close match to the likelihood ratio
  – bombing and birthdays

• Suggests that we accept false conclusions when our prior odds are insufficiently low

Rationality and irrationality

                          Likelihood ratio (evidence)
                              low             high
Prior odds    high             ?              cause
              low            chance        coincidence

The paradox of coincidences

Prior odds can be low for two reasons

Reason                        Consequence
Incorrect current theory      Significant discovery
Correct current theory        False conclusion

Attending to coincidences makes more sense the less you know

Coincidences

• Provide evidence for causal structure, but not enough to make us believe that structure exists

• Intimately related to causal induction
  – an opportunity to discover a theory is wrong

• Guided by a well-calibrated sense of when an event provides evidence of causal structure

Outline

1. A Bayesian approach to causal induction

2. Coincidences

i. what makes a coincidence?

ii. rationality and irrationality

iii. the paradox of coincidences

3. Explaining inductive leaps

Explaining inductive leaps

• How do people
  – infer causal relationships
  – identify the work of chance
  – predict the future
  – assess similarity and make generalizations
  – learn functions, languages, and concepts
  … from such limited data?

• What knowledge guides human inferences?

Which sequence seems more random?

HHHHHHHHHH  vs.  HHTHTHTTHT

Subjective randomness

• Typically evaluated in terms of p(d | chance)

• Assessing randomness is part of causal induction

p(chance | d) / p(cause | d) = [ p(d | chance) / p(d | cause) ] × [ p(chance) / p(cause) ]

where the likelihood ratio p(d | chance) / p(d | cause) is the evidence for a random generating process.
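A sketch (mine) of how this evidence term could be scored for binary sequences, reusing the fair-coin "chance" model and the uniform-bias "cause" model from the coin-flipping example; the sign convention is my choice.

# Subjective randomness scored as the evidence for a random generating process:
# log p(d | chance) - log p(d | cause), reusing the fair-coin vs. unknown-bias
# models from the earlier example. Higher scores = looks more random.

from math import comb, log

def randomness_score(sequence):
    n, k = len(sequence), sequence.count("H")
    log_p_chance = n * log(0.5)
    log_p_cause = -log((n + 1) * comb(n, k))   # uniform prior on the bias
    return log_p_chance - log_p_cause

print(randomness_score("HHTHTHTTHT"))  # positive: more random-looking
print(randomness_score("HHHHHHHHHH"))  # negative: evidence against chance

Note that this toy model only tracks the number of heads, so it would not distinguish HHHHHTTTTT from HHTHTHTTHT; it is meant only to show the shape of the computation.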

Randomness and coincidences

p(chance | d) / p(cause | d) = [ p(d | chance) / p(d | cause) ] × [ p(chance) / p(cause) ]
(evidence for a random generating process)

p(cause | d) / p(chance | d) = [ p(d | cause) / p(d | chance) ] × [ p(cause) / p(chance) ]
(strength of coincidence)

Randomness and coincidences

[Plots: “How big a coincidence?” vs. “How random?” for the bombing stimuli (r = -0.96) and the birthday stimuli (r = -0.94)]

Pick a random number…

[Plots: distributions over the digits 0-9 for people and for Bayes]

Bayes’ theorem

p(h | d) = p(d | h) p(h) / Σ_{h′ ∈ H} p(d | h′) p(h′)

Bayes’ theorem

inference = f(data, knowledge)

Predicting the future

Human predictions match optimal predictions from empirical prior
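A sketch (mine) of what "optimal predictions from an empirical prior" can mean here, assuming the task is to predict a total duration from its current duration using the posterior median; the prior samples below are invented for illustration.

# Sketch of optimal prediction from an empirical prior: given that a quantity
# has lasted t so far, predict its total extent t_total using the posterior
# median of p(t_total | t_total >= t). The "empirical prior" samples here are
# invented for illustration.

import numpy as np

def predict_total(t_current, prior_samples):
    """Posterior median of t_total given t_total >= t_current,
    assuming t_current is uniform over [0, t_total] for each candidate total."""
    totals = np.asarray(prior_samples, dtype=float)
    consistent = totals[totals >= t_current]
    # Likelihood of observing t_current given a total: uniform, so 1 / t_total.
    weights = 1.0 / consistent
    weights /= weights.sum()
    order = np.argsort(consistent)
    cdf = np.cumsum(weights[order])
    return consistent[order][np.searchsorted(cdf, 0.5)]

# e.g. an imagined prior over human lifespans (in years), roughly bell-shaped
rng = np.random.default_rng(1)
lifespans = rng.normal(75, 12, size=10000).clip(1)
print(predict_total(40, lifespans))   # close to (slightly below) the prior mean of 75
print(predict_total(96, lifespans))   # just above the current age of 96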

Iterated learning (Briscoe, 1998; Kirby, 2001)

d0 → h1 → d1 → h2 → …

Each learner infers a hypothesis from data (learning: inference using p(h | d)), then produces data for the next learner (production: sampling from p(d | h)).

(Griffiths & Kalish, submitted)

[Figure: results across iterations 1-9]
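A minimal simulation (my own, with an invented two-hypothesis setup) of such a chain: each learner applies Bayes' rule to the previous learner's data, samples a hypothesis from the posterior, and produces data for the next learner. Under these sampling assumptions the chain over hypotheses converges to the prior.

# Minimal iterated-learning chain: two hypotheses about a coin, Bayesian
# learners who sample a hypothesis from the posterior and then produce data.
# The hypotheses, prior, and chain length are invented for illustration.

import random

HYPOTHESES = {"fair": 0.5, "biased": 0.9}     # p(heads) under each hypothesis
PRIOR = {"fair": 0.6, "biased": 0.4}

def posterior(data):
    """p(h | d) by Bayes' rule for a sequence of coin flips."""
    joint = {}
    for h, p_heads in HYPOTHESES.items():
        like = 1.0
        for flip in data:
            like *= p_heads if flip == "H" else 1 - p_heads
        joint[h] = like * PRIOR[h]
    z = sum(joint.values())
    return {h: v / z for h, v in joint.items()}

def produce(h, n=10):
    """Sample n flips from p(d | h)."""
    p = HYPOTHESES[h]
    return ["H" if random.random() < p else "T" for _ in range(n)]

random.seed(0)
data = produce("biased")                       # d0
counts = {"fair": 0, "biased": 0}
for _ in range(5000):                          # many generations
    post = posterior(data)
    h = "fair" if random.random() < post["fair"] else "biased"   # learning
    counts[h] += 1
    data = produce(h)                          # production
print(counts)   # proportions approach the prior (0.6 / 0.4), not d0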

Conclusion

• Many cognitive judgments are the result of challenging problems of induction

• Bayesian statistics provides a formal framework for exploring how people solve these problems

• Makes it possible to ask…
  – how do we make surprising discoveries?
  – how do we learn so much from so little?
  – what knowledge guides our judgments?

Collaborators

• Causal induction
  – Josh Tenenbaum (MIT)
  – Liz Baraff (MIT)

• Iterated learning
  – Mike Kalish (University of Louisiana)

Causes and coincidences

“coincidence” appears in 13/60 cases

p(“cause”) = 0.01

p(“cause”|“coincidence”) = 0.26


A reformulation: unlikely kinds

• Coincidences are events of an unlikely kind
  – e.g. a sequence with that number of heads

• Deals with the obvious problem…

p(10 heads) < p(5 heads, 5 tails)
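For concreteness (my arithmetic, not on the slide): p(10 heads) = (1/2)^10 ≈ 0.001, while p(5 heads and 5 tails, in any order) = C(10,5) (1/2)^10 = 252/1024 ≈ 0.246, so the all-heads outcome is indeed far less probable under a fair coin.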

Problems with unlikely kinds

• Defining kinds

August 3, August 3, August 3, August 3

January 12, March 22, March 22, July 19, October 1, December 8

Problems with unlikely kinds

• Defining kinds

• Counterexamples

P(4 heads) < P(2 heads, 2 tails)

P(4 heads) > P(15 heads, 8 tails)

HHHH > HHHHTHTTHHHTHTHHTHTTHHH

HHHH > HHTT
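A quick check of these counterexamples (my own; "P(k heads, m tails)" is read as the probability of that composition, in any order, under a fair coin):

# Check the counterexample probabilities under a fair coin, reading
# "P(k heads, m tails)" as the probability of that composition in any order.

from math import comb

def p_kind(heads, tails):
    n = heads + tails
    return comb(n, heads) * 0.5 ** n

print(p_kind(4, 0))    # 0.0625
print(p_kind(2, 2))    # 0.375   -> P(4 heads) < P(2 heads, 2 tails)
print(p_kind(15, 8))   # ~0.058  -> P(4 heads) > P(15 heads, 8 tails)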
