
A Bayesian Approach to the Reading Process: From Networks to Human Data


Page 1: A Bayesian Approach to the Reading Process: From Networks to Human Data

A Bayesian Approach
to the Reading Process:

From Networks to Human Data

David A. Medler
Center for the Neural Basis of Cognition

Carnegie Mellon University

Page 2: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Connections

• The Bayesian Approach to Cognitive Neuroscience
  – How do we represent the world?
  – Bayesian Connectionist Framework.

• Bayesian Generative Networks
  – Learning letters.
  – How does context affect learning?
  – Empirical and Simulation Results.

• Symmetric Diffusion Networks
  – The Ambiguity Advantage/Disadvantage.

• Closing Remarks

Page 3: A Bayesian Approach to the Reading Process: From Networks to Human Data

Representing the World

• Problem: how do we form meaningful internal representations, P(H), given our observations of the external world, P(D)?

[Diagram: the observed external world D, with probability P(D), mapped to an internal hypothesis H, with probability P(H)]

Page 4: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Theory

• For a given hypothesis, H, and observed data, D, the posterior probability of H given D is computed as:

$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}$$

where
  – P(H) = prior probability of the hypothesis, H
  – P(D) = probability of the data, D
  – P(D | H) = probability of D given H
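As a concrete illustration, here is the rule applied to a toy letter-perception question in Python (all probabilities are invented for this example, not taken from the talk):

```python
# Toy Bayes-rule computation; every number below is made up for illustration.
# H = "the displayed letter is E"; D = "the middle horizontal segment is lit".
p_h = 0.04          # prior P(H): roughly 1-in-26, nudged by letter frequency
p_d_given_h = 0.99  # likelihood P(D | H): an E almost always lights this segment
p_d = 0.30          # evidence P(D): the segment is lit for ~30% of letters overall

p_h_given_d = p_d_given_h * p_h / p_d  # Bayes' rule
print(f"P(H | D) = {p_h_given_d:.3f}")  # 0.132: the evidence roughly triples belief in E
```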


Page 6: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Connectionism

[Diagram: a three-layer architecture with a surface layer P(D) at the bottom, a mediating layer, and a representation layer P(H) at the top]


Page 9: A Bayesian Approach to the Reading Process: From Networks to Human Data

It was 20 years ago today...

An Interactive Activation Model of Context Effects in Letter Perception

James L. McClelland & David E. Rumelhart (1981; 1982)

• Word superiority effect
  – words > pseudowords > nonwords

• The model accounted for the time course of perceptual identification.

Page 10: A Bayesian Approach to the Reading Process: From Networks to Human Data

Interactive Activation Model

[Diagram: three interconnected levels of units: feature level, letter level, and word level]
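The slides show only the architecture, but the flavor of the model's dynamics is easy to sketch. Below is a simplified interactive-activation update in Python (a generic stand-in, not the published 1981 parameter values): active units excite consistent units and inhibit competitors, and every unit decays toward its resting level.

```python
import numpy as np

def ia_step(act, W, rest=-0.1, decay=0.1, a_min=-0.2, a_max=1.0):
    """One simplified interactive-activation update (illustrative only).

    act: activations of all units (features, letters, words) in one vector.
    W:   signed connection matrix; positive entries are excitatory
         (e.g., letter T supports word TAKE), negative entries are
         inhibitory (e.g., competing words suppress one another).
    """
    net = W @ np.clip(act, 0.0, None)            # only active units send output
    effect = np.where(net > 0.0,
                      net * (a_max - act),       # excitation scaled by headroom
                      net * (act - a_min))       # inhibition scaled by floor room
    return np.clip(act + effect - decay * (act - rest), a_min, a_max)
```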


Page 12: A Bayesian Approach to the Reading Process: From Networks to Human Data

20 Years Later...

• Interactive Activation (IA) Model has been influential.

• Many positives, but 20 years of negatives.

• Internal representations are hard-coded:

The Interactive Activation Model does not learn!

Page 13: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Connections

• The Bayesian Approach to Cognitive Neuroscience
  – How do we represent the world?
  – Bayesian Connectionist Framework.

• Bayesian Generative Networks
  – Learning letters.
  – How does context affect learning?
  – Empirical and Simulation Results.

• Symmetric Diffusion Networks
  – The Ambiguity Advantage/Disadvantage.

• Closing Remarks

Page 14: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Generative Networks

• Our initial work expands the Bayesian Generative Network framework of Lewicki and Sejnowski (1997).

• It is an unsupervised learning paradigm for multilayered architectures.

• We simplified the network equations, added sparse-coding constraints, and included a "supervised" component.

Page 15: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Generative Networks

[Diagram: the same three-layer architecture: surface layer P(D), mediating layer, and representation layer P(H)]


Page 17: A Bayesian Approach to the Reading Process: From Networks to Human Data

Sparse Coding Constraints

• We modified the basic framework to include "sparse coding" constraints.

• These constraints act as a Bayesian prior on the types of representations learned.

• Sparse coding encourages the network to represent any given input pattern with relatively few units.
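The slides do not give the exact form of the prior; a common choice, shown here as a minimal sketch, is a Laplace prior over representation activities, which contributes an L1 penalty to the objective being maximized:

```python
import numpy as np

def sparse_log_prior(h, scale=1.0):
    """Log of a Laplace (sparse) prior over representation activities h,
    up to an additive constant. Adding this term to the data log-likelihood
    penalizes large activities, so each input ends up coded by few units."""
    return -np.sum(np.abs(h)) / scale
```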

Page 18: A Bayesian Approach to the Reading Process: From Networks to Human Data

Step 1: Learning the Alphabet

• The first stage of the IA model is the mapping between features and letters.

• We use the Rumelhart & Siple (1974) character features.

Page 19: A Bayesian Approach to the Reading Process: From Networks to Human Data

Network Learning

• 16 surface units (corresponding to 16 line segments)

• 30 representation units

• Trained for 50 epochs (evaluated at epochs 1, 10, 25, and 50)

• Evaluated:
  – the generative capability of the network
  – the internal representations formed
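As a rough, runnable sketch of this setup (a tied-weights reconstruction network standing in for the actual Bayesian Generative Network equations, which the slides do not spell out; random binary patterns stand in for the Rumelhart-Siple feature vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
letters = rng.integers(0, 2, size=(26, 16)).astype(float)  # stand-in RS features

W = rng.normal(0.0, 0.1, size=(16, 30))  # generative weights: 30 rep -> 16 surface units
lr, lam = 0.1, 0.05                      # learning rate, sparse-coding strength

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(1, 51):
    err_total = 0.0
    for x in letters:
        h = sigmoid(W.T @ (2.0 * x - 1.0))   # crude recognition pass (our assumption)
        h = np.maximum(h - lam, 0.0)         # soft-threshold: the sparse-coding prior
        g = sigmoid(W @ h)                   # generative pass: regenerate the surface
        delta = (x - g) * g * (1.0 - g)      # gradient of squared reconstruction error
        W += lr * np.outer(delta, h)         # improve the generative weights
        err_total += np.abs(x - g).sum()
    if epoch in (1, 10, 25, 50):             # the evaluation points from the slides
        print(f"epoch {epoch}: mean segment error = {err_total / len(letters):.3f}")
```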

Page 20: A Bayesian Approach to the Reading Process: From Networks to Human Data

Generating the Alphabet

[Chart: average "segment" error (0 to 6) at epochs 1, 10, 25, and 50, comparing networks trained with and without sparse coding]

Page 21: A Bayesian Approach to the Reading Process: From Networks to Human Data

Interpreting Weight Structure


Page 23: A Bayesian Approach to the Reading Process: From Networks to Human Data

Network Weights

[Figure: weight patterns for representation units 1 to 30, with and without sparse coding, at epoch 1]

Page 24: A Bayesian Approach to the Reading Process: From Networks to Human Data

Network Weights

[Figure: weight patterns for representation units 1 to 30, with and without sparse coding, at epoch 10]

Page 25: A Bayesian Approach to the Reading Process: From Networks to Human Data

Network Weights

[Figure: weight patterns for representation units 1 to 30, with and without sparse coding, at epoch 25]

Page 26: A Bayesian Approach to the Reading Process: From Networks to Human Data

Network Weights

[Figure: weight patterns for representation units 1 to 30, with and without sparse coding, at epoch 50]

Page 27: A Bayesian Approach to the Reading Process: From Networks to Human Data

What We Have Learned

• In the unsupervised framework, the Bayesian Generative Network is able to learn the alphabet.

• The representations are not necessarily the same as in the IA model:
  – distributed (not localist)
  – redundant (features are coded several times)

• Having learned the letters, can we now learn words?

Page 28: A Bayesian Approach to the Reading Process: From Networks to Human Data

Step 2: Learning Words

• The second stage of the IA model is the mapping from letters to words.

• We are interested in how the Bayesian framework accounts for the development of contextual regularities (i.e., letters within words).

• We also look at participants' learning of context.

Page 29: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experimental Motivation

• Our motivation for the current experiments is the word-superiority effect.

• Specifically, we draw inspiration from the Reicher-Wheeler paradigm.

[Figure: example Reicher-Wheeler trial sequences: a fixation cross, a briefly presented string (e.g., READ, GLUR, or the mask KQZW), and a probe of a single letter position (e.g., --Z-, --S-, ---P, ---R, -E--, -O--)]

Page 30: A Bayesian Approach to the Reading Process: From Networks to Human Data

The Task

• The current set of studies was designed to simulate how the word-superiority effect may develop. Specifically, we were interested in:
  – the learning of novel, letter-like stimuli
  – whether stimuli were learned in parts or as wholes
  – the effects of context on learning

• Consequently, we created an artificial environment in which we tightly controlled context.

Page 31: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experimental Design: Training

• The Reicher-Wheeler task is based on the discrimination between two characters.

• We wanted a similar task in which context would interact with a character pair.

[Figure: the two stimulus sets; in set A, option o1 = (a, b, c) and option o2 = (d, e, f) across positions p1, p2, p3; in set B, o1 = (g, h, i) and o2 = (j, k, l)]

Page 32: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experimental Design: Testing

• Training: 16 stimuli in total, with the task of detecting a change between strings
  – set A strings drawn from {a, d} × {b, e} × {c, f}
  – set B strings drawn from {g, j} × {h, k} × {i, l}

• Testing: 288 stimuli
  – 96 Familiar stimuli (AAA/BBB; e.g., a e c, g k l)
  – 96 Crossed stimuli (BAA/ABB; e.g., j e c, g k f)
  – 96 Novel stimuli (CAA/CBB; e.g., a e r, g n l)
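For concreteness, the 16 training strings implied by this design can be enumerated directly (position-to-character assignment as reconstructed above):

```python
from itertools import product

# Set A: positions draw from {a,d}, {b,e}, {c,f}; set B from {g,j}, {h,k}, {i,l}.
set_a = ["".join(s) for s in product("ad", "be", "cf")]
set_b = ["".join(s) for s in product("gj", "hk", "il")]
print(set_a)                    # ['abc', 'abf', 'aec', 'aef', 'dbc', 'dbf', 'dec', 'def']
print(set_b)                    # ['ghi', 'ghl', 'gki', 'gkl', 'jhi', 'jhl', 'jki', 'jkl']
print(len(set_a) + len(set_b))  # 16 training stimuli in total
```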

Page 33: A Bayesian Approach to the Reading Process: From Networks to Human Data

Stimuli

• Characters were constructed from the RS (Rumelhart & Siple) features.

• Each character had six line segments, with the following constraints:
  – characters were continuous
  – no two segments formed a straight line
  – no character was a mirror image or rotation of another

[Figure: the set A and set B characters laid out by position (p1, p2, p3) and option (o1, o2)]

Page 34: A Bayesian Approach to the Reading Process: From Networks to Human Data

Initial Simulations

[Diagram: network input of three characters over a 48-unit surface layer P(D), with an 18-unit mediating layer and a 16-unit representation layer P(H)]

Page 35: A Bayesian Approach to the Reading Process: From Networks to Human Data

Initial Simulations

• Performance was measured by computing a "differentiation value" based on the difference between the generated surface-layer representation (G_i) and the target representation (T_i):

$$\text{diff} = \prod_{i=1}^{n} \bigl(1 - \lvert P(T_i) - P(G_i) \rvert \bigr)$$
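In code, the measure above is a one-liner; note that with 48 surface units even modest per-unit mismatches drive the product toward zero, which is why the next slide plots it on a log scale:

```python
import numpy as np

def differentiation(target, generated):
    """Product over surface units of (1 - |P(T_i) - P(G_i)|), following the
    formula above; 1.0 means the generated pattern matches the target exactly."""
    t = np.asarray(target, dtype=float)
    g = np.asarray(generated, dtype=float)
    return float(np.prod(1.0 - np.abs(t - g)))

print(differentiation([1, 0, 1], [0.9, 0.1, 0.8]))  # 0.9 * 0.9 * 0.8 = 0.648
```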

Page 36: A Bayesian Approach to the Reading Process: From Networks to Human Data

Initial Simulation Results

[Chart: differentiation value (log scale, 10^-24 to 10^0) for familiar, crossed, and novel stimuli across network architectures (2, 2wt, 3, 3wt, 3sp, 3sp/wt)]

Page 37: A Bayesian Approach to the Reading Process: From Networks to Human Data

Simulation Conclusions

• Regardless of the network architecture, all simulations showed a (slight) difference between the familiar and crossed stimuli.

• No simulation performed well on the novel stimuli in comparison to the other stimuli.

• These results are somewhat counter to what we expected.

• Is the model broken?

• How do participants perform on this task?

Page 38: A Bayesian Approach to the Reading Process: From Networks to Human Data

Stimulus Presentation

[Figure: trial sequence with display durations of 500 ms, 250 ms, 200 ms, 250 ms, 200 ms, and 50 ms]

Page 39: A Bayesian Approach to the Reading Process: From Networks to Human Data

Stimulus Presentation

Page 40: A Bayesian Approach to the Reading Process: From Networks to Human Data

Data Analysis

• Each participant's reaction time and proportion of "hits" and "correct rejections" were recorded.

• To correct for potential response biases, the scores were converted to d' scores:

  Detect change?    Response "Yes"      Response "No"
  Stimuli differ    Hit                 Miss
  Stimuli same      False Alarm (FA)    Correct Rejection (CR)

$$d' = \Phi^{-1}\bigl(P(\text{Hit})\bigr) + \Phi^{-1}\bigl(P(\text{CR})\bigr)$$

where Φ^{-1} is the inverse of the standard normal distribution function (equivalently, d' = z(Hit) - z(FA), since z(CR) = -z(FA)).
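In code, using scipy's inverse normal CDF (norm.ppf):

```python
from scipy.stats import norm

def d_prime(hit_rate, cr_rate):
    """d' = z(hit rate) + z(correct-rejection rate).
    Equivalent to the textbook z(Hit) - z(FA), since z(CR) = -z(FA)."""
    return norm.ppf(hit_rate) + norm.ppf(cr_rate)

print(round(d_prime(0.80, 0.70), 2))  # 1.37
```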

Page 41: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experiment 1: One Novel

• 4 participants, 10 days each

• 1440 trials per day:
  – 288 test trials intermixed with 1152 training trials

• Three conditions:
  – Familiar (AAA or BBB)
  – Crossed (BAA or ABB)
  – Novel (CAA or CBB)

Page 42: A Bayesian Approach to the Reading Process: From Networks to Human Data

d' Scores

[Chart: d' scores (-1 to 2) over days 1 to 10 for the familiar, crossed, and novel conditions]

Page 43: A Bayesian Approach to the Reading Process: From Networks to Human Data

Reporting Changes

[Chart: proportion of "change" responses (0 to 1) over days 1 to 10, plotting hits and false alarms for the familiar, crossed, and novel conditions]

Page 44: A Bayesian Approach to the Reading Process: From Networks to Human Data

Reaction Times

[Chart: reaction times (0 to 500 ms) over days 1 to 10 for "change" (C) and "same" (S) trials in the familiar, crossed, and novel conditions]

Page 45: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experiment Conclusions

• Although there is a context effect, it is not as large as we expected, nor as stable.

• There are no significant differences in reaction times for any of the conditions.

• Participants do not perform well in the Novel condition:
  – this is due to a tendency to respond "Change" to all novel stimuli

Page 46: A Bayesian Approach to the Reading Process: From Networks to Human Data

Re-Simulation of Task

• The network was trained on the same data set that the participants were trained on.

• The network learned on all training/testing trials.

• We wanted a similar measure for network performance, so we used a variant of the Kullback-Leibler divergence measure:

$$KL = \sum_{i=1}^{n} \left[\, g(y_i)\,\log\frac{g(y_i)}{f(y_i)} + \bigl(1 - g(y_i)\bigr)\,\log\frac{1 - g(y_i)}{1 - f(y_i)} \,\right]$$
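A direct transcription of the measure above, clipping activities away from 0 and 1 to keep the logarithms finite:

```python
import numpy as np

def kl_measure(g, f, eps=1e-12):
    """Summed binary KL divergence between generated unit activities g(y_i)
    and comparison activities f(y_i)."""
    g = np.clip(np.asarray(g, dtype=float), eps, 1.0 - eps)
    f = np.clip(np.asarray(f, dtype=float), eps, 1.0 - eps)
    return float(np.sum(g * np.log(g / f) + (1.0 - g) * np.log((1.0 - g) / (1.0 - f))))
```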

Page 47: A Bayesian Approach to the Reading Process: From Networks to Human Data

Simulation: Difference Measure

[Chart: K-L difference measure (0 to 60) over network "days" 1 to 10 for familiar, crossed, and novel stimuli]

Page 48: A Bayesian Approach to the Reading Process: From Networks to Human Data

Simulation: Report Change?

[Chart: K-L measure (0 to 80) over network "days" 1 to 10, plotting hit and false-alarm analogues for the familiar, crossed, and novel conditions]

Page 49: A Bayesian Approach to the Reading Process: From Networks to Human Data

Internal Representations

• If we look at the internal representations formed by the network, we get an idea of why it behaves as it does...

[Figure: weight patterns for representation units 1 to 18 after training "day" 1]

Page 50: A Bayesian Approach to the Reading Process: From Networks to Human Data

Internal Representations

[Figure: weight patterns for representation units 1 to 18 after training "day" 6]

Page 51: A Bayesian Approach to the Reading Process: From Networks to Human Data

Internal Representations

[Figure: weight patterns for representation units 1 to 18 after training "day" 10]

Page 52: A Bayesian Approach to the Reading Process: From Networks to Human Data

Simulation Conclusions

• The Bayesian Generative Network qualitatively matched the performance of the participants.

• Furthermore, analysis of the internal structure of the network offers an explanation for the participants' behaviour:
  – The network failed to learn to represent the novel items.
  – Thus, if the first generated representation is garbage, and the second generated representation is garbage, then the comparison will be garbage, and the network defaults to responding "change".

Page 53: A Bayesian Approach to the Reading Process: From Networks to Human Data

Assessing Representations

• The models predicted that participants in the one-novel condition would fail to learn to represent the novel items.

• Unfortunately, we can't open up a person to see what their internal representations are.

• We can, however, ask them:
  – Specifically, we can test their recognition of "novel" items following training and compare these to truly new items.

Page 54: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experiment 2

• 10 participants

• Participants were trained on the same data as in Experiment 1, but for only 2 days.

• At the conclusion of training, participants were given a "new/old" task in which they saw the 12 old training items, the 6 old novel items, and 12 new items.

• Participants saw a single character and made the judgement "old" or "new".

Page 55: A Bayesian Approach to the Reading Process: From Networks to Human Data

Experiment 2: Results

• Participants were about 70% correct at detecting "Old" items.

• Participants were no better at recognizing old "Novel" items than truly "New" items.

[Chart: proportion of "old" responses for old, novel, and new stimuli]

Page 56: A Bayesian Approach to the Reading Process: From Networks to Human Data

Learning Context

• The Bayesian Generative Network is able to learn higher order information such as which characters appear in which positions.

• It is able to both simulate and explain the performance of participants trained on a contextual learning task.

• It is able to predict new findings!

• Can we expand the model?

Page 57: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Connections

• The Bayesian Approach to Cognitive Neuroscience
  – How do we represent the world?
  – Bayesian Connectionist Framework.

• Bayesian Generative Networks
  – Learning letters.
  – How does context affect learning?
  – Empirical and Simulation Results.

• Symmetric Diffusion Networks
  – The Ambiguity Advantage/Disadvantage.

• Closing Remarks

Page 58: A Bayesian Approach to the Reading Process: From Networks to Human Data

Symmetric Diffusion Network

• Symmetric Diffusion Networks (SDNs) are a class of networks that explicitly embody many of the implicit assumptions made by the Bayesian Generative Network.

• SDNs can be viewed as a more general form of the Bayesian Generative Network.
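The talk does not give the SDN equations, but the core idea, stochastic settling in a symmetric network, can be sketched as noisy descent on an energy function; with two stored patterns (two "meanings"), each run settles into one attractor or the other:

```python
import numpy as np

rng = np.random.default_rng(1)

def sdn_settle(W, x0, steps=500, dt=0.05, noise=0.2):
    """Illustrative Langevin-style settling in a symmetric network
    (a sketch of the idea, not the talk's exact SDN formulation).
    Energy E(x) = -0.5 * x.T W x; symmetric W makes E well defined."""
    x = x0.copy()
    for _ in range(steps):
        drift = W @ x                              # -dE/dx
        x += dt * drift + np.sqrt(dt) * noise * rng.normal(size=x.shape)
        x = np.clip(x, -1.0, 1.0)                  # keep states bounded
    return x

# Store two patterns Hebbian-style; they become the network's attractors.
patterns = np.array([[1.0, 1.0, -1.0, -1.0],
                     [-1.0, 1.0, 1.0, -1.0]])
W = patterns.T @ patterns / patterns.shape[1]
np.fill_diagonal(W, 0.0)
x_final = sdn_settle(W, np.zeros(4))
print(np.sign(x_final))  # typically matches one stored pattern (or its negation)
```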

Page 59: A Bayesian Approach to the Reading Process: From Networks to Human Data

Symmetric Diffusion NetworkSymmetric Diffusion Network

Representation LayerP(H)

Mediating Layer

P(D) Surface Layer


Page 63: A Bayesian Approach to the Reading Process: From Networks to Human Data

SDN Representation

• One advantage of the SDN is that it is able to learn continuous probability distributions.

• That is, it can learn multiple representations for the same input data:
  – bat ("flying mammal") vs. bat ("wooden stick")

• The SDN can be used to address the ambiguity paradox.

Page 64: A Bayesian Approach to the Reading Process: From Networks to Human Data

The Ambiguity Paradox

[Chart: reaction times (500 to 850 ms) for unambiguous (e.g., chance), ambiguous (e.g., charge), and non-word (e.g., chathe, thake) stimuli in the lexical task ("Is it a word?") and the semantic task ("Is it related? e.g., feel-luck")]

Page 65: A Bayesian Approach to the Reading Process: From Networks to Human Data

One Possible Explanation

• "Efficient then Inefficient" Hypothesis (Piercey & Joordens, 2000)
  – Efficient: the ambiguity advantage results from a "blend" state.
  – Inefficient: the ambiguity disadvantage occurs in relatedness judgements because it takes longer to settle into a correct meaning.

Page 66: A Bayesian Approach to the Reading Process: From Networks to Human Data

An Alternative Explanation

• The Symmetric Diffusion Network offers an alternative explanation.

• Lexical Decisions are faster (on average) because there are more “semantic attractors” to fall into.

• Semantic Decisions are slower (on average) because on some proportion of the trials, one must move between “attractors”.

Page 67: A Bayesian Approach to the Reading Process: From Networks to Human Data

Preliminary Conclusions

• Symmetric Diffusion Networks are able to learn ambiguous meanings (in contrast to other models).

• The SDN provides a plausible theory for the ambiguity paradox.

• It suggests new empirical studies.

• Larger network simulations are underway.

Page 68: A Bayesian Approach to the Reading Process: From Networks to Human Data

Bayesian Connections

• The Bayesian Approach to Cognitive Neuroscience
  – How do we represent the world?
  – Bayesian Connectionist Framework.

• Bayesian Generative Networks
  – Learning letters.
  – How does context affect learning?
  – Empirical and Simulation Results.

• Symmetric Diffusion Networks
  – The Ambiguity Advantage/Disadvantage.

• Closing Remarks

Page 69: A Bayesian Approach to the Reading Process: From Networks to Human Data

What have we learned?

• Introduced a class of connectionist networks that embody Bayesian principles.

• Using the IA model as inspiration, we:
  – compared the learned letter representations with the hard-coded representations
  – simulated, explained, and predicted empirical data on context learning
  – addressed the ambiguity paradox

Page 70: A Bayesian Approach to the Reading Process: From Networks to Human Data

The Next 20+ Years

• Explore the Bayesian framework and how it relates to connectionism more fully.
  – Continue research on learning and how it interacts with the IA model and aspects of the reading process.

• Make links to neurophysiology:
  – Lexicon vs. semantic representations? (one area or two?)

Page 71: A Bayesian Approach to the Reading Process: From Networks to Human Data

The "Take Home" Message

• We are able to effectively model aspects of the reading process with connectionist networks embodying Bayesian principles!

• These networks are able to qualitatively simulate observed data.

• These networks are able to predict new findings.

• Using very simple principles, these networks offer plausible explanations for a range of behaviours.

Page 72: A Bayesian Approach to the Reading Process: From Networks to Human Data

Acknowledgements

Jay McClelland
Michael Lewicki
Tai Sing Lee
Michael Harm
David Noelle
Chris Kello
Darren Piercey

Page 73: A Bayesian Approach to the Reading Process: From Networks to Human Data
Page 74: A Bayesian Approach to the Reading Process: From Networks to Human Data

Ambiguity Advantage

[Diagram: "semantic space" containing three meanings of an ambiguous word: "a measure of how likely it is that some event will occur", "a financial liability", and "a pleading describing some wrong or offense"; the unambiguous probe "chance" settles on the single matching meaning]

Page 75: A Bayesian Approach to the Reading Process: From Networks to Human Data

Ambiguity Advantage

[Diagram: the same semantic space; the unambiguous probe "complaint" settles on "a pleading describing some wrong or offense"]

Page 76: A Bayesian Approach to the Reading Process: From Networks to Human Data

Ambiguity Advantage

[Diagram: the same semantic space; the unambiguous probe "tax" settles on "a financial liability"]

Page 77: A Bayesian Approach to the Reading Process: From Networks to Human Data

Ambiguity Advantage

[Diagram: the same semantic space; the ambiguous word "charge" lies near all three meanings at once: the probability, liability, and pleading senses]


Page 79: A Bayesian Approach to the Reading Process: From Networks to Human Data

Ambiguity Disadvantage

[Diagram: in a relatedness judgement, the ambiguous word "charge" must move between meanings in semantic space, for example from the probability sense toward the pleading sense matching "complaint"]