8

Click here to load reader

Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

Embed Size (px)

Citation preview

Page 1: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

11

Connectionist models of language deficits

Fiona Richardson

…remember Dell et al. (1997)

2-step theory of lexical retrieval

Interactive activation model

Model performance and patient data

‘lesioning’ the model

Matching model performance to data

Modelling…what’s the point?

Parameterise, develop, and test theories

Try to explain ‘why’?

Generate predictions

Overview…

Part 1, the principlesDissociations, modularity and functional specialisation

Part 2, in practiceThe connectionist models

Part 1: Dissociations and modularity

Shallice (1988, p248):

“If modules exist then…double dissociations are a relatively reliable way of uncovering them. Double dissociations exist, therefore modules exist”

but…

Page 2: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

22

Delusions about dissociations…?

Shallice describes a number of non-modular systems that could produce DDs.

Modules…?

Shallice (1988, p249):

“the idea that the existence of a double dissociation necessarily implies that the overall system has separate subcomponents can no longer be taken for granted”

“functional specialisation”

Shallice (1988): “functional specialisation” is a more appropriate inference from dissociations in patients

But the dimensions on which behavioural dissociations are based may not be a direct reflection of the function responsible for specialisation

If so, the degree of specialisation may not be a useful guide to system architecture and functional organisation

Dissociations and modules

Sartori (1988): Components in a fully serial architecture can only produce single dissociations . Some contribution of parallel organisation is required for double dissociations.

X1

X2

X1

X2 X3X1 X2

Parallel Distributed Processing

Double dissociations without specialisation?

Suppose the existence of three elements x, y, and z (which may be, for example, groups of neurons)

Task A performed with levels of activation

x=80% y=20% z=80%

Task B performed with levels of activation

x=20% y=80% z=20%

Dissociations in a distributed memory (Wood, 1977)

Input VectorsA

1

1

0

1

B

0

1

1

0

10 5 2 6 19 6 5 8 28 13 2 13 36 2 8 10 4

Matrixrow

24 14 16 20 Threshold

A’ 1 0 0 1B’ 0 1 0 1

Output vectors

1

B correct but not A

46

A correct but not B

Page 3: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

33

Part 2: Connectionist modelling

Abstract modelExpand range of candidate inferences from behaviour to structureThis kind of behaviour can be produced by these kinds of system

Specific modelsTheoretically motivated – test hypothesesUse empirically driven constraints

A little bit about lesioning…Degrading the signalRemoval of connections or unitsLesions may be focal or distributed

Input

Output

weights

weights

Hidden Units

Focal Diffuse

You can also…Add noiseChange processing unit discriminability

Model topics

Past tense debate Acquired deficitsDevelopmental deficitsEmergent modularityDouble dissociations with modularityConnectionism and modularity

English past tense formation

A “quasi-regular” domain

Regular: TALK - TALKEDIrregular: THINK - THOUGHT, HIT - HITRule: WUG - WUGGED

The past tense debate

Output pasttense

Blocking

Listing of exceptions/ associationistnetwork

Regularroute

Input Stem

Dual Mechanismmodel of past tenseformation (Pinker,1991, 1994)

Pinker / Chomsky Rumelhart & McClelland

Dissociations

WSSLIDevelopmental

Posterior lesionsAlzheimer’s D.

Frontal lesionsParkinson’s D.

Acquired

IrregularRegular

…but why?

Page 4: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

44

Modified modelsOutput past

tense

Blocking

Listing of exceptions/ associationistnetwork

Regularroute

Input Stem

Dual Mechanismmodel of past tenseformation (Pinker,1991, 1994)

Morphology

Semantics

Phonology

Groups of artificial neurons

Connections between groups

g

Multiple phonological rules

Associative network stores both regular and irregular forms based on frequency and can generalise

Distinction between phonology and word-specific information

Multiple inflection all paradigms (nouns, verbs)

Acquired past tense deficitsJoanisse and Seidenberg (1999)

speaking

hearing

repeating

transformation

Phonological deficits

** **

all word types affectedpoorest performance on nonwords

Semantic lesions

** **

all word types affectedpoorest performance on irregulars

Beware of models where developmental disorders are being used in a directly analogous fashion to acquired deficits

Is this really appropriate?

Developing systems are dynamic

A developing model is a training model

Stages as well as end performance may differ

Modelling developmental language deficits Specific Language Impairment

Deficit in regular inflection in SLI and frequency effects for regular verbs

Ullman and Pierpoint (2006): developmental deficit to a system specialised for grammar (procedural memory system)

Thomas (in press): same data can arise from a deficit to a processing resource common to regulars and irregulars

Page 5: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

55

The model

Thomas (in press)Thomas and Karmiloff-Smith (2003)

SEMANTICS VERB STEM PHONOLOGY

PAST TENSE PHONOLOGY

HIDDEN UNITS

/send/

/sent/Alter initial level of

processing unit discriminability

Input to Processing Unit-4.0 0.0 +4.0

0.0

0.2

0.4

0.6

0.8

1.0

Uni

t Out

put

T=1.0 (Normal)T=4.0 (Low discrimination)T=0.25 (High discrimination)

Unit activation function 'Temperatures' (T)

Transfer functions and category boundaries

Cliffs Cliffs –– sharp category boundary, sharp category boundary, good for rulegood for rule--like distinctionslike distinctions

Slopes Slopes –– broad category boundary, broad category boundary, good for finegood for fine--grained distinctionsgrained distinctions

The results

Thomas (in press)

Perception theory: impaired speech perception which affects the use of phonological information in working memory, which in turnleads to poor syntactic comprehension

Another cause of SLIJoanisse (2000), Joanisse and Seidenberg (2003)

Morphology

Semantics

Phonology

Groups of artificial neurons

Connections between groups

g

The data Emergent modularityFunctional specialisation may be an outcome of a developmental process rather than a precursor – i.e., no innate modularity

Brain activity in early infancy – less localised, less specialised for particular stimuli

Input layer (90 units)

Output layer (100 units)

(20 units)DirectRoute

IndirectRoute

Thomas & Karmiloff-Smith (2002)

Page 6: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

66

Emergent modularity and the past tense

i.e. for past tense, regulars and irregulars can (partially) specialise to different parts of the system across training due to structure-function correspondences

Performance during training

0%

20%

40%

60%

80%

100%

Regular EP1 EP2 EP3f Rule

Pattern Type

Perc

enta

ge C

orre

ct

Index of Specialisation(Differential impairment caused by single route le sion)

-60%

-40%

-20%

0%

20%

40%

60%

80%

Regular E P1 EP 2 EP 3f Rule

Pattern T ype

Spec

ialis

atio

n to

Dire

ct ro

ute

EP1: no change (i.e. hitEP1: no change (i.e. hit--hit)hit)

EP2: vowel change (i.e. hideEP2: vowel change (i.e. hide--hid)hid)

EP3f: arbitrary (i.e. goEP3f: arbitrary (i.e. go--went went higher token frequencyhigher token frequency))

Thomas & Karmiloff-Smith (2002)

“Double dissociation without modularity” in a reading model: Plaut and Shallice (1993)

There is a double dissociation between concrete and abstract word reading

Most researchers believe that skilled readers rely almost exclusively on the phonological route

Only in cases where this route is inoperative as as in deep and phonological dyslexia, are strong semantic effects such as concreteness observed

Patient CAV exhibited better performance on abstract words (partial reliance on semantic route)

Deep dyslexia

The hallmark: semantic errorsi.e. reading CAT as “dog”

Also…Visual errors: CAT -> cotMixed errors: CAT -> ratMorphological: GOES -> go

The Plaut and Shallice reading model (1993)

An “attractor” network

Lesioned normal model by randomly removing connections

LocationsSeverity

Lesion location and behaviour Lesion severity and behaviour

Page 7: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

77

Functional specialisation

Different parts of the network have a different function: pointers and valleys

Abstract words are assumed to have fewer semantic features (sparser semantic neighbourhood)

Concrete relies more on valleys, abstract more on pointers

Attractor space: pointers and valleys

attractorattractor

basinbasin

Implications: Plaut (1995)“…Both pathways are involved in processing both types of words. However, they make different contributions the course of this processing…”

The direct pathway generates an initial approximation of the semanticsThese are refined by the clean-up pathway

Functional specialisation: this exists in the network “but does not directly correspond to the observed behavioural effects under damage (abstract vs. concrete words)”

“[Regarding Averaged vs. Rare lesions]… the occasional lesion of each type may produce effects that are exactly opposite to those produced by most quantitatively equivalent lesions

The observation of a double dissociation does not even indicate functional specialisation, as Shallice (1988) suggests, for how can the same portion of a mechanism be “specialised” in two different ways?”

NoteConcrete-Abstract double dissociation appears to violate Sartori’s (1988) argument that double dissociations cannot arise from serial stages

Clean-up pathway appears to follow Direct pathway in a serial fashion

Either parallel processing permits this, or the two parts of network do not conform to independent stages (see McClelland, 1979)

Connectionism and modularity

Connectionist approaches to cognitive processing are current highly modular, indeed consistent with innate modularity

the “past tense network”

Yet connectionism typically associated withequipotentiality - whole brain is a homogeneous distributed network

But…Even single networks aren’t equipotential - the way you damage the processing resource affects the single dissociations produced

Diffuse

Focal Dissociations from Focal vs. Diffuse damage

0%

20%

40%

60%

80%

100%

Focal Diffuse Focal Diffuse

10/40 40/160

% C

orre

ct a

fter l

esio

n

RegularEP3f

Page 8: Connectionist models of language deficits€¦ · Connectionist models of language deficits Fiona Richardson …remember Dell et al. (1997) ... processing resource common to regulars

88

And…“A mixture of experts”: specialised systems can emerge from slight computational differences in processing components + initial biases of connectivity + experience-driven competition

Different tasks may recruit different sub-sets of experts, some only oneMany processing principles may be domain-general

This research is at an early stage in developmental cognitive neuroscience

Conclusions (1): modelling

Models can be used to explore “how”and “why” behavioural dissociations emerge

Theory development tool

Conclusions (2): behaviour and modularity

Double dissociations do not necessarily imply a modular system

Dissociations imply functional specialisation

Distributed systems may show chance unusual dissociations – not clear whether this is an artefact of model simplifications

Specialisation of function may be emergent

END

Questions?