Tom Griffiths
CogSci 131 Models of categorization
Spaces and features
• Will show up in many contexts in this class:
 – similarity
 – semantic representation
 – categorization
 – neural networks
• How can we use these representations?
Categorization
Outline
• Prototype and exemplar theories
• Break
• Testing the theories
How can we explain typicality?
• One answer: reject definitions, and have a new representation for categories
• Prototype theory:
 – categories are represented by a prototype
 – other members share a family resemblance relation to the prototype
 – typicality is a function of similarity to the prototype
Family resemblance
[Figure: category members arranged around a central prototype, illustrating family resemblance (Posner & Keele, 1968)]
Posner and Keele (1968)
• Prototype effect in categorization accuracy
• Constructed categories by perturbing prototypical dot arrays
• Ordering of categorization accuracy at test:
 – old exemplars
 – prototypes
 – new exemplars
Formalizing prototype theories
Representation: each category (e.g., A, B) has a corresponding prototype (µ_A, µ_B)
Categorization (for a new stimulus x): choose the category whose prototype is closest to x (minimum distance, or equivalently maximum similarity)
(e.g., Reed, 1972)
Formalizing prototype theories
Prototype is the most frequent or “typical” member

Spaces:
 – Prototype: e.g., average of members of the category
 – Distance: e.g., Euclidean distance
(Binary) Features:
 – Prototype: e.g., binary vector with the most frequent feature values
 – Distance: e.g., Hamming distance
 d(x, µ_A) = [ Σ_k (x_k − µ_{A,k})² ]^(1/2)   (Euclidean distance)

 d(x, µ_A) = Σ_k |x_k − µ_{A,k}|   (Hamming distance, for binary features)
Distances
Euclidean distance: straight-line distance between points in a space
Hamming distance: number of mismatched positions between binary vectors, e.g.,
 01100100111
 01110100101
differ in 2 positions
Formalizing prototype theories
[Figure: Category A and Category B point clouds with prototypes at the category means; the decision boundary at equal distance is always a straight line for two categories]
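The prototype rule on this slide can be sketched in a few lines of Python: average each category's members to get its prototype, then assign a new stimulus to the nearest one. The data and function names here are my own illustration, not from the lecture.

```python
import math

def prototype(category):
    """Component-wise mean of a list of points (the category prototype)."""
    n = len(category)
    return [sum(x[k] for x in category) / n for k in range(len(category[0]))]

def euclidean(x, y):
    """Euclidean distance d(x, y) = sqrt(sum_k (x_k - y_k)^2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def classify(x, prototypes):
    """Choose the category whose prototype is closest to x."""
    return min(prototypes, key=lambda label: euclidean(x, prototypes[label]))

# Toy 2-D categories (assumed for illustration)
A = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
B = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
protos = {"A": prototype(A), "B": prototype(B)}

print(classify([2.0, 2.0], protos))  # near A's prototype
```

With two categories and a plain Euclidean distance, the set of points equidistant from the two prototypes is the straight decision boundary shown on the slide.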
More complex prototypes
• Various extensions to simple prototype models have been explored…
• For features: configural cue models
 – compound features, such as “red and small”
 – results in combinatorial explosion
• For spaces: prototype models that incorporate information about variance
 – category-specific measures of distance
More complex prototypes
[Figure: Category A and Category B with prototypes at the category means; with category-specific distance measures, the decision boundary at equal distance is no longer a straight line]
Boundaries are conic sections (parabolas, ellipses, and hyperbolas)
Predicting prototype effects
• Prototype effects are built into the model:
 – assume categorization becomes easier as proximity to the prototype increases…
 – …or as distance from the boundary increases
• But what about the old-exemplar advantage? (Posner & Keele, 1968)
• Prototype models are not the only way to get prototype effects…
Exemplar theories
Store every member (“exemplar”) of the family
Formalizing exemplar theories
Representation: a set of stored exemplars y_1, y_2, …, y_n, each with its own category label
Categorization (for a new stimulus x): choose category A with probability
 P(A|x) = β_A Σ_{y∈A} η_xy / ( β_A Σ_{y∈A} η_xy + β_B Σ_{y∈B} η_xy )

“Luce-Shepard choice rule”
η_xy is the similarity of x to y
β_A is the bias towards A
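The Luce-Shepard choice rule can be sketched directly from the formula: similarity-weighted evidence for A, divided by the total for both categories. The similarity values and function name below are hypothetical inputs for illustration.

```python
# Luce-Shepard choice rule for two categories:
# P(A|x) = beta_A * sum(eta_xy for y in A) /
#          (beta_A * sum_A + beta_B * sum_B)
def p_choose_A(sims_to_A, sims_to_B, beta_A=1.0, beta_B=1.0):
    """sims_to_A/sims_to_B: similarities of x to each stored exemplar."""
    evidence_A = beta_A * sum(sims_to_A)
    evidence_B = beta_B * sum(sims_to_B)
    return evidence_A / (evidence_A + evidence_B)

# x is very similar to A's exemplars, less so to B's (made-up numbers)
print(p_choose_A([0.9, 0.8], [0.1, 0.2]))  # 0.85
```

Note the rule only needs similarities η_xy and biases β; how η_xy is computed is what the context model and the GCM below fill in.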
The context model (Medin & Schaffer, 1978)
Defined for stimuli with binary features (color, form, size, number)
 1111 = (red, triangle, big, one)
 0000 = (green, circle, small, two)
Define similarity as the product of similarity on each dimension:

 η_xy = Π_k η_xyk

 η_xyk = 1 if x_k = y_k, s_k otherwise
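The context model's multiplicative similarity is easy to sketch: matching dimensions contribute 1, mismatching dimensions contribute their parameter s_k. The s values below are assumed for illustration.

```python
# Context model similarity: product over dimensions of
# 1 (feature match) or s_k (feature mismatch), 0 < s_k < 1.
def context_similarity(x, y, s):
    eta = 1.0
    for xk, yk, sk in zip(x, y, s):
        if xk != yk:
            eta *= sk
    return eta

s = [0.2, 0.2, 0.2, 0.2]  # one mismatch parameter per dimension (assumed)
print(context_similarity([1, 1, 1, 1], [1, 1, 1, 1], s))  # identical: 1.0
print(context_similarity([1, 1, 1, 1], [0, 0, 1, 1], s))  # two mismatches: 0.04
```

Because mismatches multiply, similarity falls off sharply with the number of differing features, which is what gives the model its exemplar-specific predictions.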
Prototypes vs. exemplars
• Exemplar models produce prototype effects
 – if the prototype minimizes distance to all exemplars in a category, then it has high probability
• Also predict the old-exemplar advantage
 – being close (or identical) to an old exemplar of the category gives high probability
• Predict new effects prototype models cannot produce…
 – stimuli close to an old exemplar should have high probability, even far from the prototype
Break
Up next: Testing the theories
The 5-4 category structure (Medin & Schaffer, 1978)
[Figure: the Category A and Category B stimuli with their prototypes, and each stimulus's distances d(x, µ_A) and d(x, µ_B) to the two prototypes]
Prototype model: P(A|“4”) > P(A|“7”)
Exemplar model: P(A|“4”) < P(A|“7”)
(where “4” and “7” label two of the stimuli)
The generalized context model (Nosofsky, 1986)
Defined for stimuli in psychological space

 η_xy = exp{ −c·d(x, y)^p }

where
 c is the “specificity”
 p = 1 gives an exponential, p = 2 a Gaussian

and, as before,

 P(A|x) = β_A Σ_{y∈A} η_xy / ( β_A Σ_{y∈A} η_xy + β_B Σ_{y∈B} η_xy )
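The GCM's similarity function is a one-liner once a distance is available; the parameter values below are illustrative only.

```python
import math

# GCM similarity: eta_xy = exp(-c * d(x, y)**p)
# c is the specificity; p = 1 gives exponential decay, p = 2 Gaussian.
def gcm_similarity(d, c=1.0, p=1):
    return math.exp(-c * d ** p)

print(gcm_similarity(0.0))             # identical stimuli: similarity 1.0
print(gcm_similarity(2.0, c=1.0, p=1)) # exponential decay with distance
print(gcm_similarity(2.0, c=1.0, p=2)) # Gaussian: falls off faster at d > 1
```

Plugging these η values into the Luce-Shepard choice rule gives the full model: each stored exemplar pulls the response toward its own category in proportion to its similarity to x.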
The generalized context model
[Figure: Category A and Category B exemplars; the decision boundary is determined by the exemplars, with response regions of 90% A, 50% A, and 10% A]
Prototypes vs. exemplars
Exemplar models can capture complex boundaries
Bells and whistles: distance metrics
The “weighted Minkowski r metric”:

 d(x, y) = [ Σ_k w_k |x_k − y_k|^r ]^(1/r)

where
 r determines the metric (Euclidean or city-block)
 w_k is the weight of dimension k (reflects attention)
Using different metrics
• r = 2: Euclidean distance
 – “integral” dimensions, e.g., saturation & brightness
• r = 1: city-block distance
 – “separable” dimensions, e.g., size & shape
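The weighted Minkowski metric is also a direct transcription of the formula; the points and weights below are made up to show the r = 2 vs. r = 1 difference.

```python
# Weighted Minkowski r metric:
# d(x, y) = (sum_k w_k * |x_k - y_k|**r) ** (1/r)
# r = 2 gives Euclidean distance, r = 1 city-block distance;
# w_k are attention weights on the dimensions.
def minkowski(x, y, w, r):
    return sum(wk * abs(xk - yk) ** r
               for wk, xk, yk in zip(w, x, y)) ** (1.0 / r)

x, y, w = [0.0, 0.0], [3.0, 4.0], [1.0, 1.0]
print(minkowski(x, y, w, 2))  # Euclidean: 5.0
print(minkowski(x, y, w, 1))  # city-block: 7.0
```

Raising a weight w_k stretches dimension k, which is exactly the dimensional-attention mechanism described on the next slide.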
Dimensional attention
Allows rescaling of dimensions to aid in categorization
(similar to capturing the variance of a category)
Evaluating models
• Both prototype and exemplar models have lots of free parameters
 – prototype locations
 – response biases, attention weights, r, p, c
• Testing the models typically involves finding the best-fitting values of the parameters
 – generic optimization methods (like gradient descent) are usually used to do this
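To make the fitting step concrete, here is a minimal sketch of fitting one GCM parameter (the specificity c) to observed choice probabilities. The slide mentions generic optimizers such as gradient descent; a simple grid search stands in here, and the distances and observed probabilities are entirely made up.

```python
import math

# Predicted P(A|x) from exemplar distances, exponential similarity (p = 1),
# equal biases: a special case of the GCM + Luce-Shepard rule.
def predicted_p_A(c, d_to_A, d_to_B):
    sA = sum(math.exp(-c * d) for d in d_to_A)
    sB = sum(math.exp(-c * d) for d in d_to_B)
    return sA / (sA + sB)

def fit_c(data, grid):
    """data: list of (dists_to_A, dists_to_B, observed P(A)).
    Return the grid value of c minimizing squared prediction error."""
    def sse(c):
        return sum((predicted_p_A(c, dA, dB) - p) ** 2 for dA, dB, p in data)
    return min(grid, key=sse)

# Two hypothetical stimuli with known exemplar distances and observed choices
data = [([0.5, 1.0], [2.0, 2.5], 0.9),
        ([2.0, 2.5], [0.5, 1.0], 0.1)]
grid = [0.1 * i for i in range(1, 50)]
best_c = fit_c(data, grid)
print(best_c)
```

In practice the fit would cover all free parameters jointly (biases, attention weights, r, p, c) and use a proper optimizer or maximum likelihood, but the logic is the same: search parameter space for the values that best reproduce the response data.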
Some questions…
• Both prototype and exemplar models seem reasonable… are they “rational”?
 – are they solutions to the computational problem?
• Should we use prototypes, or exemplars?
• How can we define other models that handle more complex categorization problems?
• Is this all that categories are?
Next week
• Tuesday: Linear algebra
 – a way of computing with spaces
• Thursday: Semantic networks
 – using some linear algebra!
 – Google and the mind…