Visual Expertise Is a General Skill

Maki Sugimoto

University of California, San Diego

November 20, 2000

Overview

Is the Fusiform Face Area (FFA) really a face specific area?

Our results support the view that it is NOT

• Motivation: Evidence for/against the face specific view

• Our Approach: Our model and experimental design

• Results

• Conclusion

Motivation: Evidence for the Face Specific View

• Patients with prosopagnosia may be impaired at identifying individual faces yet normal at detecting faces and at recognizing non-face objects, while patients with visual object agnosia may recognize faces normally yet be impaired at reading or object recognition.

• Recognition of faces is more sensitive to configural changes than recognition of objects.

Face and non-face objects have separate processing mechanisms

Motivation: Evidence Against the Face Specific View

• Gauthier et al. point out that faces and objects differ not only in image geometry, but also in …

1. Level of discrimination

2. Level of experience

We are face “experts”.

• The FFA showed high activation for a wide variety of non-face objects when these two factors were controlled.

Greeble Experts (Gauthier et al. 1999)

• FFA activation in response to Greebles increased as training proceeded.

• Once subjects met the criterion for expertise, the difference in activation level between faces and Greebles was not significant.

Our Hypothesis

Why does the FFA engage in expert classification of non-face objects as well?

• We hypothesized that the FFA responds to visual features that are generally useful in discriminating homogeneous input images.

• Expertise on one class should facilitate the learning of other expert tasks.

Model

• Pretrain two groups of neural networks on different tasks.

• Compare their ability to learn a new task: individual-level Greeble classification.

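[Figure: two networks with the same architecture, each with one hidden layer. Experts: pretraining outputs can, cup, book, Bob, Carol, Ted. Non-experts: pretraining outputs can, cup, book, face. Both networks then learn the added outputs Greeble1, Greeble2, Greeble3.]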

Database

• 64x64 8-bit grayscale

• 5 basic categories

• 12 individuals per category

• 5 different images per individual

• Total of 5x12x5=300 images

Preprocessing

• Form Gabor jets using 8 orientations and 5 scales, then subsample on an 8x8 grid.

• For each scale, apply PCA separately and reduce dimensionality to 8.

• Dimensionality: 8x5x64=2560 is reduced to 5x8=40
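The dimensionality bookkeeping above can be sketched in a few lines. This is only an illustration under stated assumptions: random numbers stand in for the Gabor filter responses, the array layout is chosen for convenience, and `pca_reduce` is a hypothetical helper rather than the code used in the study.

```python
import numpy as np

N_ORIENT, N_SCALES, GRID = 8, 5, 8 * 8   # values from the slide
N_COMPONENTS = 8                          # PCA dimensions kept per scale


def pca_reduce(x, k):
    """Project the rows of x onto the top-k principal components (via SVD)."""
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


rng = np.random.default_rng(0)
n_images = 300
# Stand-in for Gabor responses: one (orientations * grid points) vector per scale.
jets = rng.normal(size=(n_images, N_SCALES, N_ORIENT * GRID))

raw_dim = N_ORIENT * N_SCALES * GRID      # 8 * 5 * 64 = 2560
# PCA is applied to each scale separately, then the pieces are concatenated.
reduced = np.concatenate(
    [pca_reduce(jets[:, s, :], N_COMPONENTS) for s in range(N_SCALES)], axis=1
)
print(raw_dim, reduced.shape)
```

Each image starts as a 2560-dimensional vector and ends as 5 blocks of 8 principal components, which gives the 40-dimensional network input.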

Experimental Setting Details

Fixed configurations:

• 1 hidden layer with 40 units

• Learning rate = .005

• Momentum = .5

Controlled condition:

• 30 training patterns for each basic class
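A network with these fixed settings might look like the sketch below. Only the hidden layer size, learning rate, and momentum come from the slide. The logistic units, squared-error loss, batch updates, weight initialization range, missing bias terms, and the class name `OneHiddenLayerNet` are all assumptions made for illustration, and the data is random.

```python
import numpy as np

N_HIDDEN, LEARNING_RATE, MOMENTUM = 40, 0.005, 0.5   # values from the slide


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class OneHiddenLayerNet:
    def __init__(self, n_in, n_out, rng):
        # Small random initial weights (the range is an assumption).
        self.w1 = rng.uniform(-0.1, 0.1, size=(n_in, N_HIDDEN))
        self.w2 = rng.uniform(-0.1, 0.1, size=(N_HIDDEN, n_out))
        self.v1 = np.zeros_like(self.w1)   # momentum buffers
        self.v2 = np.zeros_like(self.w2)

    def forward(self, x):
        hidden = sigmoid(x @ self.w1)
        return hidden, sigmoid(hidden @ self.w2)

    def rmse(self, x, targets):
        _, out = self.forward(x)
        return float(np.sqrt(np.mean((out - targets) ** 2)))

    def train_epoch(self, x, targets):
        hidden, out = self.forward(x)
        # Backpropagate the squared-error gradient through both sigmoid layers.
        delta_out = (out - targets) * out * (1.0 - out)
        delta_hid = (delta_out @ self.w2.T) * hidden * (1.0 - hidden)
        # Momentum update: new step = momentum * old step - lr * gradient.
        self.v2 = MOMENTUM * self.v2 - LEARNING_RATE * (hidden.T @ delta_out)
        self.v1 = MOMENTUM * self.v1 - LEARNING_RATE * (x.T @ delta_hid)
        self.w2 += self.v2
        self.w1 += self.v1


# Demo on random stand-in data: 30 patterns, 40 inputs, 4 one-hot classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(30, 40))
targets = np.eye(4)[rng.integers(0, 4, size=30)]
net = OneHiddenLayerNet(40, 4, rng)
rmse_before = net.rmse(x, targets)
for _ in range(50):
    net.train_epoch(x, targets)
rmse_after = net.rmse(x, targets)
print(rmse_before, rmse_after)
```

Adding a new task in this sketch would mean appending output columns to `w2` and `v2` while keeping the pretrained hidden layer weights.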

[Figure: Pretraining tasks. Non-experts: outputs can, cup, book, face. Experts: outputs can, cup, book, face1, face2, face3, ..., face10.]

Training Set Variations

Training set:

• 10 individuals for each class, 3 images for each individual

Hold out / test set:

• Basic level: 3 images of unseen individual

• Individual level: 1 unseen image of each individual

[Figure: partition of each class into the training set (10 individuals), holdout and test sets (individual level), and holdout and test sets (basic level).]
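One way to realize this partition for a single category is sketched below. The counts follow the slides (12 individuals, 5 images each). Which individuals and images land in which set, and the assumption that holdout and test each get their own unseen individual and unseen image, are illustrative choices that the slides do not spell out.

```python
N_INDIVIDUALS, N_IMAGES = 12, 5          # per category, from the Database slide


def split_category():
    """Return (individual, image) index pairs for each partition of one category."""
    split = {name: [] for name in
             ("train", "holdout_indiv", "test_indiv", "holdout_basic", "test_basic")}
    for person in range(N_INDIVIDUALS):
        if person < 10:
            # Trained individuals: 3 images to train on, 1 each for holdout/test.
            split["train"] += [(person, img) for img in range(3)]
            split["holdout_indiv"].append((person, 3))
            split["test_indiv"].append((person, 4))
        else:
            # Unseen individuals: 3 images each for basic-level holdout/test.
            name = "holdout_basic" if person == 10 else "test_basic"
            split[name] += [(person, img) for img in range(3)]
    return split


sizes = {name: len(pairs) for name, pairs in split_category().items()}
print(sizes)
```

Under these assumptions the training set has 10x3=30 patterns per class, which matches the controlled condition of 30 training patterns for each basic class.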

Topics for Analysis

• Will the experts learn the new task faster?

• Is there a correlation between network plasticity and the speed to learn the new task?

Network plasticity can be defined as the average “slope” of the hidden layer units.

Criteria to Stop Training

1. Fixed RMSE threshold

2. Number of training epochs

3. Best holdout set performance

• We eliminated the third criterion because preliminary studies showed extremely high variance in the number of training epochs.
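The two remaining criteria can be expressed as a single check run after every epoch. The function `should_stop` and its parameter names are hypothetical, written only to make the criteria concrete.

```python
def should_stop(train_rmse, epoch, rmse_threshold=None, max_epochs=None):
    """Return True when either active stopping criterion is met."""
    # Criterion 1: training set RMSE has reached the fixed threshold.
    if rmse_threshold is not None and train_rmse <= rmse_threshold:
        return True
    # Criterion 2: a fixed number of training epochs has elapsed.
    if max_epochs is not None and epoch >= max_epochs:
        return True
    return False


print(should_stop(0.07, 120, rmse_threshold=0.08))
print(should_stop(0.20, 120, rmse_threshold=0.08))
```

Leaving a criterion as `None` disables it, so the same check covers a run stopped by RMSE threshold and a run stopped by epoch count.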

Experiment 1 (Design)

• For each pretraining task, 20 networks were trained, i.e. 20 book experts, 20 face experts, etc.

• The training set RMSE threshold was fixed to:

– .08 (pretraining)

– .158 (after new task added)

• The thresholds were derived from preliminary cross-validation experiments on the most difficult task, i.e. face expert classification.

Experiment 1 (Results)

• Pretraining tasks were much harder for the experts.

• Non-experts were significantly slower in learning the new task than any of the experts.

[Figure: Number of Epochs to Achieve RMSE Threshold]

Experiment 2 (Design)

• For each pretraining task, 10 networks were trained for 5120 epochs.

• Intermediate weights were recorded at epochs 5, 10, 20, ..., 2560.

• A total of 11x10=110 networks were trained on the new task with RMSE threshold .158.
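The count of 110 can be checked with a short calculation. It assumes the elided checkpoints keep doubling, and that the fully trained weights at epoch 5120 serve as the eleventh starting point. Both are readings of the slide, not stated facts.

```python
N_NETWORKS = 10        # networks per pretraining task, from the slide
FINAL_EPOCH = 5120     # total pretraining epochs, from the slide

# Assumed schedule: start at epoch 5 and double until the final epoch.
checkpoints = []
epoch = 5
while epoch <= FINAL_EPOCH:
    checkpoints.append(epoch)
    epoch *= 2

n_new_task_runs = len(checkpoints) * N_NETWORKS
print(checkpoints)
print(n_new_task_runs)
```

Under this reading there are 10 intermediate snapshots plus the final one, giving 11 starting points per network.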


• Non-experts were the slowest to learn the new task, provided that the pretraining tasks were fully acquired.

[Figure: Pretraining Epochs and the Speed to Learn New Task]


• If pretraining was stopped prematurely, the network had to keep improving on the pretrained classes while also learning the new task.

[Figure: Pretraining RMSE]

Analysis of Network Plasticity

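• Network plasticity can be defined as the average “slope” across all hidden layer units and all patterns in a given set of patterns:

\[ P(S) = \frac{1}{N\,|S|} \sum_{s \in S} \sum_{i=1}^{N} g'(x_i^s) \]

where S is the set of patterns, N is the number of hidden layer units, x_i^s is the net input to hidden unit i for pattern s, and g' is the slope (derivative) of the hidden unit activation function.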

Network Plasticity

• Plasticity was lower for the experts!

• It is more appropriate to interpret plasticity as a measurement of mismatch.
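The plasticity measure, the average slope of the hidden unit activation function over all hidden units and all patterns, could be computed as in the sketch below. It assumes logistic hidden units with no bias term, and the function name `plasticity` and the random test data are illustrative only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def plasticity(patterns, w_hidden):
    """Mean activation-function slope over all hidden units and all patterns."""
    act = sigmoid(patterns @ w_hidden)      # shape: (n_patterns, n_hidden)
    slope = act * (1.0 - act)               # derivative of the logistic sigmoid
    return float(slope.mean())


rng = np.random.default_rng(0)
patterns = rng.normal(size=(30, 40))
small_w = np.zeros((40, 40))                # every unit sits at its steepest point
large_w = 5.0 * rng.normal(size=(40, 40))   # most units are pushed into saturation
print(plasticity(patterns, small_w), plasticity(patterns, large_w))
```

For logistic units the slope peaks at 0.25 when the net input is zero and falls toward zero as units saturate, so a network whose hidden units respond strongly to a pattern set scores low on this measure.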

Conclusion

• Expert networks learned the new task faster.

• Expert networks learned faster despite their low plasticity, further supporting the claim that the hidden layer representations developed for one expert class are general features that are useful for classifying other classes as well.

Visual expertise is a general skill that is not specific to any class of images including faces.
