
Standard Brain Model for Vision

The talk is given by Tomer Livne and Maria Zeldin

Overview

Introduction to the biological basis of vision

Computer analogy to biology

Implementation

Discussion

Overview of biological vision

Hierarchical structure

From simple features to complex ones (Hubel & Wiesel)

Increased invariance

The basic idea

Following their experimental results, Hubel and Wiesel (1962, 1965) proposed a model in which neighbouring simple cells are combined into a complex cell.

The result is complex cells with phase independence.
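The phase-independence idea can be illustrated with a toy sketch. It is not the original model: each simple cell is assumed here to be a half-wave rectified cosine of the difference between the stimulus phase and the cell's preferred phase, and the complex cell simply takes the maximum over four such cells. The function names and the four-phase layout are hypothetical.

```python
import math

# Assumed layout: four simple cells differing only in preferred phase (degrees).
PREFERRED_PHASES = [0.0, 90.0, 180.0, 270.0]

def simple_cell(stimulus_phase, preferred_phase):
    """Toy phase tuning: half-wave rectified cosine of the phase difference."""
    diff = math.radians(stimulus_phase - preferred_phase)
    return max(0.0, math.cos(diff))

def complex_cell(stimulus_phase):
    """Pool simple cells that differ only in their preferred phase."""
    return max(simple_cell(stimulus_phase, p) for p in PREFERRED_PHASES)

for phase in (0, 45, 90, 200):
    print(f"phase={phase:3d}  "
          f"simple={simple_cell(phase, 0.0):.2f}  "
          f"complex={complex_cell(phase):.2f}")
```

In this toy setup a single simple cell falls silent for opposite phases, while the pooled response never drops below cos(45°), roughly 0.71.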

Max vs. sum pooling

Electrophysiological results indicate that pooling may not be linear: the response of a complex cell is best described by the activity of its maximal afferent.
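To make the difference concrete, here is a small sketch comparing the two pooling rules. The afferent activities are made-up numbers chosen only for illustration.

```python
def sum_pool(afferents):
    """Linear pooling: every afferent contributes to the output."""
    return sum(afferents)

def max_pool(afferents):
    """Nonlinear pooling: the output follows the strongest afferent."""
    return max(afferents)

# Hypothetical afferent activities: the first afferent sees the preferred
# feature; the others respond to whatever else is in the receptive field.
target_alone = [0.9, 0.1, 0.0, 0.1]
target_in_clutter = [0.9, 0.6, 0.5, 0.7]

for name, afferents in (("alone", target_alone), ("clutter", target_in_clutter)):
    print(f"{name:8s} sum={sum_pool(afferents):.1f}  max={max_pool(afferents):.1f}")
```

With max pooling the output is unchanged by clutter as long as the clutter-driven afferents stay weaker than the preferred one; with sum pooling the output grows with every added input.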


From simple to complex cells:

A straightforward extension of this is to start with simple cells and end up with “higher-order-hyper-complex cells”.

This is the basis of the whole hierarchy idea!

The hierarchy based on the brain model:

Hierarchical models of object recognition in cortex. Riesenhuber and Poggio. Nature Neuroscience, November 1999.

Clearer explanation of the hierarchy

[Figure: simple cells and complex cells at four orientations (-, |, \, /), with example response values between 0 and 1, linked by max pooling.]

Computer vision

Usual approach – image patching

Biologically motivated approach – hierarchy

Representing objects by invariant complex features

Hierarchical models of object recognition in cortex. Riesenhuber and Poggio. Nature Neuroscience, November 1999.

The inferotemporal (IT) area of the brain deals with object recognition. It contains cells that respond best to a specific object.


Recognize the same faces

In the previous task our brains did a very good job of recognizing the same face even though the scale, expression, and illumination differed.

They also did not classify different faces as the same, even though those faces appeared under similar physical conditions.

Motivation

The approach presented here tries to implement the hierarchical idea in a computer system in order to achieve similar robustness.

The models we present deal with a more general problem: object classification.

Recognizing different transformations of an object can be seen as similar to the problem of classification.

Can computers reach properties similar to biology?

Riesenhuber & Poggio (1999) demonstrate that they can, by comparing electrophysiological results from cells in the monkey brain with an implemented hierarchical model.

Training stage:

The monkey was trained to recognize a restricted set of views of unfamiliar target stimuli resembling paperclips. The experimenters then checked which IT cell responded best across the views, and that cell was picked for the study.

Test stage:

The cell responded best to the trained views, second best to new transformations of the trained object, and very little to new objects (distractors).

Learning the results:

Hierarchical models of object recognition in cortex. Riesenhuber and Poggio. Nature Neuroscience, November 1999.

The hierarchy based on the brain model:

Hierarchical models of object recognition in cortex. Riesenhuber and Poggio. Nature Neuroscience, November 1999.

We saw this part

Now let's compare it to the model

Hierarchical models of object recognition in cortex. Riesenhuber and Poggio. Nature Neuroscience, November 1999.

Hierarchical models of object recognition in cortex. Riesenhuber and Poggio. Nature Neuroscience, November 1999.

Results of scrambling

Summary

Goal: brain-based object classification

Biological view of the problem

Implementation of a hierarchical structure

Comparing true results to model results


Models based on the hierarchical idea we already discussed

Riesenhuber & Poggio (1999)

Serre & Riesenhuber (2004)

Serre, Wolf, Bileschi, Riesenhuber, & Poggio (2007)

Mutch & Lowe (2006)

Modifications of the basic ideas

Limitations and shortcomings

What’s next?

Method #1

Riesenhuber & Poggio, “Hierarchical models of object recognition in cortex”, Nature Neuroscience, 1999

Later it was modified by Serre, Wolf, Bileschi, Riesenhuber, & Poggio, “Robust object recognition with cortex-like mechanisms”, 2007.

S1 – Gabor filters

16 different sizes (7X7, 9X9,…,37X37)

4 orientations

A total of 64 S1 type detectors

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.
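A minimal sketch of how such an S1 filter bank could be built. The 16 sizes and 4 orientations come from the slide; the specific angles (0°, 45°, 90°, 135°), the aspect ratio `gamma`, and the way `wavelength` and `sigma` are tied to the filter size are placeholder assumptions, not the values used in the paper.

```python
import math

def gabor_kernel(size, theta, wavelength, sigma, gamma=0.3):
    """Return a size x size Gabor kernel as a list of rows.

    gamma (aspect ratio), wavelength and sigma are assumed parameters.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's preferred orientation.
            x0 = x * math.cos(theta) + y * math.sin(theta)
            y0 = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x0 ** 2 + (gamma * y0) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * x0 / wavelength))
        kernel.append(row)
    return kernel

SIZES = list(range(7, 38, 2))                                  # 7, 9, ..., 37
ORIENTATIONS = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]  # assumed angles

bank = {}
for size in SIZES:
    for theta in ORIENTATIONS:
        # Placeholder scaling of wavelength and sigma with filter size.
        bank[(size, theta)] = gabor_kernel(size, theta,
                                           wavelength=size / 2.0,
                                           sigma=size / 4.0)

print(len(SIZES), "sizes x", len(ORIENTATIONS), "orientations =", len(bank))
```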


A serial implementation of filtering

C1 – MAX pooling

8 different sizes (8X8, 10X10,…,22X22)

4 orientations

A total of 32 C1 type detectors

Used to define features during the learning stage

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.
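The local max pooling step can be sketched as below. This is a simplified illustration that pools over position only, on one S1 map of one orientation; the step of half the pooling size and the toy input are assumptions made for the example.

```python
def local_max_pool(response_map, pool_size, step):
    """Take the max over pool_size x pool_size windows moved by `step`."""
    height = len(response_map)
    width = len(response_map[0])
    pooled = []
    for top in range(0, height - pool_size + 1, step):
        row = []
        for left in range(0, width - pool_size + 1, step):
            window = [response_map[y][x]
                      for y in range(top, top + pool_size)
                      for x in range(left, left + pool_size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

POOL_SIZES = list(range(8, 23, 2))  # 8, 10, ..., 22 as listed on the slide

# Toy 16x16 S1 map with a single strong response at row 5, column 9.
s1_map = [[0.0] * 16 for _ in range(16)]
s1_map[5][9] = 1.0

c1_map = local_max_pool(s1_map, pool_size=8, step=4)
for row in c1_map:
    print(row)
```

Every pooling window that contains the strong S1 response reports it, so small shifts of the input leave the C1 output unchanged.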

S2 – learned features

Holds N learned features

4 patch sizes (4X4, 8X8, 12X12, 16X16) indicating how many neighboring C1 cells are considered (this is done separately for each C1 scale)

For each image patch X, a Gaussian radial basis function of the Euclidean distance to each stored feature P_i (i = 1..N) is calculated: r = exp(-β ||X – P_i||²)
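The S2 response formula can be written out directly. In this sketch patches are flattened into plain lists, and the value of β and the example prototypes are arbitrary assumptions.

```python
import math

def s2_response(patch, prototype, beta=1.0):
    """r = exp(-beta * ||X - P||^2) for two flattened patches of equal length."""
    if len(patch) != len(prototype):
        raise ValueError("patch and prototype must have the same length")
    squared_distance = sum((x - p) ** 2 for x, p in zip(patch, prototype))
    return math.exp(-beta * squared_distance)

# Hypothetical stored features P_1, P_2 and one image patch X.
prototypes = [[0.2, 0.8, 0.1, 0.0],
              [0.9, 0.1, 0.4, 0.3]]
patch = [0.2, 0.8, 0.1, 0.0]

responses = [s2_response(patch, p) for p in prototypes]
print(responses)
```

The response is 1 for a perfect match and decays toward 0 as the patch moves away from the stored feature; a larger β makes the tuning sharper.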

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.

C2 – max pooling

For each stored feature, the best match (closest)

Classifier

Classification is based on both C1 and C2

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.
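A minimal sketch of the C2 stage, assuming the S2 responses of each stored feature have already been collected over all tested positions and scales. The data layout and the numbers are hypothetical.

```python
def c2_vector(s2_responses_per_feature):
    """For each stored feature, keep the best S2 response found anywhere."""
    return [max(responses) for responses in s2_responses_per_feature]

# Hypothetical S2 responses: one inner list per stored feature,
# one entry per tested position/scale.
s2_responses = [
    [0.10, 0.75, 0.30],   # feature 1
    [0.05, 0.02, 0.01],   # feature 2
    [0.90, 0.40, 0.95],   # feature 3
]

print(c2_vector(s2_responses))
```

Because the S2 response decreases with distance, the maximal response corresponds to the closest match, and the output has one entry per stored feature regardless of image size.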

Summary

4 layers of processing

2 types of operations (Max, Sum)

Output – N dimensional vector

Model’s performance

Testing the model

Defining features

Flexibility of the design

Robustness to background

Ignoring unrelated data that is presented

Training and test images contain both targets and distractors

Performed best with C2 type detectors

Simple detection – present/absent (no location information)

Approaches maximal performance with 1000-5000 features

Performance improves with increased training (more examples)

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.

Object specific features or a universal dictionary

A universal dictionary based system is good for small training sets (10,000 features)

An object specific system is better when using large training sets (improves with practice – increased number of features [200 per image])

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.

Object recognition without clutter

Scene understanding using a windowing strategy

Large inter-category variability

Training images are either positive (target) or negative (no target)

2 classification systems: C1 and C2 based

C1 based system performs better (able to efficiently represent objects’ boundaries)

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.

Texture based objects

Again C1 and C2 based classifiers

C2 features are now evaluated only locally, not over all image locations

C2 based classification is better (the features are more invariant and complex)

Evaluated by correct labeling of pixels in the image

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.

A unified system – looking at multiple processing levels

The hierarchical nature of the described system enables the use of multiple levels of features

Recognizing both shape and texture based objects in the same image

Two processing pathways

Robust object recognition with cortex-like mechanisms. Serre, Wolf, Bileschi, Riesenhuber and Poggio. IEEE, March 2007.

Scene understanding task

Complex scene understanding requires more than just detecting objects; location information for the detected objects is also required

Shape-based objects:

C1 based classification, using a windowing approach, for both identification and localization

Local neighborhood suppression by the maximal detected result

Texture-based objects:

C2 based classification

Texture boundaries pose a problem (solved by additionally segmenting the image and averaging the responses within each segment)
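The windowing approach with local neighborhood suppression might look roughly like the sketch below. It is not the authors' implementation: the window layout, the square suppression neighborhood, and the detection scores (standing in for classifier outputs) are all assumptions.

```python
def sliding_windows(width, height, window, step):
    """Yield the (left, top) corner of every window that fits in the image."""
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            yield left, top

def suppress_neighbors(detections, radius):
    """Keep the strongest detection in each neighborhood.

    detections: list of (score, x, y). A detection is dropped when a
    stronger one already kept lies within `radius` in both x and y.
    """
    kept = []
    for score, x, y in sorted(detections, reverse=True):
        if all(max(abs(x - kx), abs(y - ky)) > radius for _, kx, ky in kept):
            kept.append((score, x, y))
    return kept

print(list(sliding_windows(width=8, height=8, window=4, step=4)))

# Hypothetical classifier scores at window centers.
detections = [(0.9, 10, 10), (0.8, 12, 11), (0.7, 40, 40), (0.3, 41, 42)]
print(suppress_neighbors(detections, radius=5))
```

Each cluster of overlapping detections collapses to its maximal response, which gives one location per detected object.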

Model summary

Hierarchical design

Efficiency

Multiple processing pathways

Universality vs. specificity

Limitations

Method #2

Mutch & Lowe, “Multiclass Object Recognition with Sparse, Localized Features”, 2006.


Image scaling – 10 scales

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

S1 – Gabor filters

Single scale (11X11)

4 orientations applied to every location

Evaluated at all possible locations
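Since a single filter size is used here, scale is handled by resizing the image instead. The sketch below only computes the image dimensions of a 10-level pyramid; the ratio of 2^(1/4) between neighboring levels is an assumption that the slides do not state, and the function name is hypothetical.

```python
def pyramid_sizes(width, height, levels=10, ratio=2 ** 0.25):
    """Return the (width, height) of each pyramid level, largest first.

    `ratio` is the assumed shrink factor between neighboring levels.
    """
    sizes = []
    for level in range(levels):
        factor = ratio ** level
        sizes.append((round(width / factor), round(height / factor)))
    return sizes

for level, size in enumerate(pyramid_sizes(160, 120)):
    print(level, size)
```

The same 11X11 filter then covers a larger part of the scene at each coarser level, which plays the role of the multiple filter sizes in Method #1.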

C1 – local invariance

Max pooling using a 10X10 (size) X 2 (scale) filter

Each orientation is tested separately

Used to define features during the learning stage

Larger skips (the pooling window moves in steps larger than one position)

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

S2 – intermediate features

4 filter sizes (4X4, 8X8, 12X12, 16X16) defined by the stored features

A universal feature set

Response to each filter (feature) is calculated as

R(X, P) = exp[-||X – P||² / (2σ²α)]

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006
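A short sketch of the response R(X, P) = exp[-||X – P||² / (2σ²α)]. The slides do not define α; it is assumed here to be a normalization for patch size, α = (n/4)² for an n X n patch, with σ = 1. Both choices are assumptions of this example.

```python
import math

def rbf_response(squared_distance, patch_side, sigma=1.0):
    """R = exp(-d^2 / (2 * sigma^2 * alpha)), with assumed alpha = (n / 4)^2."""
    alpha = (patch_side / 4.0) ** 2
    return math.exp(-squared_distance / (2.0 * sigma ** 2 * alpha))

# Same squared distance, different patch sizes.
for side in (4, 8, 12, 16):
    print(side, round(rbf_response(1.0, side), 4))
```

Under this assumption larger patches, which accumulate distance over more inputs, are not penalized relative to small ones.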

C2 – global invariance

A vector of size d holding the maximal response (anywhere in the image) to each feature

SVM classifier

Majority-voting based decision
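One way to read "majority-voting based decision" is that a binary classifier is trained for every pair of categories and the category collecting the most pairwise wins is chosen. The sketch below assumes that reading. The pairwise rule is a toy stand-in for trained SVMs, and all names and numbers are hypothetical.

```python
from collections import Counter
from itertools import combinations

def majority_vote(c2_vector, classes, pairwise_decide):
    """Return the class that wins the most pairwise decisions.

    Ties are broken by the order of `classes`.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, c2_vector)] += 1
    return max(classes, key=lambda c: votes[c])

# Toy stand-in for trained pairwise SVMs: compare dot products with
# a made-up weight vector per class.
PROTOTYPES = {"face": [1, 0, 0], "car": [0, 1, 0], "plane": [0, 0, 1]}

def toy_pairwise_decide(a, b, c2_vector):
    score_a = sum(w * v for w, v in zip(PROTOTYPES[a], c2_vector))
    score_b = sum(w * v for w, v in zip(PROTOTYPES[b], c2_vector))
    return a if score_a >= score_b else b

print(majority_vote([0.2, 0.9, 0.1], list(PROTOTYPES), toy_pairwise_decide))
```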

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

An overall look at all the stages:

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

Summary

Similar assumptions

Differences in construction

Model performance and improvements

Testing classification

More biologically motivated improvements

Tests – classification

101 categories (from Caltech101)

Training sets of 15 (or 30) images of each category

Learn random features (in both size and location), an equal number for each category

Construct C2 vectors

Train the SVM (on the improved model also perform feature selection)

Test stage


Results of the test:

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

To get better results, some improvements were added to the model:

S2 – encodes only the dominant orientation at each location.

Increased number of tested orientations (from 4 to 12)

Lateral inhibition – suppressing below-threshold filter outputs in the S1 & C1 layers

Limited S2 invariance – to preserve a certain amount of geometrical relations, S2 features are limited to certain places in the image (relative to the center of the object)

Select only good features for classification
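Two of these improvements, lateral inhibition and keeping only the dominant orientation, can be sketched for a single image location. The slides say only that below-threshold outputs are suppressed; the particular threshold rule used here (a fraction `h` of the way between the weakest and strongest orientation response) and the response values are assumptions.

```python
def inhibit(responses, h=0.5):
    """Zero every orientation response below min + h * (max - min)."""
    weakest, strongest = min(responses), max(responses)
    threshold = weakest + h * (strongest - weakest)
    return [r if r >= threshold else 0.0 for r in responses]

def dominant_orientation(responses):
    """Return (index, value) of the strongest orientation at this location."""
    best = max(range(len(responses)), key=lambda i: responses[i])
    return best, responses[best]

# Hypothetical responses of 12 orientation filters at one location.
responses = [0.1, 0.2, 0.9, 0.5, 0.3, 0.1, 0.0, 0.05, 0.2, 0.6, 0.4, 0.1]

print(inhibit(responses))
print(dominant_orientation(responses))
```

Both steps make the representation sparser: inhibition removes weak, non-dominant outputs, and the dominant-orientation encoding reduces each location to a single orientation and its strength.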

Running the previous test on the improved model led to the following results:

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006


Refining the model

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

Tests – detection/localization

Sliding window

Merging overlapping detections

Single/multiple scale test images

Multiclass Object Recognition with Sparse, Localized Features. By Mutch & Lowe. IEEE 2006

Summary

Efficiency

Improvements

Limitations


THE END

Thank you for listening!

A simple cell is an early visual neuron: it responds best to a line of a specific size, orientation, and phase.

This cell responds best to 90 deg. phase.

This cell responds best to 180 deg. phase.


Image

Simple cell (phase sensitive)

Complex cell (phase insensitive)