
Sparse, Brain-Inspired Representations for Visual Recognition


Page 1: Sparse, Brain-Inspired Representations  for Visual Recognition

Yair Weiss, CS HUJI
Daphna Weinshall, CS HUJI
Amnon Shashua, CS HUJI
Yonina Eldar, EE Technion
Ron Meir, EE Technion

Page 2: Sparse, Brain-Inspired Representations  for Visual Recognition

The human brain can rapidly recognize thousands of objects while using less power than modern computers use in “quiet mode”. Although there are many neurons devoted to visual recognition, only a tiny fraction fire at any given moment. Can we use machine learning to learn such representations and build systems with such performance?

We believe a key component enabling this remarkable performance is the use of sparse representations, and seek to develop brain-inspired hierarchical representations for visual recognition.

Page 3: Sparse, Brain-Inspired Representations  for Visual Recognition

1. Low power visual recognition - sparsity before the A2D: algorithms and theories for sparsification in the analog domain (Weiss & Eldar)

2. Extracting Informative Features from Sensory Input: an approach based on slowness (Meir & Eldar)

3. Sparsity at all levels of the hierarchy: algorithms for learning hierarchical sparse representations, from the input to the top levels (Weinshall & Shashua)

Page 4: Sparse, Brain-Inspired Representations  for Visual Recognition

• Research direction: Compressed Sensing for low-power cameras

• We have shown that random projections are poor fits for compressed sensing of natural images. We are working on better linear projections that take advantage of image statistics.

• We want to explore nonlinear compressed sensing.

• Optimizing projections for recognition, not for visual reconstruction.

Weiss & Eldar
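Below is a minimal sketch, not the project's actual method, of the point about projections: it compares random Gaussian measurement matrices with measurement directions adapted to the data statistics (top PCA directions), using a simple linear decoder. The synthetic "patches" and all names are illustrative.

```python
# Sketch (illustrative only): random Gaussian projections vs. projections
# aligned with data statistics (top PCA directions), each used as a linear
# measurement matrix with a pseudo-inverse decoder.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for natural image patches: correlated Gaussian vectors (hypothetical data).
d, n, m = 64, 5000, 16                       # patch dimension, #patches, #measurements
cov = np.array([[0.95 ** abs(i - j) for j in range(d)] for i in range(d)])
X = rng.multivariate_normal(np.zeros(d), cov, size=n)

def reconstruction_error(P, X):
    """Compress with P (m x d), then reconstruct linearly via the pseudo-inverse."""
    Y = X @ P.T                              # measurements, shape (n, m)
    X_hat = Y @ np.linalg.pinv(P).T          # linear decoder
    return np.mean((X - X_hat) ** 2)

P_random = rng.standard_normal((m, d)) / np.sqrt(d)        # random projections
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
P_pca = Vt[:m]                                              # statistics-aware projections

print("random projections :", reconstruction_error(P_random, X))
print("PCA projections    :", reconstruction_error(P_pca, X))
```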

Page 5: Sparse, Brain-Inspired Representations  for Visual Recognition

• Sensory signals effectively represent environmental signals
• Slow Feature Analysis (SFA) extracts features based on slowness
• The approach has been applied successfully to biologically plausible feature learning, blind source separation, and pattern recognition
• SFA does not deal directly with representational accuracy
• We formulate a generalized multi-objective criterion balancing representational accuracy and temporal reliability
• We obtain a feasible objective for optimization
• Preliminary results demonstrate advantages over SFA
• Future work: robustness, online learning, distributed implementation through local learning rules

Ron Meir
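As context for the slowness criterion, here is a minimal sketch of standard linear Slow Feature Analysis, not the generalized multi-objective criterion proposed here; the toy data and function names are illustrative.

```python
# Standard linear SFA sketch: whiten the signal, then keep the directions whose
# temporal derivative has the smallest variance (the "slowest" features).
import numpy as np

def linear_sfa(X, n_features):
    """X: (T, d) time series. Returns W (d, n_features) so that X @ W varies slowly
    while having unit variance."""
    X = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    S = evecs / np.sqrt(evals)               # whitening matrix (d, d)
    Z = X @ S
    dZ = np.diff(Z, axis=0)                  # temporal derivative (finite differences)
    dcov = np.cov(dZ, rowvar=False)
    _, d_evecs = np.linalg.eigh(dcov)        # ascending eigenvalues
    return S @ d_evecs[:, :n_features]       # smallest derivative variance = slowest

# Toy usage: recover a slow sinusoid mixed with faster components.
t = np.linspace(0, 20 * np.pi, 4000)
sources = np.stack([np.sin(0.1 * t), np.sin(3.1 * t), np.sin(7.3 * t)], axis=1)
X = sources @ np.random.default_rng(1).standard_normal((3, 3))
W = linear_sfa(X, n_features=1)
slow = X @ W                                  # approximately the slow sinusoid
```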

Page 6: Sparse, Brain-Inspired Representations  for Visual Recognition

Instance-based recognition typically relies on similarity metrics

Class-based recognition typically relies on statistical learning

Our goal: develop a unified statistical learning framework for both recognition tasks

Cohen & Shashua

[Figure: object instances vs. object classes]

Page 7: Sparse, Brain-Inspired Representations  for Visual Recognition

Cognitive psychology: the Basic-Level Category (Rosch 1976) is an intermediate category level that is learned faster and earlier than other levels in the category hierarchy.

Neurophysiology: agglomerative clustering of responses from a population of neurons within the IT of macaque monkeys resembles an intuitive hierarchy (Kiani et al. 2007).

Page 8: Sparse, Brain-Inspired Representations  for Visual Recognition
Page 9: Sparse, Brain-Inspired Representations  for Visual Recognition

Goal: jointly learn classifiers for a few tasks

Implicit goal: information sharing
◦ Achieve a more economical overall representation
◦ A way to enhance impoverished training data
◦ Knowledge transfer (learning to learn)

Our method: share information hierarchically in a cascade, whose levels are automatically discovered

Publication: Regularization Cascade for Joint Learning, Alon Zweig and Daphna Weinshall, ICML, June 2013

Page 10: Sparse, Brain-Inspired Representations  for Visual Recognition

How do we compute the classifiers?

Build classifiers for all tasks, each a linear combination of classifiers computed in a cascade:
◦ Higher levels - high incentive for information sharing: more tasks participate, classifiers are less precise
◦ Lower levels - low incentive for sharing: fewer tasks participate, classifiers become more precise

How do we control the incentive to share? By varying the regularization of the loss function.

Page 11: Sparse, Brain-Inspired Representations  for Visual Recognition

Regularization:
◦ Restrict the number of features the classifiers can use by imposing sparse regularization: ||•||1
◦ Add another sparse regularization term which does not penalize joint features: ||•||1,2

λ||•||1,2 + (1-λ)||•||1

Incentive to share:
◦ λ=1: highest incentive to share
◦ λ=0: no incentive to share

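A small sketch of evaluating the mixed penalty above for a multi-task weight matrix; the layout (rows index features, columns index tasks) is an assumption.

```python
# Mixed sparsity penalty: lambda * ||W||_{1,2} + (1 - lambda) * ||W||_1,
# where ||W||_{1,2} sums the L2 norms of feature rows (shared features)
# and ||W||_1 is the plain entrywise L1 norm.
import numpy as np

def mixed_penalty(W, lam):
    l12 = np.sum(np.linalg.norm(W, axis=1))   # sum of row L2 norms
    l1 = np.sum(np.abs(W))                    # entrywise L1
    return lam * l12 + (1.0 - lam) * l1

W = np.random.default_rng(0).standard_normal((50, 4))   # 50 features, 4 tasks
print(mixed_penalty(W, lam=1.0))   # pure group term: highest incentive to share
print(mixed_penalty(W, lam=0.0))   # pure L1 term: no incentive to share
```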

Page 12: Sparse, Brain-Inspired Representations  for Visual Recognition

[Figure: example objects (Eagle Owl, Asian Elephant, African Elephant) described by shared features: head, legs, wings, long beak, short beak, trunk, short ears, long ears]

Page 13: Sparse, Brain-Inspired Representations  for Visual Recognition

[Figure: an object's classifier decomposes as a sum of components over the shared features (head, legs, wings, long beak, short beak, trunk, short ears, long ears)]

Page 14: Sparse, Brain-Inspired Representations  for Visual Recognition


Loss + ||•||1,2

Loss + λ||•||1,2 + (1-λ)||•||1

Loss + ||•||1

Page 15: Sparse, Brain-Inspired Representations  for Visual Recognition


• We train linear classifiers in multi-task and multi-class settings, as defined by the respective loss functions

• Iterative algorithm over the basic step:

Θ = {W, b}; Θ' stands for the parameters learned up to the current step; λ governs the level of sharing, from maximal sharing (λ = 0) to none (λ = 1)

• At each step λ is increased; the aggregated parameters, together with the decreased level of sharing, guide the learning to focus on more task/class-specific information than in the previous step
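A rough sketch of the cascade under simplifying assumptions (squared loss, plain subgradient descent, illustrative hyperparameters). It plugs in the penalty written on the earlier slide, λ||•||1,2 + (1-λ)||•||1, and under that formula runs the schedule from the pure group term (maximal sharing) down to the pure L1 term (task-specific). This is not the authors' exact algorithm.

```python
# Cascade sketch: each level fits a correction on top of the aggregated
# parameters, with the incentive to share controlled by lam.
import numpy as np

def penalty_subgrad(W, lam):
    """Subgradient of lam*||W||_{1,2} + (1-lam)*||W||_1 (rows = features)."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    g12 = np.where(row_norms > 0, W / np.maximum(row_norms, 1e-12), 0.0)
    return lam * g12 + (1.0 - lam) * np.sign(W)

def cascade_fit(X, Y, lams=(1.0, 0.75, 0.5, 0.25, 0.0), reg=0.1,
                lr=1e-3, iters=500):
    """X: (n, d) features, Y: (n, T) targets, one column per task.
    lams runs from maximal sharing (pure group term) to task-specific (pure L1)."""
    d, T = X.shape[1], Y.shape[1]
    W_agg = np.zeros((d, T))
    for lam in lams:
        W = np.zeros((d, T))                  # this level's correction
        for _ in range(iters):
            R = X @ (W_agg + W) - Y           # residual of the combined model
            grad = X.T @ R / len(X) + reg * penalty_subgrad(W, lam)
            W -= lr * grad
        W_agg += W                            # final classifier = sum over levels
    return W_agg
```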

Page 16: Sparse, Brain-Inspired Representations  for Visual Recognition

• Synthetic and real data (many sets)
• Multi-task and multi-class loss functions
• Low-level features vs. high-level features
• Compare the cascade approach against the same algorithm with:
◦ No regularization
◦ L1 sparse regularization
◦ L1,2 multi-task regularization

[Equations: multi-task loss and multi-class loss]

Page 17: Sparse, Brain-Inspired Representations  for Visual Recognition

[Figure: synthetic-data illustration of feature sharing across tasks T1-T4, comparing no regularization (NoReg) and L1,2 regularization]

Page 18: Sparse, Brain-Inspired Representations  for Visual Recognition

[Figure: synthetic results comparing the hierarchical cascade (H) and L1 regularization]

100 tasks; 20 positive and 20 negative samples per task.

Page 19: Sparse, Brain-Inspired Representations  for Visual Recognition

Step 1 output

100 tasks; 20 positive and 20 negative samples per task.

Page 20: Sparse, Brain-Inspired Representations  for Visual Recognition

Step 2 output

100 tasks; 20 positive and 20 negative samples per task.

Page 21: Sparse, Brain-Inspired Representations  for Visual Recognition

Step 3 output

100 tasks; 20 positive and 20 negative samples per task.

Page 22: Sparse, Brain-Inspired Representations  for Visual Recognition

Step 4 output

100 tasks; 20 positive and 20 negative samples per task.

Page 23: Sparse, Brain-Inspired Representations  for Visual Recognition

Step 5 output

100 tasks; 20 positive and 20 negative samples per task.

Page 24: Sparse, Brain-Inspired Representations  for Visual Recognition

[Plot: average accuracy vs. sample size]

Page 25: Sparse, Brain-Inspired Representations  for Visual Recognition

[Plot: average accuracy vs. sample size]

Page 26: Sparse, Brain-Inspired Representations  for Visual Recognition

Datasets
• Caltech 101
• Cifar-100 (subset of Tiny Images)
• ImageNet
• Caltech 256

Page 27: Sparse, Brain-Inspired Representations  for Visual Recognition

Datasets (continued)
• MIT Indoor Scene (annotated with LabelMe)

Page 28: Sparse, Brain-Inspired Representations  for Visual Recognition

Representation for sparse hierarchical sharing: low-level vs. mid-level

◦ Low-level features: any image features computed from the image via some local or global operator, such as GIST or SIFT.

◦ Mid-level features: features capturing some semantic notion; classifiers over low-level features.

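A minimal sketch of the mid-level notion above, in which mid-level features are the scores of attribute classifiers applied to low-level descriptors; the linear form and all names are illustrative.

```python
# Mid-level features as classifier scores over low-level descriptors.
import numpy as np

def midlevel_features(X_low, attribute_classifiers):
    """X_low: (n, d) low-level descriptors (e.g. GIST/SIFT-derived vectors).
    attribute_classifiers: list of (w, b) linear models, one per semantic notion.
    Returns an (n, n_attributes) mid-level representation of classifier scores."""
    return np.column_stack([X_low @ w + b for (w, b) in attribute_classifiers])

# Toy usage with made-up attribute classifiers.
rng = np.random.default_rng(0)
X_low = rng.standard_normal((10, 128))
classifiers = [(rng.standard_normal(128), 0.0) for _ in range(5)]
X_mid = midlevel_features(X_low, classifiers)      # shape (10, 5)
```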

Page 29: Sparse, Brain-Inspired Representations  for Visual Recognition

[Results: Cifar-100; MIT Indoor Scene with ObjBank features (multi-class); Caltech 256 with Gehler features (multi-task)]

Page 30: Sparse, Brain-Inspired Representations  for Visual Recognition

• Main objective: a faster learning algorithm for dealing with larger datasets (more classes, more samples)

• Iterate over the original algorithm for each new sample, where each level uses the current value of the previous level

• Solve each step of the algorithm using the online version presented in "Online Learning for Group Lasso", Yang et al. 2011 (we proved regret convergence)
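For illustration only, a single online update with a group-sparsity proximal step (plain stochastic proximal gradient with squared loss); this is not the exact dual-averaging scheme of Yang et al. 2011 used in the work, and all names are illustrative.

```python
# One online update: gradient step on the current sample, then group
# soft-thresholding to keep the weight matrix row-sparse (shared features).
import numpy as np

def group_prox(W, tau):
    """Shrink each feature row of W by tau in L2 norm (group soft-thresholding)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return W * scale

def online_step(W, x, y, lr=1e-2, reg=1e-3):
    """W: (d, T) weights, x: (d,) sample features, y: (T,) per-task targets."""
    grad = np.outer(x, x @ W - y)          # gradient of 0.5 * ||x @ W - y||^2
    return group_prox(W - lr * grad, lr * reg)

# Toy usage on a stream of samples.
rng = np.random.default_rng(0)
W = np.zeros((30, 4))
for _ in range(1000):
    W = online_step(W, rng.standard_normal(30), rng.standard_normal(4))
```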

Page 31: Sparse, Brain-Inspired Representations  for Visual Recognition


• Experiments on 1000 classes from ImageNet with 3000 samples per class and 21,000 features per sample (ILSVRC2010)

[Plot: accuracy vs. number of data repetitions]

Top-k accuracy:
            Top1   Top2   Top3   Top4   Top5
H           0.285  0.365  0.403  0.434  0.456
Zhao et al. 0.221  0.302  0.366  0.411  0.435

Page 32: Sparse, Brain-Inspired Representations  for Visual Recognition

A different setting for sharing: share information between pre-trained models and a new learning task (typically small sample settings).

We extend both the batch and online algorithms, but the online extension is more natural

The method gets as input the implicit hierarchy computed during training on the known classes

When given examples from a new task:
◦ The online learning algorithm continues from where it stopped
◦ The matrix of weights is enlarged to include the new task, and the weights of the new task are initialized
◦ Sub-gradients of known classes are not changed
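A minimal sketch of the bookkeeping described above for knowledge transfer: the weight matrix gains a column for task K+1 and subsequent updates touch only that column; the loss, initialization, and names are illustrative assumptions.

```python
# Knowledge-transfer bookkeeping: enlarge the weight matrix for the new task
# and update only that task's column, leaving the known classes untouched.
import numpy as np

def add_new_task(W_known):
    """Append an initialized (here: zero) column for task K+1."""
    d = W_known.shape[0]
    return np.hstack([W_known, np.zeros((d, 1))])

def update_new_task_only(W, x, y_new, lr=1e-2):
    """One gradient step on the new task's column (squared loss is an assumption);
    the columns of the known classes are not changed, mirroring the slide."""
    W = W.copy()
    err = x @ W[:, -1] - y_new
    W[:, -1] -= lr * err * x
    return W
```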

Page 33: Sparse, Brain-Inspired Representations  for Visual Recognition

[Figure: Online KT Method vs. Batch KT Method - the classifier for the new task K+1 is composed from components learned on tasks 1 ... K (MTL), with coefficients α and π]

Page 34: Sparse, Brain-Inspired Representations  for Visual Recognition

[Plots: accuracy vs. sample size, on synthetic data and on ILSVRC2010]

Page 35: Sparse, Brain-Inspired Representations  for Visual Recognition


• We assume a hierarchical structure of shared information which is unknown in advance; the hierarchy is exploited implicitly.

• We describe a cascade based on varying sparse regularization, for multi-task/multi-class and knowledge-transfer algorithms.

• The cascade shows improved performance in all experiments.

• We investigate different visual representation schemes: higher-level features give better value in multi-task learning.

• Different levels of sharing help and can be made efficient.