
An Efficient Projection for l1,∞ Regularization

Ariadna Quattoni, Xavier Carreras, Michael Collins, Trevor Darrell

MIT CSAIL

Goal: Efficient training of jointly sparse models in high-dimensional spaces.

Joint Sparsity

Why? Learn from fewer examples. Build more efficient classifiers. Interpretability.

[Figure: a matrix of coefficients w for four tasks (Church, Airport, Grocery Store, Flower-Shop), with one column per task and one row per feature.]

[Figure: the same matrix with some coefficients zeroed out independently for each task.]

[Figure: the same matrix with entire rows zeroed out, so all tasks share the same small set of active features (joint sparsity).]

l1,∞ Regularization

How do we promote joint (i.e. row) sparsity?

[Diagram: the coefficient matrix, with rows as features and columns as tasks; row 2 holds the coefficients for feature 2, column 2 the coefficients for task 2.]

An l1 norm on the maximum absolute values of the coefficients across tasks promotes row sparsity: use few features.

The l∞ norm on each row does not penalize additional nonzero entries once a row is selected: share parameters across tasks.
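For an n × m matrix W with one row per feature and one column per task, the norm described above can be written as (notation assumed here, consistent with the slides):

\|W\|_{1,\infty} \;=\; \sum_{i=1}^{n} \max_{1 \le j \le m} |W_{ij}|

The outer l1 sum drives whole rows to zero, while the inner max lets the surviving rows be dense across tasks at no extra cost.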

Contributions

An efficient projected gradient method for l1,∞ regularization.

Our projection runs in O(n log n) time, the same cost as the l1 projection.

Experiments on multitask image classification problems:

We can discover jointly sparse solutions.

l1,∞ regularization leads to better performance than l2 and l1 regularization.

Multitask Application

Collection of tasks: D = {D_1, D_2, ..., D_m}

Joint Sparse Approximation

[Diagram: datasets D_1, D_2, ..., D_m feeding into a single joint sparse approximation.]
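Concretely, training couples the m tasks only through the norm constraint. A plausible rendering of the formulation (with W_k the k-th column of W, i.e. the parameters of task k, and L a convex loss; the exact loss is not specified in these slides):

\min_{W} \; \sum_{k=1}^{m} \; \sum_{(x,y) \in D_k} L\big(y,\, W_k^{\top} x\big) \quad \text{s.t.} \quad \|W\|_{1,\infty} \le C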

l1,∞ Regularization: Constrained Convex Optimization Formulation

min_W f(W) (a convex function) subject to ||W||_{1,∞} ≤ C (convex constraints)

We use a Projected SubGradient method. Main advantages: simple, scalable, guaranteed convergence rates.
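As a minimal sketch of such a method (not the authors' code; subgrad_f, project, and the step-size schedule are illustrative choices):

```python
import numpy as np

def projected_subgradient(subgrad_f, W0, C, project, n_iters=1000, lr=0.1):
    """Minimize a convex f over the l1,inf ball of radius C.

    subgrad_f(W) returns any subgradient of f at W;
    project(W, C) returns the Euclidean projection onto the ball
    (e.g. the l1,inf projection sketched later in these slides).
    """
    W = W0.copy()
    for t in range(1, n_iters + 1):
        G = subgrad_f(W)
        W = W - (lr / np.sqrt(t)) * G  # diminishing steps give guaranteed convergence
        W = project(W, C)              # re-enter the feasible set after every step
    return W
```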

Projected subgradient methods have recently been proposed for:

l2 regularization, i.e. SVM [Shalev-Shwartz et al. 2007]

l1 regularization [Duchi et al. 2008]

Computing the subgradients is trivial, so the question is whether the projection can be computed efficiently.


Euclidean Projection onto the l1,∞ Ball
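Written out (following the setup above, with A the matrix to be projected):

\min_{B} \; \frac{1}{2} \sum_{i,j} \big(A_{ij} - B_{ij}\big)^2 \quad \text{s.t.} \quad \sum_{i} \max_{j} |B_{ij}| \le C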


Characterization of the solution

[Figure: bar charts of the entries of Features I–IV (values 0–10), before and after projection: in each row, entries above the row's optimal maximum μ_i are truncated to μ_i, and smaller entries are left unchanged.]

Mapping to a simpler problem

We can map the projection problem to the following reduced problem, which finds the optimal row maxima μ:
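A plausible reconstruction of the reduced problem (assuming A has been replaced by its absolute values, with signs restored afterwards):

\min_{\mu \ge 0} \; \sum_{i} \; \sum_{j \,:\, A_{ij} > \mu_i} \big(A_{ij} - \mu_i\big)^2 \quad \text{s.t.} \quad \sum_{i} \mu_i = C

Given the optimal μ, the projection is recovered entrywise as B_{ij} = min(A_{ij}, μ_i).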

Efficient Algorithm

[Figure, two slides: bar-chart walkthrough on the four example features (values 0–10) showing how the algorithm sweeps the sorted entries to locate the optimal maxima μ.]


Complexity

The total cost of the algorithm is dominated by sorting the entries of A.

The total cost is on the order of O(n log n), where n is the number of entries of A (the same cost as an l1 projection).
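For illustration only (this is not the paper's exact sorting-based procedure), the reduced problem can also be solved numerically: at the optimum, every row with μ_i > 0 gives up the same l1 mass θ, so one can bisect on θ. All names below (project_l1_inf, mu_for_theta) are invented for this sketch:

```python
import numpy as np

def mu_for_theta(row, csum, theta):
    # Given one row sorted in descending order (with cumulative sums csum),
    # return mu >= 0 such that sum_j max(row[j] - mu, 0) == theta.
    if theta <= 0:
        return row[0]                    # remove nothing: mu is the row max
    if csum[-1] <= theta:
        return 0.0                       # even mu = 0 removes at most theta
    for k in range(1, len(row) + 1):     # k = number of entries above mu
        mu = (csum[k - 1] - theta) / k
        lower = row[k] if k < len(row) else 0.0
        if mu >= lower:                  # mu falls in the k-th segment
            return mu
    return 0.0                           # unreachable

def project_l1_inf(A, C, tol=1e-9):
    # Euclidean projection of A onto {B : sum_i max_j |B_ij| <= C}.
    S = np.abs(A)
    if S.max(axis=1).sum() <= C:
        return A.copy()                  # already inside the ball
    rows = [np.sort(r)[::-1] for r in S]
    csums = [np.cumsum(r) for r in rows]
    # Each row with mu_i > 0 gives up the same l1 mass theta; bisect on theta.
    lo, hi = 0.0, max(c[-1] for c in csums)
    while hi - lo > tol:
        theta = 0.5 * (lo + hi)
        total = sum(mu_for_theta(r, c, theta) for r, c in zip(rows, csums))
        if total > C:
            lo = theta                   # need to remove more mass
        else:
            hi = theta
    mu = np.array([mu_for_theta(r, c, 0.5 * (lo + hi)) for r, c in zip(rows, csums)])
    return np.sign(A) * np.minimum(S, mu[:, None])
```

A quick sanity check after calling project_l1_inf(A, C): whenever A starts outside the ball, the sum of the row maxima of the result should equal C up to the tolerance.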


Synthetic Experiments

Generate a jointly sparse parameter matrix W:

For every task we generate labeled training pairs (x, y) using the task's parameters in W.

We compared three different types of regularization: l1,∞ projection, l1 projection, and l2 projection.


Synthetic Experiments

[Figure: results of the synthetic experiments.]

Dataset: Image Annotation

40 top content words. Raw image representation: Vocabulary Tree (Nister and Stewenius 2006), 11,000 dimensions.

[Figure: example images for the content words president, actress, and team.]

Results

[Figure: annotation performance under the three regularizers.] Most of the differences are statistically significant.

Results

[Figure: further results on the image annotation task.]

Dataset: Indoor Scene Recognition

67 indoor scenes. Raw image representation: similarities to a set of unlabeled images, 2,000 dimensions.

[Figure: example images of the scene classes bakery, bar, and train station.]

Results

[Figure: results on the indoor scene recognition task.]

Conclusions

We proposed an efficient global optimization algorithm for l1,∞ regularization.

We presented experiments on image classification tasks and showed that our method can recover jointly sparse solutions.

A simple and efficient tool for implementing an l1,∞ penalty, analogous to the standard l1 and l2 penalties.