
Class 5: Attributes and Semantic Features

Rogerio Feris, Feb 21, 2013

EECS 6890 – Topics in Information Processing

Spring 2013, Columbia University

http://rogerioferis.com/VisualRecognitionAndSearch
































Thanks for sending the project proposals!

Project update presentations (10 min per group)

March 14

April 11

Details will be provided on the course website

Visual Recognition And Search Columbia University, Spring 2013

Project Report

Plan for Today

Introduction to Semantic Features

Attribute-based Classification and Search

Attributes for Fine-Grained Classification

Relative Attributes

Project Proposal Presentations


Use the scores of semantic classifiers as high-level features

Off-the-shelf Classifiers …

Score Score Score

Compact / powerful descriptor with semantic meaning (allows “explaining” the decision)

Semantic Features
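A minimal sketch of this idea, assuming each off-the-shelf concept classifier is a linear model followed by a sigmoid. The weights, the three concept names, and the `semantic_features` helper are hypothetical and only illustrate how classifier scores are concatenated into a descriptor.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def semantic_features(x, classifiers):
    """Concatenate the scores of concept classifiers into one descriptor.

    x: low-level feature vector of the input image, shape (d,)
    classifiers: dict mapping concept name -> (weights of shape (d,), bias)
    Returns (names, scores) with one score in (0, 1) per concept.
    """
    names = sorted(classifiers)
    scores = np.array(
        [sigmoid(classifiers[n][0] @ x + classifiers[n][1]) for n in names]
    )
    return names, scores

# Hypothetical hand-picked linear classifiers over a 3-D low-level feature.
classifiers = {
    "water": (np.array([2.0, 0.0, 0.0]), -1.0),
    "sand": (np.array([0.0, 2.0, 0.0]), -1.0),
    "sky": (np.array([0.0, 0.0, 2.0]), -1.0),
}
x = np.array([1.0, 0.0, 1.0])  # made-up low-level features of one image
names, scores = semantic_features(x, classifiers)
```

Because every dimension of `scores` carries a concept name, a downstream classifier (for example a "beach" classifier) can be explained in terms of which concept scores were high.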

Beach Classifier


Water Classifier, Sand Classifier, Sky Classifier

Input Image

Semantic Features

Illustration of early IBM work (multimedia community) describing this concept

[John Smith et al, Multimedia Semantic Indexing Using Model Vectors, ICME 2003]

Concatenation / Dimensionality Reduction

Semantic Features (Frame-Level)

System evolved into the IBM Multimedia Analysis and Retrieval System (IMARS)

Discriminative semantic basis [Rong Yan et al, Model-Shared Subspace Boosting for Multi-label Classification, KDD 2007]

Rapid event modeling, e.g., “accident with high-speed skidding”


Ensemble Learning

Semantic Features (Frame-level)

Descriptor is formed by concatenating the outputs of weakly trained classifiers called classemes (trained with noisy labels)

[L. Torresani et al, Efficient Object Category Recognition Using Classemes, ECCV 2010]

Images used to train the “table” classeme (from Google image search)

Noisy Labels


Classemes (Frame-level)

Compact and efficient descriptor, useful for large-scale classification

Features are not really semantic!


Classemes (Frame-level)

Object Bank [Li-Jia Li et al, Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification]

http://vision.stanford.edu/projects/objectbank/

State-of-the-art scene classification results (~7 seconds per image)


Semantic Features (Object Level)
























Describing

Naming

Bald

Beard

Red Shirt

Modifiers rather than (or in addition to) nouns

Semantic properties that are shared among objects

Attributes are category independent and transferrable



Semantic Attributes

Attribute-Based Search


Traditional Approaches: Face Recognition (“Naming”)

Face recognition is very challenging under lighting changes, pose variation, and low-resolution imagery (typical conditions in surveillance scenarios)

Attribute-based People Search (“Describing”)

[Vaquero et al, Attribute-based People Search in Surveillance Environments, WACV 2009]

Rather than relying on face recognition only, a complementary people search framework based on semantic attributes is provided

Query Example:

“Show me all bald people at the 42nd street station last month with dark skin, wearing sunglasses, wearing a red jacket”
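A query like the one above can be sketched as a filter over stored attribute-detector scores. This is only an illustration: the `Detection` record, the `search` helper, the 0.5 threshold, and ranking by the product of scores are assumptions, not the system described in [Vaquero et al, WACV 2009]. The time constraint ("last month") is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    track_id: int
    camera: str
    attributes: dict  # attribute name -> detector confidence in [0, 1]

def search(detections, required, camera=None, threshold=0.5):
    """Return detections whose score exceeds `threshold` for every required
    attribute, ranked by the product of those scores (highest first)."""
    hits = []
    for d in detections:
        if camera is not None and d.camera != camera:
            continue
        scores = [d.attributes.get(a, 0.0) for a in required]
        if all(s > threshold for s in scores):
            rank = 1.0
            for s in scores:
                rank *= s
            hits.append((rank, d))
    hits.sort(key=lambda h: h[0], reverse=True)
    return [d for _, d in hits]

# Made-up detector outputs for four tracked people.
detections = [
    Detection(1, "42nd street", {"bald": 0.9, "sunglasses": 0.8, "red jacket": 0.7}),
    Detection(2, "42nd street", {"bald": 0.2, "sunglasses": 0.9, "red jacket": 0.9}),
    Detection(3, "14th street", {"bald": 0.95, "sunglasses": 0.9, "red jacket": 0.9}),
    Detection(4, "42nd street", {"bald": 0.95, "sunglasses": 0.9, "red jacket": 0.9}),
]
results = search(detections, ["bald", "sunglasses", "red jacket"], camera="42nd street")
ids = [d.track_id for d in results]
```

Multiplying the scores treats the attributes as independent, which is a simplification; no training images of the target person are needed, only the textual attribute list.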


People Search in Surveillance Videos





People search based on textual descriptions: it does not require training images for the target suspect.

Robustness: attribute detectors are trained using many training images covering different lighting conditions, pose variation, etc.

Works well in low-resolution imagery (typical in video surveillance scenarios)



Modeling attribute correlations

[Siddiquie, Feris and Davis, “Image Ranking and Retrieval Based on Multi-Attribute Queries”, CVPR 2011]



Attribute-Based Classification


Recognition of Unseen Classes (Zero-Shot Learning)

[Lampert et al, Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, CVPR 2009]

1) Train semantic attribute classifiers

2) Obtain a classifier for an unseen object (no training samples) by just specifying which attributes it has
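The two steps above can be sketched as follows, assuming step 1 has already produced per-attribute probabilities for an image. The attribute names, the class signatures, and the `zero_shot_classify` helper are hypothetical; the scoring is a simplified variant that omits the attribute priors, so it should not be read as the exact model of [Lampert et al, CVPR 2009].

```python
import numpy as np

def zero_shot_classify(attr_probs, signatures):
    """Pick the unseen class whose attribute signature best matches the image.

    attr_probs: shape (M,), p(attribute m is present | image) from the
                trained attribute classifiers (step 1).
    signatures: dict class name -> binary array of shape (M,) stating which
                attributes the class has (step 2, specified by hand).
    """
    eps = 1e-9  # avoids log(0)
    scores = {}
    for name, sig in signatures.items():
        sig = np.asarray(sig)
        # Probability that each attribute takes the value the class requires.
        p = np.where(sig == 1, attr_probs, 1.0 - attr_probs)
        scores[name] = float(np.sum(np.log(p + eps)))
    return max(scores, key=scores.get), scores

# Attributes (hypothetical): striped, four-legged, lives in water.
signatures = {
    "zebra": [1, 1, 0],
    "dolphin": [0, 0, 1],
}
attr_probs = np.array([0.9, 0.8, 0.1])  # made-up classifier outputs
label, scores = zero_shot_classify(attr_probs, signatures)
```

Neither "zebra" nor "dolphin" needs any training image here; only the attribute classifiers are trained, and they can be trained on other categories.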


Attribute-based Classification


Unseen categories

Semantic Attribute Classifiers

Attribute-based classification

Unseen categories

Flat multi-class classification


Face verification [Kumar et al, ICCV 2009]

Bird Categorization [Farrell et al, ICCV 2011]

Animal Recognition [Lampert et al, CVPR 2009]

Person Re-identification [Layne et al, BMVC 2012]

Action recognition [Liu et al, CVPR 2011]

Many more! Significant growth in the past few years

Attribute-based Classification

Note: Several recent methods use the term “attributes” to refer to non-semantic model outputs

In this case attributes are just mid-level features, like PCA, hidden layers in neural nets, … (non-interpretable splits)




http://rogerioferis.com/VisualRecognitionAndSearch/Resources.html













































Attributes for Fine-Grained Categorization


Fine-Grained Categorization






Machines collaborating with humans to organize visual knowledge, connecting text to images, images to text, and images to images

Easy annotation interface for experts (powered by computer vision)

Visual Query: Fine-grained Bird Categorization

Picture credit: Serge Belongie


Fine-Grained Categorization

Visipedia (http://visipedia.org/)






















Is it an African or Indian Elephant?

(Image labels: African, Indian)

Example-based Fine-Grained Categorization is Hard!!

Slide Credit: Christoph Lampert



Is it an African or Indian Elephant?

Visual distinction of subordinate categories may be quite subtle, usually based on Parts and Attributes

African: Larger Ears

Indian: Smaller Ears


Standard classification methods may not be suitable because the variation between classes is small …

[B. Yao, CVPR 2012]

Codebook

… and intra-class variation is still high.



Humans rely on field guides!

Field guides usually refer to parts and attributes of the object

Slide Credit: Pietro Perona




Fine-Grained Categorization

[Branson et al, Visual Recognition with Humans in the Loop, ECCV 2010]

Computer vision reduces the amount of human interaction (minimizes the number of questions)
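One way to see why a vision prior reduces the number of questions is to sketch the loop as a Bayesian update plus a greedy choice of the most informative question. Everything below is an assumption for illustration: the `p_yes` answer model, the three-class prior, and the helper names are made up, and the expected-entropy criterion is a simplified stand-in for the procedure in [Branson et al, ECCV 2010].

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def update(posterior, p_yes, q, answer):
    """Bayes update of the class posterior after a yes/no answer to question q.
    p_yes[c, q] = probability that a user answers "yes" to q for class c."""
    like = p_yes[:, q] if answer else 1.0 - p_yes[:, q]
    post = posterior * like
    return post / post.sum()

def best_question(posterior, p_yes, asked=()):
    """Pick the unasked question with the largest expected entropy reduction."""
    h0 = entropy(posterior)
    best_q, best_gain = None, -1.0
    for q in range(p_yes.shape[1]):
        if q in asked:
            continue
        p_ans_yes = float(posterior @ p_yes[:, q])
        expected_h = 0.0
        for answer, p_ans in ((True, p_ans_yes), (False, 1.0 - p_ans_yes)):
            if p_ans > 0:
                expected_h += p_ans * entropy(update(posterior, p_yes, q, answer))
        gain = h0 - expected_h
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q

# Class posterior from computer vision alone (made-up numbers, 3 classes).
vision_prior = np.array([0.5, 0.4, 0.1])
# Question 0 separates classes 0 and 1; question 1 is uninformative.
p_yes = np.array([[0.9, 0.5],
                  [0.1, 0.5],
                  [0.5, 0.5]])
q = best_question(vision_prior, p_yes)
posterior_after = update(vision_prior, p_yes, q, answer=True)
```

The sharper the vision prior, the lower the starting entropy, so fewer answers are needed before one class dominates the posterior.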



[Wah et al, Multiclass Recognition and Part Localization with Humans in the Loop, ICCV 2011]

Localized part and attribute detectors.

Questions include asking the user to localize parts.



http://www.vision.caltech.edu/visipedia/CUB-200-2011.html























Video Demo: http://www.youtube.com/watch?v=_ReKVqnDXzA
















Like a normal field guide… that you can search and sort, and with visual recognition

See [N. Kumar et al, Leafsnap: A Computer Vision System for Automatic Plant Species Identification, ECCV 2012]

Nearly 1 million downloads

40k new users per month

100k active users

1.7 million images taken

100k new images/month

100k users with > 5 images

Users from all over the world

Botanists, educators, kids, hobbyists, photographers, …

Slide Credit: Neeraj Kumar


Check the fine-grained visual categorization workshop: http://www.fgvc.org/














Relative Attributes


[Parikh & Grauman, Relative Attributes, ICCV 2011]

Smiling ??? Not smiling

Natural ??? Not natural

Slide credit: Parikh & Grauman


Relative Attributes

For each attribute, e.g., “openness”

Supervision consists of:

Ordered pairs

Similar pairs



Learning Relative Attributes

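Learn a ranking function $r_m(x_i) = w_m^\top x_i$, where $x_i$ are the image features and $w_m$ the learned parameters,

that best satisfies the constraints:

$w_m^\top x_i > w_m^\top x_j$ for every ordered pair $(i, j)$ (image $i$ shows the attribute more strongly than image $j$)

$w_m^\top x_i = w_m^\top x_j$ for every similar pair $(i, j)$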




Max-margin learning to rank formulation
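For reference, the max-margin objective can be written out as below. The notation is an assumption restated from [Parikh & Grauman, ICCV 2011] rather than taken from the slide: $O_m$ and $S_m$ are the sets of ordered and similar pairs for attribute $m$, $x_i$ the image features, $w_m$ the learned parameters, $\xi_{ij}$ and $\gamma_{ij}$ slack variables, and $C$ a trade-off constant. Check the paper before relying on the details.

```latex
\begin{aligned}
\min_{w_m,\,\xi,\,\gamma}\quad
  & \tfrac{1}{2}\lVert w_m \rVert_2^2
    + C\Big(\sum_{(i,j)\in O_m}\xi_{ij}^2 + \sum_{(i,j)\in S_m}\gamma_{ij}^2\Big) \\
\text{s.t.}\quad
  & w_m^\top (x_i - x_j) \ge 1 - \xi_{ij}, \quad \forall (i,j)\in O_m \\
  & \lvert w_m^\top (x_i - x_j) \rvert \le \gamma_{ij}, \quad \forall (i,j)\in S_m \\
  & \xi_{ij} \ge 0,\quad \gamma_{ij} \ge 0
\end{aligned}
```

The first constraint asks ordered pairs to be separated by a margin of at least 1 in rank score, and the second asks similar pairs to receive nearly equal scores.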


Based on [Joachims 2002]
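A small numerical sketch of learning such a ranking function from ordered and similar pairs. It minimizes a squared-hinge penalty on ordered pairs and a squared-difference penalty on similar pairs with plain gradient descent; this solver, the `learn_ranker` name, the hyperparameters, and the toy data are assumptions for illustration, not the optimizer used by the authors.

```python
import numpy as np

def learn_ranker(X, ordered, similar, C=1.0, lr=0.01, epochs=500):
    """Learn w so that X @ w ranks images by attribute strength.

    X: (n, d) image features.
    ordered: list of (i, j), image i shows the attribute more strongly than j.
    similar: list of (i, j), images i and j show it to a similar degree.
    Minimizes 0.5*||w||^2 + C*(sum of squared hinge losses on ordered pairs
    + sum of squared score differences on similar pairs).
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = w.copy()  # gradient of the regularizer
        for i, j in ordered:
            diff = X[i] - X[j]
            slack = 1.0 - w @ diff
            if slack > 0:  # margin violated
                grad -= 2.0 * C * slack * diff
        for i, j in similar:
            diff = X[i] - X[j]
            grad += 2.0 * C * (w @ diff) * diff
        w -= lr * grad
    return w

# Toy data: feature 0 tracks the attribute, feature 1 is a nuisance dimension.
X = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [2.0, 1.0],
              [2.0, 0.0]])
ordered = [(1, 0), (2, 1), (3, 1)]
similar = [(2, 3)]
w = learn_ranker(X, ordered, similar)
scores = X @ w  # relative attribute score of each image
```

The learned `scores` place image 0 lowest, image 1 in the middle, and images 2 and 3 close together at the top, which is what the pairwise supervision asked for.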

Rank Margin

Image Relative Attribute Score




Each image is converted into a vector of relative attribute scores indicating the strength of each attribute

A Gaussian distribution for each category is built in the relative attribute space. The distribution of unseen categories is estimated based on the specified constraints and the distributions of seen categories

Max-likelihood is then used for classification

Blue: Seen class; Green: Unseen class
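The steps above can be sketched with synthetic data. The assumptions here are loud ones: the relative attribute scores are random toy values, the unseen class "B" is described as lying between seen classes "A" and "C" on every attribute, and its Gaussian is set to the midpoint of their means with the average of their covariances. That is one simple way to turn such a constraint into a distribution, and is not claimed to match every case handled in the paper.

```python
import numpy as np

def fit_gaussian(scores):
    """Mean and covariance of relative-attribute score vectors, shape (n, M)."""
    mean = scores.mean(axis=0)
    cov = np.cov(scores, rowvar=False) + 1e-6 * np.eye(scores.shape[1])
    return mean, cov

def log_likelihood(x, mean, cov):
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return float(-0.5 * (maha + logdet + len(x) * np.log(2 * np.pi)))

def classify(x, models):
    """Maximum-likelihood category for a relative-attribute score vector x."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

rng = np.random.default_rng(0)
# Toy relative-attribute scores (2 attributes) for two seen classes.
seen_a = rng.normal([0.0, 0.0], 0.3, size=(50, 2))
seen_c = rng.normal([4.0, 4.0], 0.3, size=(50, 2))
models = {"A": fit_gaussian(seen_a), "C": fit_gaussian(seen_c)}

# Unseen class B: "more than A and less than C" on both attributes.
mean_a, cov_a = models["A"]
mean_c, cov_c = models["C"]
models["B"] = ((mean_a + mean_c) / 2.0, (cov_a + cov_c) / 2.0)

pred = classify(np.array([2.1, 1.9]), models)
```

No image of class "B" is used; its distribution is inferred entirely from the relative description and the seen classes.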


Relative Zero-Shot Learning



Relative Image Description

Slide credit: Kristen Grauman


Whittle Search


http://pub.ist.ac.at/~chl/PnA2012/

http://rogerioferis.com/PartsAndAttributes/

















































Summary

Semantic attribute classifiers can be useful for:

Describing images of unknown objects [Farhadi et al, CVPR 2009]

Recognizing unseen classes [Lampert et al, CVPR 2009]

Reducing dataset bias (trained across classes)

Effective object search in surveillance videos [Vaquero et al, WACV 2009]

Compact descriptors / Efficient image retrieval [Douze et al, CVPR 2011]

Fine-grained object categorization [Wah et al, ICCV 2011]

Face verification [Kumar et al, 2009], Action recognition [Liu et al, CVPR 2011], Person re-identification [Layne et al, BMVC 2012] and other classification tasks.

Other applications, such as sentence generation from images [Kulkarni et al, CVPR 2011], image aesthetics prediction [Dhar et al, CVPR 2011], …

Extensive annotation may be required for attribute classifiers

Class-attribute relations may be automatically extracted from textual sources

[Rohrbach et al, What Helps Where – And Why? Semantic Relatedness for Knowledge Transfer, CVPR 2010]; [Berg et al, Automatic Attribute Discovery and Characterization from Noisy Web Data, ECCV 2008].

Semantic Attributes may not be discriminative

Various methods combine semantic attributes with “discriminative attributes” (non-semantic) for classification (e.g., [Farhadi et al, CVPR 2009]). Construction of nameable + discriminative attributes has also been proposed by [Parikh & Grauman, Interactively Building a Discriminative Vocabulary of Nameable Attributes, CVPR 2011]


Summary
