Class 5: Attributes and Semantic Features Rogerio Feris, Feb 21, 2013 EECS 6890 – Topics in Information Processing

Page 1

Class 5: Attributes and Semantic Features

Rogerio Feris, Feb 21, 2013
EECS 6890 – Topics in Information Processing

Spring 2013, Columbia University
http://rogerioferis.com/VisualRecognitionAndSearch

Page 2

Project Report

Thanks for sending the project proposals!

Project update presentations (10 min per group)

March 14

April 11

Details will be provided on the course website

Visual Recognition And Search Columbia University, Spring 2013

Page 3

Plan for Today

Introduction to Semantic Features

Attribute-based Classification and Search

Attributes for Fine-Grained Classification

Relative Attributes

Project Proposal Presentations

Page 4

Semantic Features

Use the scores of semantic classifiers as high-level features

[Figure: Input Image → Off-the-shelf Classifiers (Water Classifier, Sand Classifier, Sky Classifier, …) → Score, Score, Score → Semantic Features → Beach Classifier]

Compact / powerful descriptor with semantic meaning (allows “explaining” the decision)
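The idea of using classifier scores as features can be sketched in a few lines. This is a minimal illustration only: the linear classifiers, their random weights, the 8-dimensional input, and the function name `semantic_features` are all assumptions made for the example, not part of the lecture.

```python
import numpy as np

def semantic_features(image_vec, classifiers):
    """Stack the scores of pre-trained concept classifiers into one descriptor."""
    return np.array([w @ image_vec + b for (w, b) in classifiers])

# Hypothetical off-the-shelf linear classifiers (weights, bias), one per concept.
rng = np.random.default_rng(0)
concepts = ["water", "sand", "sky"]
classifiers = [(rng.normal(size=8), 0.0) for _ in concepts]

image_vec = rng.normal(size=8)  # stand-in for low-level image features
scores = semantic_features(image_vec, classifiers)

# A higher-level "beach" classifier operates on the 3-d semantic descriptor.
beach_w, beach_b = np.array([1.0, 1.0, 1.0]), 0.0
beach_score = float(beach_w @ scores + beach_b)
```

Because each dimension of `scores` corresponds to a named concept, the weights of the final classifier can be read as a rough explanation of its decision.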

Page 5

Semantic Features (Frame-Level)

Illustration of early IBM work (multimedia community) describing this concept

[John Smith et al, Multimedia Semantic Indexing Using Model Vectors, ICME 2003]

Concatenation / Dimensionality Reduction

Page 6

Semantic Features (Frame-level)

System evolved to the IBM Multimedia Analysis and Retrieval System (IMARS)

[Rong Yan et al, Model-Shared Subspace Boosting for Multi-label Classification, KDD 2007]

Discriminative semantic basis

Rapid event modeling, e.g., “accident with high-speed skidding”

Ensemble Learning

Page 7

Classemes (Frame-level)

Descriptor is formed by concatenating the outputs of weakly trained classifiers called classemes (trained with noisy labels)

[L. Torresani et al, Efficient Object Category Recognition Using Classemes, ECCV 2010]

Images used to train the “table” classeme (from Google image search)

Noisy Labels

Page 8

Classemes (Frame-level)

Compact and efficient descriptor, useful for large-scale classification

Features are not really semantic!

Page 10

Semantic Attributes

[Figure labels: Naming: ? / Describing: Bald, Beard, Red Shirt]

Modifiers rather than (or in addition to) nouns

Semantic properties that are shared among objects

Attributes are category independent and transferable

Page 11

Attribute-Based Search

Page 12

People Search in Surveillance Videos

Traditional Approaches: Face Recognition (“Naming”)

Face recognition is very challenging under lighting changes, pose variation, and low-resolution imagery (typical conditions in surveillance scenarios)

Attribute-based People Search (“Describing”)

[Vaquero et al, Attribute-based People Search in Surveillance Environments, WACV 2009]

Rather than relying on face recognition only, a complementary people search framework based on semantic attributes is provided

Query Example:

“Show me all bald people at the 42nd street station last month with dark skin, wearing sunglasses, wearing a red jacket”
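A query like the one above can be thought of as a filter over stored attribute-detector outputs. The sketch below is only an illustration under assumed names: the `Detection` record, the `search` function, the 0.5 threshold, and the toy data are hypothetical and do not describe the cited system.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    track_id: int
    camera: str
    attributes: dict  # attribute name -> detector confidence in [0, 1]

def search(detections, required, camera=None, threshold=0.5):
    """Return detections whose required attributes all score above threshold, best first."""
    hits = []
    for d in detections:
        if camera is not None and d.camera != camera:
            continue
        scores = [d.attributes.get(a, 0.0) for a in required]
        if all(s >= threshold for s in scores):
            # Rank by the weakest required attribute.
            hits.append((min(scores, default=1.0), d))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in hits]

detections = [
    Detection(1, "42nd street", {"bald": 0.9, "sunglasses": 0.8, "red jacket": 0.7}),
    Detection(2, "42nd street", {"bald": 0.2, "sunglasses": 0.9, "red jacket": 0.9}),
    Detection(3, "14th street", {"bald": 0.95, "sunglasses": 0.9, "red jacket": 0.9}),
]
results = search(detections, ["bald", "sunglasses", "red jacket"], camera="42nd street")
```

No image of the target person is needed: the query is expressed purely in terms of attribute names.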

Page 13

People Search in Surveillance Videos

Page 14

People Search in Surveillance Videos

Page 15

People Search in Surveillance Videos

People search based on textual descriptions: it does not require training images for the target suspect.

Robustness: attribute detectors are trained using lots of training images covering different lighting conditions, pose variation, etc.

Works well in low-resolution imagery (typical in video surveillance scenarios)

Page 16

People Search in Surveillance Videos

Modeling attribute correlations

[Siddiquie, Feris and Davis, “Image Ranking and Retrieval Based on Multi-Attribute Queries”, CVPR 2011]

Page 17

Attribute-Based Classification

Page 18

Attribute-based Classification

Recognition of Unseen Classes (Zero-Shot Learning)

[Lampert et al, Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, CVPR 2009]

1) Train semantic attribute classifiers

2) Obtain a classifier for an unseen object (no training samples) by just specifying which attributes it has
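The two steps can be illustrated with a small sketch. It assumes step 1 has already produced per-attribute probabilities for an image, and it scores each unseen class by how well its hand-specified binary attribute signature matches them. This is a simplified stand-in in the spirit of the cited paper, not its exact model (for instance, attribute priors are ignored); the function name, attribute list, and numbers are hypothetical.

```python
import numpy as np

def zero_shot_classify(attr_probs, class_signatures):
    """Pick the class whose binary attribute signature best matches the predictions.

    attr_probs: array (n_attributes,), P(attribute present | image)
    class_signatures: dict class name -> 0/1 array (n_attributes,)
    """
    eps = 1e-9
    best_name, best_ll = None, -np.inf
    for name, sig in class_signatures.items():
        # Log-likelihood of the signature, treating attributes as independent.
        p = np.where(sig == 1, attr_probs, 1.0 - attr_probs)
        ll = float(np.sum(np.log(p + eps)))
        if ll > best_ll:
            best_name, best_ll = name, ll
    return best_name

# Attributes: [striped, black, white, four-legged]; signatures are specified by hand.
signatures = {
    "zebra": np.array([1, 1, 1, 1]),
    "polar bear": np.array([0, 0, 1, 1]),
}
attr_probs = np.array([0.9, 0.8, 0.7, 0.95])  # hypothetical attribute classifier outputs
predicted = zero_shot_classify(attr_probs, signatures)
```

Adding a new class only requires a new row in `signatures`; no training images of that class are involved.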

Page 19

Attribute-based Classification

[Figure labels: Flat multi-class classification; Attribute-based classification; Semantic Attribute Classifiers; Unseen categories]

Page 20

Attribute-based Classification

Face verification [Kumar et al, ICCV 2009]

Bird Categorization [Farrell et al, ICCV 2011]

Animal Recognition [Lampert et al, CVPR 2009]

Person Re-identification [Layne et al, BMVC 2012]

Action recognition [Liu et al, CVPR 2011]

Many more! Significant growth in the past few years

Page 21

Attribute-based Classification

Note: Several recent methods use the term “attributes” to refer to non-semantic model outputs

In this case attributes are just mid-level features, like PCA, hidden layers in neural nets, … (non-interpretable splits)

Page 22

Attribute-based Classification

http://rogerioferis.com/VisualRecognitionAndSearch/Resources.html

Page 23

Attributes for Fine-Grained Categorization

Page 24

Fine-Grained Categorization

Page 25

Fine-Grained Categorization

Page 26

Fine-Grained Categorization

Page 27

Fine-Grained Categorization

Visipedia (http://visipedia.org/)

Machines collaborating with humans to organize visual knowledge, connecting text to images, images to text, and images to images

Easy annotation interface for experts (powered by computer vision)

Visual Query: Fine-grained Bird Categorization

Picture credit: Serge Belongie

Page 28

Fine-Grained Categorization

Is it an African or Indian Elephant?

[Figure labels: African; Indian]

Example-based Fine-Grained Categorization is Hard!!

Slide Credit: Christoph Lampert

Page 29

Fine-Grained Categorization

Is it an African or Indian Elephant?

[Figure labels: African (Larger Ears); Indian (Smaller Ears)]

Visual distinction of subordinate categories may be quite subtle, usually based on Parts and Attributes

Page 30

Fine-Grained Categorization

Standard classification methods may not be suitable because the variation between classes is small …

[B. Yao, CVPR 2012]

[Figure label: Codebook]

… and intra-class variation is still high.

Page 31

Fine-Grained Categorization

Humans rely on field guides!

Field guides usually refer to parts and attributes of the object

Slide Credit: Pietro Perona

Page 32

Fine-Grained Categorization

[Branson et al, Visual Recognition with Humans in the Loop, ECCV 2010]

Page 33

Fine-Grained Categorization

[Branson et al, Visual Recognition with Humans in the Loop, ECCV 2010]

Computer vision reduces the amount of human interaction (minimizes the number of questions)
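One way to make "minimizing the number of questions" concrete is to always ask the yes/no attribute question with the largest expected information gain, starting from the class posterior that computer vision provides. The sketch below illustrates that criterion only; it is not claimed to reproduce the cited system, and the function names, probabilities, and three-class setup are assumptions.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def best_question(class_post, attr_given_class):
    """Index of the yes/no question with the largest expected information gain.

    class_post: (n_classes,) current posterior over classes (e.g. from computer vision)
    attr_given_class: (n_classes, n_attrs), P(answer is yes | class)
    """
    h0 = entropy(class_post)
    gains = []
    for a in range(attr_given_class.shape[1]):
        p_yes_c = attr_given_class[:, a]
        p_yes = float(class_post @ p_yes_c)
        expected_h = 0.0
        # Average the posterior entropy over the two possible answers.
        for p_ans, lik in ((p_yes, p_yes_c), (1.0 - p_yes, 1.0 - p_yes_c)):
            if p_ans > 0:
                post = class_post * lik / p_ans
                expected_h += p_ans * entropy(post)
        gains.append(h0 - expected_h)
    return int(np.argmax(gains)), gains

# Vision already makes class 2 unlikely, so the useful question separates classes 0 and 1.
class_post = np.array([0.45, 0.45, 0.10])
attr_given_class = np.array([
    [0.99, 0.01],  # class 0
    [0.01, 0.01],  # class 1
    [0.50, 0.99],  # class 2
])
question, gains = best_question(class_post, attr_given_class)
```

A sharper vision posterior leaves less entropy to remove, which is how recognition reduces the number of questions a user has to answer.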

Page 34

Fine-Grained Categorization

[Wah et al, Multiclass Recognition and Part Localization with Humans in the Loop, ICCV 2011]

Localized part and attribute detectors.

Questions include asking the user to localize parts.

Page 37

Like a normal field guide…

that you can search and sort

and with visual recognition

See N. Kumar et al, Leafsnap: A Computer Vision System for Automatic Plant Species Identification, ECCV 2012

Page 38

Nearly 1 million downloads
40k new users per month
100k active users

1.7 million images taken
100k new images/month
100k users with > 5 images

Users from all over the world

Botanists, educators, kids, hobbyists, photographers, …

Slide Credit: Neeraj Kumar

Page 39

Fine-Grained Categorization

Check the fine-grained visual categorization workshop: http://www.fgvc.org/

Page 40

Relative Attributes

Page 41

Relative Attributes

[Parikh & Grauman, Relative Attributes, ICCV 2011]

Smiling ??? Not smiling

Natural ??? Not natural

Slide credit: Parikh & Grauman

Page 42

Learning Relative Attributes

For each attribute, e.g., “openness”

Supervision consists of:

Ordered pairs

Similar pairs

Slide credit: Parikh & Grauman

Page 43

Learning Relative Attributes

Learn a ranking function

[Equation labels: Image features; Learned parameters]

that best satisfies the constraints:

Slide credit: Parikh & Grauman
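For reference, a linear ranking function and its constraints can be written as below. The notation is assumed, following the cited Relative Attributes paper rather than text recovered from the slide: $x_i$ are the image features, $w_m$ the learned parameters for attribute $m$, $O_m$ the set of ordered pairs, and $S_m$ the set of similar pairs.

```latex
\begin{align}
  r_m(x_i) &= w_m^{\top} x_i \\
  \forall (i,j) \in O_m &: \quad w_m^{\top} x_i > w_m^{\top} x_j \\
  \forall (i,j) \in S_m &: \quad w_m^{\top} x_i = w_m^{\top} x_j
\end{align}
```

An ordered pair $(i,j)$ says image $i$ shows more of the attribute than image $j$; a similar pair says the two show it to about the same degree.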

Page 44

Learning Relative Attributes

Max-margin learning to rank formulation

Based on [Joachims 2002]

[Figure: six points numbered 1 to 6; labels: Rank Margin; Image; Relative Attribute Score]

Slide credit: Parikh & Grauman
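A max-margin ranker of this kind can be sketched with plain gradient descent. The version below is a simplification under stated assumptions: it uses squared slack penalties on the ordered and similar pairs and a fixed step size, and the function name, hyperparameters, and toy data are hypothetical rather than taken from the cited work.

```python
import numpy as np

def learn_ranker(X, ordered, similar, C=1.0, lr=0.01, epochs=500):
    """Learn w so that w @ X[i] > w @ X[j] for (i, j) in ordered and
    w @ X[i] stays close to w @ X[j] for (i, j) in similar."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = w.copy()  # gradient of 0.5 * ||w||^2 (margin term)
        for i, j in ordered:
            d = X[i] - X[j]
            slack = 1.0 - w @ d  # positive when the unit margin is violated
            if slack > 0:
                grad -= 2.0 * C * slack * d  # gradient of C * slack^2
        for i, j in similar:
            d = X[i] - X[j]
            grad += 2.0 * C * (w @ d) * d  # gradient of C * (w @ d)^2
        w -= lr * grad
    return w

# Toy data: the first feature dimension mostly carries the attribute strength.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [2.1, 0.0]])
ordered = [(1, 0), (2, 1), (3, 1)]  # image i shows MORE of the attribute than image j
similar = [(2, 3)]                  # roughly equal attribute strength
w = learn_ranker(X, ordered, similar)
scores = X @ w  # relative attribute score of each image
```

The learned `w` projects every image to a single relative attribute score, so images can be compared even if they never appeared in a training pair.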

Page 45

Relative Zero-Shot Learning

Each image is converted into a vector of relative attribute scores indicating the strength of each attribute

A Gaussian distribution for each category is built in the relative attribute space. The distribution of unseen categories is estimated based on the specified constraints and the distributions of seen categories

Maximum likelihood is then used for classification

[Figure legend: Blue: Seen class; Green: Unseen class]
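The procedure can be illustrated with toy data. In this sketch the unseen class is described only as lying between two seen classes on every attribute; placing its mean midway and reusing the average seen covariance are assumptions made for the example, as are the class names and numbers.

```python
import numpy as np

def fit_gaussian(scores):
    """Mean and covariance of one class in relative-attribute space."""
    return scores.mean(axis=0), np.cov(scores, rowvar=False)

def log_likelihood(x, mean, cov):
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ np.linalg.solve(cov, d) + logdet + len(x) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
# Two seen classes, each image described by 2 relative attribute scores (toy data).
seen = {
    "A": rng.normal([0.0, 0.0], 0.3, size=(50, 2)),
    "B": rng.normal([4.0, 4.0], 0.3, size=(50, 2)),
}
models = {name: fit_gaussian(s) for name, s in seen.items()}

# Unseen class U is specified only relative to seen ones: "A < U < B" on both attributes.
# Assumption: put its mean midway and reuse the average covariance of the seen classes.
mean_u = 0.5 * (models["A"][0] + models["B"][0])
cov_u = 0.5 * (models["A"][1] + models["B"][1])
models["U"] = (mean_u, cov_u)

def classify(x):
    """Maximum-likelihood class for a vector of relative attribute scores."""
    return max(models, key=lambda name: log_likelihood(x, *models[name]))

label = classify(np.array([2.1, 1.9]))
```

No image of class U is used anywhere; its Gaussian comes entirely from the relative description and the seen-class statistics.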

Page 46

Relative Image Description

Slide credit: Parikh & Grauman

Page 47

Whittle Search

Slide credit: Kristen Grauman

Page 48

http://pub.ist.ac.at/~chl/PnA2012/

http://rogerioferis.com/PartsAndAttributes/

Page 49

Summary

Semantic attribute classifiers can be useful for:

Describing images of unknown objects [Farhadi et al, CVPR 2009]

Recognizing unseen classes [Lampert et al, CVPR 2009]

Reducing dataset bias (trained across classes)

Effective object search in surveillance videos [Vaquero et al, WACV 2009]

Compact descriptors / Efficient image retrieval [Douze et al, CVPR 2011]

Fine-grained object categorization [Wah et al, ICCV 2011]

Face verification [Kumar et al, ICCV 2009], Action recognition [Liu et al, CVPR 2011], Person re-identification [Layne et al, BMVC 2012] and other classification tasks.

Other applications, such as sentence generation from images [Kulkarni et al, CVPR 2011], image aesthetics prediction [Dhar et al, CVPR 2011], …

Page 50

Summary

Extensive annotation may be required for attribute classifiers

Class-attribute relations may be automatically extracted from textual sources

[Rohrbach et al, What Helps Where – And Why? Semantic Relatedness for Knowledge Transfer, CVPR 2010]; [Berg et al, Automatic Attribute Discovery and Characterization from Noisy Web Data, ECCV 2010].

Semantic attributes may not be discriminative

Various methods combine semantic attributes with “discriminative attributes” (non-semantic) for classification (e.g., [Farhadi et al, CVPR 2009]). Construction of nameable + discriminative attributes has also been proposed by [Parikh & Grauman, Interactively Building a Discriminative Vocabulary of Nameable Attributes, CVPR 2011]