Jitendra Malik University of California at...

Visual Recognition:

Prospects for Image & Video Analytics

Jitendra Malik

University of California at Berkeley

Computer Vision Group

UC Berkeley

Classification & Segmentation

Tiger Grass

outdoor

wildlife

shadow

PASCAL Visual Object Challenge

We want to locate the object

Orig. Image Segmentation Orig. Image Segmentation

UC Berkeley

Fifty years of computer vision 1963-2013

• 1960s: Beginnings in artificial intelligence, image processing and pattern recognition

• 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins …

• 1980s: Vision as applied mathematics: geometry, multi-scale analysis, probabilistic modeling, control theory, optimization

• 1990s: Geometric analysis largely completed, vision meets graphics, statistical learning approaches resurface

• 2000s: Significant advances in visual recognition, range of practical applications

Computer Vision Group University of California

Berkeley

Handwritten digit recognition

(MNIST,USPS)

• LeCun’s Convolutional Neural Networks variations (0.8%, 0.6% and 0.4% on MNIST)

• Tangent Distance(Simard, LeCun & Denker: 2.5% on USPS)

• Randomized Decision Trees (Amit, Geman & Wilder, 0.8%)

• K-NN based Shape context/TPS matching (Belongie, Malik & Puzicha: 0.6% on MNIST)

UC Berkeley

EZ-Gimpy Results (Mori & Malik, 2003)

• 171 of 192 images correctly identified: 92 %

canvas

Results on various images submitted to the CMU on-line face detector http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi

Face Detection Carnegie Mellon University

Multiscale sliding window

Ask this question repeatedly, varying position, scale, category…

Paradigm introduced by Rowley, Baluja & Kanade 96 for face detection Viola & Jones 01, Dalal & Triggs 05, Felzenszwalb, McAllester, Ramanan 08

UC Berkeley

Caltech-101 [Fei-Fei et al. 04]

• 102 classes, 31-300 images/class

Caltech 101 classification results

(even better by combining cues..)

PASCAL Visual Object Challenge

Trying to find stick figures is

hard (and unnecessary!)

Generalized Cylinders (Binford, Marr & Nishihara)

Geons (Biederman)

Person detection is challenging

Can we build upon the success

of faces and pedestrians?

Pattern matching

Capture patterns that are common and visually characteristic

Are these the only two common and characteristic patterns?

Rowley, Baluja, Kanade CVPR96

Viola and Jones, IJCV01

Dalal and Triggs, CVPR05

Poselets

We will train classifiers for these different visual patterns

Segmenting people

[Bourdev, Maji, Brox and Malik, ECCV10]

Best person segmentation on PASCAL 2010 dataset

“A person with

long pants”

“A man with short

hair and long sleeves”

“A man with short

hair, glasses, short

sleeves and shorts”

“A woman with long hair,

glasses and long pants”(??)

Describing people

Male or female?

Gender classifier per poselet is

much easier to train

Is male

Has long hair

Wears long pants

Wears a hat

Wears long sleeves

Wears glasses

Actions in still images …

have characteristic :

pose and appearance

interaction with objects and agents

Some discriminative poselets

Problem: Human Activity Recognition

12/20/2011 SMARTS Annual Review 2011

Mean Performance: 59.7% correct

Approach: Learn pose and appearance specific for an action

Results : Top Confusions

Low-Cost Automated Tuberculosis Diagnostics Using Mobile Microscopy Jeannette Chang1, Pablo Arbelaez1, Neil Switz2, Clay Reber2, Asa Tapley2,3

Lucian Davis3, Adithya Cattamanchi3, Daniel Fletcher2, and Jitendra Malik1

Department of Electrical Engineering and Computer Science, UC Berkeley1

Department of Bioengineering, UC Berkeley2

Medical School and San Francisco General Hospital, UC San Francisco3

Why Tuberculosis? Mortality and Treatment1

TB is second leading cause of deaths from infectious disease worldwide (after HIV/AIDS)

Highly effective antibiotic treatment

Current Diagnostics

Technicians screen microscopic images of sputum smears manually

Other methods include culture and PCR

Tremendous potential benefit from automated processing or classification

1. http://www.who.int/tb/publications/global_report/2011/gtbr11_full.pdf

2. http://www.thehindu.com/health/rx/article21138.ece

Examples of sputum smears with TB bacteria. Brightfield (top) and fluorescent (bottom) microscopy.2

Candidate TB Blob

Identification

Feature Extraction

Linear SVM Classification

Input image from

CellScope device

Each candidate TB object is

characterized by a feature vector

containing 8 Hu moment invariants

and 14 geometric/photometric

descriptors.

𝑥1⋮𝑥𝑁

Candidate TB objects sorted by their

SVM output confidence scores in

decreasing order (row-wise, from top

to bottom)

Bar plot with SVM output confidence

scores corresponding to sorted candidate

TB objects

Array of

candidate

TB objects

0 20 40 60 80 1000

Candidate Object Index

Confidence S

Sample subset of candidate TB objects with

corresponding confidence scores

0.918 0.885 0.389 0.374 0.008 0.002 0.001 0.000

Sample positive objects Sample negative objects

Sample Candidate Objects

Patches in Descending Order of Confidence

0.000 0.200 0.400 0.600 0.800 1.000

MinIntensity

Perimeter

FilledArea

MaxIntensity

EulerNumber

Extent

ConvexArea

Solidity

MajorAxisLength

EquivDiameter

MinorAxisLength

Eccentricity

MeanIntensity

Features listed in descending order of

normalized SVM weights.

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

Sensitivity (Recall)

Specific

SS/RP curves, Avg spec: 0.96744, Avg prec: 0.95389 cost exp: 7

train-SS

train-RP

test-SS

test-RP

Object-Level Performance (Uganda Data)

Slide-Level Performance (Uganda Data)

Jitendra Malik University of California at...

Documents

A Database of Human Segmented Natural Images and Two Applications David Martin, Charless Fowlkes, Doron Tal, Jitendra Malik UC Berkeley {dmartin,fowlkes,doron,malik}@eecs.berkeley.edu

1052 IEEE TRANSACTIONS ON PATTERN ANALYSIS …malik/papers/mori... · · 2006-11-02Recovering 3D Human Body Configurations Using Shape Contexts Greg Mori and Jitendra Malik,Senior

Computational Vision Jitendra Malik University of California at Berkeley Jitendra Malik University of California at Berkeley

Professor Jitendra Malik - People @ EECS at UC …people.eecs.berkeley.edu/~malik/malik-cv-full.pdfChetan Nandakumar, Invariance in Human Visual Perception,2011 (Unifyre) Subhransu

Learning Category-Specific Mesh Reconstruction from Image ... · Angjoo Kanazawa ∗, Shubham Tulsiani , Alexei A. Efros, Jitendra Malik University of California, Berkeley {kanazawa,shubhtuls,efros,malik}@eecs.berkeley.edu

Normalized cuts and image segmentation - Pattern Analysis ...people.eecs.berkeley.edu/~malik/papers/SM-ncut.pdf · Normalized Cuts and Image Segmentation Jianbo Shi and Jitendra Malik,

Jitendra!Malik!–!EECSDepartment!Chair! Jan!Rabaey!–!EE ... · Prof.&Malik&is&Arthur&J.&Chick&Professor&in&theDepartment&of&Electrical&Engineering&and& ... “Deep!Learning!for!Robotics

Contour and Texture Analysis for Image Segmentationpeople.eecs.berkeley.edu/~malik/papers/MalikBLS.pdf · Contour and Texture Analysis for Image Segmentation JITENDRA MALIK, SERGE

Visual Grouping and Recognition Jitendra Malik University of California at Berkeley Jitendra Malik University of California at Berkeley

Finding and exploiting correspondences in Drosophila embryos Charless Fowlkes and Jitendra Malik UC Berkeley Computer Science

Visual Recognition: The Big Picture Jitendra Malik University of California at Berkeley

NEA-IndoUS Ventures The spirit of Entrepreneurship

The Three R’s of Vision Jitendra Malik. The Three R’s of Vision

Detection, Segmentation and Fine-grained Localization Bharath Hariharan, Pablo Arbeláez, Ross Girshick and Jitendra Malik UC Berkeley

Segmentation using eigenvectors Papers: “Normalized Cuts and Image Segmentation”. Jianbo Shi and Jitendra Malik, IEEE, 2000 “Segmentation using eigenvectors:

Radiometry of Image Formation Jitendra Malik. A camera creates an image The image I(x,y) measures how…

Fundamentals of Image Formation Lecture 2 Jitendra Malik

Normalized Cuts and Image Segmen b o Shi and Jitendra Malik · b o Shi and Jitendra Malik Abstract W e prop ose a no v el approac h for solving the p erceptual grouping problem in

Front-end computations in human vision Jitendra Malik U.C. Berkeley References: DeValois & DeValois,Hubel, Palmer, Spillman &Werner, Wandell Jitendra Malik

Classification using intersection kernel SVMs is efficient Joint work with Subhransu Maji and Alex Berg (CVPR08) Jitendra Malik UC Berkeley