The Promise and Perils of Benchmark Datasets and Challenges

David Forsyth, Alyosha Efros, Fei-Fei Li, Antonio Torralba and Andrew Zisserman

So many datasets …

• Cover many areas of Computer Vision

• Tremendous growth both in number of datasets and size of datasets over the last decade

• Datasets drive and enable research and success

• The Tyranny of datasets

[Figure: number of categories (1, 4, 20, 101, all) plotted against time (1970, 1990, 2000, 2010). Datasets marked: COIL-20 (1996), UIUC (2002), Caltech-4 (2003), PASCAL (2007), 80 million images.]

Feret

Caltech 101

Middlebury Stereo Datasets

Berkeley Segmentation Data Set 500

Large scale instance retrieval

Oxford Buildings Dataset

INRIA Holidays Dataset

The Indoor Scene Dataset

• 67 indoor categories

• 15620 images

• At least 100 images per category

• Training 67 x 80 images

• Testing 67 x 20 images

• A. Quattoni and A. Torralba. Recognizing Indoor Scenes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
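The 67 x 80 / 67 x 20 split above amounts to drawing a fixed number of training and test images from every category. Below is a minimal sketch of such a per-category split. It assumes a random draw with a fixed seed rather than the file lists distributed with the dataset, and the function name `split_per_category` and the synthetic file names are hypothetical, chosen only for illustration.

```python
import random


def split_per_category(images_by_category, n_train=80, n_test=20, seed=0):
    """Draw n_train training and n_test test images from every category."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for category in sorted(images_by_category):
        images = list(images_by_category[category])
        if len(images) < n_train + n_test:
            raise ValueError(f"{category!r} has only {len(images)} images")
        rng.shuffle(images)
        # The first n_train shuffled images go to training, the next n_test to testing.
        train += [(path, category) for path in images[:n_train]]
        test += [(path, category) for path in images[n_train:n_train + n_test]]
    return train, test


# Synthetic stand-in for the real file lists (assumption): 67 categories, 100 images each.
toy = {
    f"category_{i:02d}": [f"category_{i:02d}/img_{j:03d}.jpg" for j in range(100)]
    for i in range(67)
}
train, test = split_per_category(toy)
print(len(train), len(test))  # 5360 1340
```

Sampling per category keeps the class balance of both sets fixed regardless of how many images each category holds, which is why the slide's guarantee of at least 100 images per category matters.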

Caltech Pedestrian Dataset

• 350,000 labeled pedestrian bounding boxes

• 250,000 frames

Fine grained visual categorization

Caltech-UCSD Birds 200

The Oxford Flowers 102

Material recognition

Exploring Features in a Bayesian Framework for Material Recognition

Ce Liu, Lavanya Sharan, Edward H. Adelson, and Ruth Rosenholtz

CVPR 2010

Person layout

Oxford Buffy Stickmen

276 frames x 6 = 1656 body parts (sticks)

PASCAL VOC “Person Layout”

Berkeley H3D

ETHZ Pascal stickmen set

549 images x 6 = 3294 body parts (sticks)

Human action recognition

Hollywood2 dataset

Goals of this session

• Tease out what it is about datasets that makes them useful

• Recommendations on how to move forwards in designing and selecting new datasets

Program

• Three examples of successful datasets and challenges

1. LabelMe, 80M tiny – Antonio

2. PASCAL – AZ

3. ImageNet – Fei-Fei

• Perils & Promise – Alyosha & Antonio

• Promise – David Forsyth

• Discussion