25
Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Embed Size (px)

Citation preview

Page 1: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Structural Human Action Recognition from Still Images

Moin Nabi

Computer Vision Lab.

©IPM - Oct. 2010

Page 2: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Problem Definition

How can we recognize human action from a single Image?

?

Page 3: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Problem Definition

Pose as a Latent Valiable

Page 4: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Application

• News/sports image retrieval and analysis• An important cue for video-based action

recognition

Page 5: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Previous Works• Global template-based representation

HOG by Dalal and Triggs. And , Ikizler-Cinbis et al. ICCV09

5

Action Label

Action Label

• Pose estimation -> action recognitione.g. Ramanan and Forsyth NIPS03, Ferrari et al. CVPR09

Page 6: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Our Work

• Examplar based representation

Using Poselet as a new definition of a part

6

• Pose estimation + action recognition

Page 7: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Discriminative Pose

Golfing?

Walking?

• All elements of pose are not equally important• Develop integrated learning framework to

estimate pose for action recognition

Page 8: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Pose Representation

8

• We use a coarse non-parametric pose representation– An action-specific variant of the poselet[Bourdev&Malik ICCV09]

• A poselet is a set of patches not only with similar pose configuration, but also from the same action class.

Page 9: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Poselets

• Poselets obtained by clustering ground-truth joint positions of body parts for each action

Page 10: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

PoseletsVisualization of the Poselets for Running images

For Every Action Class:

1. Devide annotation to 4 parts2. Cluster on normalized x,y3. Remove small clusters4. Crop that part of image

Learn SVM for every Poselet with HoG+: From that action -:same part from other action

5 (Actions) x 4 (Parts) x 5 (Clusters) = 100 – 10 (Remove) = 90 = 26 (leg) + 20 (L-arm) + 20 (R-arm) + 24 (Upper body)

Page 11: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Model Formation

⌂ Using Pictorial Structure Model of Pedro Felzenswalb

Training: Test:

Page 12: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Model Formation• Develop a scoring function– Should have high score for correct action label– Low score for other action labels– Model parameters

Page 13: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Model Formation

13

Pose

Action Label

Image

Choose best pose L

Page 14: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Model Formation

14

Pose

Action Label

Image

Running

Large score for

Page 15: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Model Formation

15

Pose

Action Label

Image

Sitting

Small score for

Page 16: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Model Formulation

Pairwise Relation

Part Appearance

Page 17: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

17

Part Appearance Potential

Pose

Action Label

Image

Poselet matching

Page 18: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

18

Pairwise Potential

Pose

Action Label

Image

Relative body part locations

Page 19: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

19

Full Model

Pose

Action Label

Image

Model parameters learned using max-margin

Page 20: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Learning and Inference

Page 21: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Latent SVM

We Should Minimize Loss Function !

Page 22: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Latent SVM

Page 23: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Results

23

• Still image action dataset (Internet Image)

– Five action categories– 2458 images total– Train using 1/3 of images from each category

Page 24: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Visualization of latent pose

24

Successful classification examples

Unsuccessful classification examples

Page 25: Structural Human Action Recognition from Still Images Moin Nabi Computer Vision Lab. ©IPM - Oct. 2010

Any question

?