【ECCV 2016 BNMW】Human Action Recognition without Human

Human Action Recognition without Human

He Yun1,2, Soma Shirakabe1,2, Yutaka Satoh1,2, Hirokatsu Kataoka1

1Computer Vision Research Group, AIST, Japan 2Human-Centered Vision Lab., University of Tsukuba, Japan

Motion representation

•  Database: UCF101, HMDB51, ActivityNet

•  Approach: IDT, Two-Stream CNN

–  Databases and approaches are already well established in the field

Action Database

http://www.thumos.info/

The problem setting in action recognition

•  Video-level prediction

–  1 action-label prediction per input video

TennisSwing

Motion Descriptor

Dense Trajectories (DT) [Wang+, CVPR11]

•  Trajectory-based representation

–  A large number of trajectories

–  Feature description (HOG, HOF, MBH)

–  Codeword vector is generated
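
The last step above (turning many local descriptors into one codeword vector per video) can be sketched as a bag-of-features histogram. This is a minimal illustration, assuming hard assignment to a k-means-style codebook; the codebook below is random, and the descriptor count and dimension (500 x 96) and codebook size (32) are placeholders, not values from the slides.

```python
import numpy as np

def encode_bag_of_features(descriptors, codebook):
    """Hard-assign each descriptor to its nearest codeword and
    return an L1-normalized histogram over the codebook."""
    # Pairwise squared Euclidean distances, shape (N, K)
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Index of the closest codeword for every descriptor
    nearest = np.argmin(dists, axis=1)
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist

rng = np.random.default_rng(0)
# Hypothetical inputs: 500 trajectory descriptors (e.g. HOG/HOF/MBH), 96-D each
descriptors = rng.normal(size=(500, 96))
# Hypothetical codebook; in practice it would be learned, e.g. with k-means
codebook = rng.normal(size=(32, 96))

video_vector = encode_bag_of_features(descriptors, codebook)
```

The resulting `video_vector` is the fixed-length input a classifier such as an SVM would receive, regardless of how many trajectories the video produced.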

Two-Stream CNN [Simonyan+, NIPS14]

•  Spatial and temporal convolution

–  Spatial-stream: From an RGB image

–  Temporal-stream: From stacked optical flows

–  Score fusion: Average or SVM
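
The averaging variant of score fusion can be sketched as follows. This is a minimal example, assuming each stream outputs per-class logits that are converted to softmax scores before averaging; the three class names and the logit values are illustrative only, and `temporal_weight` is a hypothetical parameter (0.5 gives a plain average).

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def fuse_by_average(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Weighted average of the two streams' softmax scores."""
    spatial = softmax(spatial_logits)
    temporal = softmax(temporal_logits)
    return (1.0 - temporal_weight) * spatial + temporal_weight * temporal

# Illustrative subset of class labels and made-up logits
CLASS_NAMES = ["TennisSwing", "Basketball", "Archery"]
spatial_logits = np.array([2.0, 1.0, 0.1])    # from the RGB stream
temporal_logits = np.array([0.5, 2.5, 0.2])   # from the stacked-flow stream

fused = fuse_by_average(spatial_logits, temporal_logits)
predicted = CLASS_NAMES[int(np.argmax(fused))]
```

The SVM variant would instead train a classifier on the concatenated stream scores; it is not shown here.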

Is background enough to classify actions?

•  RGB input is too strong!

–  The two-stream CNN paper [Simonyan+, NIPS14] reported that the spatial stream recognizes actions better than expected

•  72.4% with spatial-stream (RGB) @UCF101

•  “Human Action Recognition without Human”

Without Human?

•  Can human action recognition be done just from the motion of the background?

(Figure: the Motion Descriptor applied to the full video outputs "TennisSwing"; applied to the video without the human, "TennisSwing?")

Detailed setting of w/ and w/o Human

•  With and without human setting

–  Without human setting: center-blind image with UCF101

–  With human setting: inverse of the without human setting

(Figure: I'(x,y) = I(x,y) * f(x,y), where f(x,y) is a mask that divides the image into 1/4, 1/2 and 1/4 parts. Panels: Without Human Setting, With Human Setting)
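
The two settings can be sketched as an element-wise mask applied to each frame. This is a minimal illustration, not the authors' code: it assumes f(x,y) is a binary mask whose central box covers 1/2 of the width and 1/2 of the height (leaving 1/4 margins), and that the with-human mask is 1 - f. The function names and the `center_fraction` parameter are hypothetical.

```python
import numpy as np

def center_mask(height, width, center_fraction=0.5):
    """Binary mask f(x, y): 0 inside the central box, 1 elsewhere."""
    mask = np.ones((height, width), dtype=np.float32)
    box_h = int(round(height * center_fraction))
    box_w = int(round(width * center_fraction))
    top = (height - box_h) // 2
    left = (width - box_w) // 2
    mask[top:top + box_h, left:left + box_w] = 0.0
    return mask

def apply_setting(image, without_human=True, center_fraction=0.5):
    """Compute I'(x, y) = I(x, y) * f(x, y) for either setting."""
    height, width = image.shape[:2]
    f = center_mask(height, width, center_fraction)
    if not without_human:
        f = 1.0 - f  # with-human setting: keep only the central box
    if image.ndim == 3:
        f = f[:, :, None]  # broadcast the mask over color channels
    return image * f

# A white 320x240 frame stands in for a real video frame
frame = np.full((240, 320, 3), 255.0, dtype=np.float32)
without_human = apply_setting(frame, without_human=True)
with_human = apply_setting(frame, without_human=False)
```

Because the two masks are complementary, `without_human + with_human` reproduces the original frame, so each pixel is seen in exactly one of the two settings.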

Framework

–  Baseline: Very deep two-stream CNN [Wang+, arXiv15]

–  Two different scenarios: without human and with human

Exploration experiment

•  @UCF101

–  UCF101 pre-trained model with very deep two-stream CNN

–  With/Without Human Setting

Visual results (Full Image)

Visual results (Without Human Setting)

Without Human

•  The concept of “Human Action Recognition without Human”

–  The accuracies of the two settings are very close

•  The with-human setting is +9.49% more accurate than the without-human setting

–  Current motion representations rely heavily on the background

Future work

•  This is a suggestive reality

–  We must accept this reality to realize better motion representation

–  A pure motion representation is urgently needed!

•  More sophisticated approach

•  Human only motion
