EIGENSPACE-BASED FALL DETECTION AND ACTIVITY RECOGNITION FROM MOTION TEMPLATES AND MACHINE LEARNING
Iván Gómez Conde
Ourense, January 14, 2011
MILE group meeting

MVFI Meeting (January 14th, 2011)



1. Iván Gómez Conde
2. Outline: 1. Introduction; 2. Related Work; 3. Theory and Algorithms (MILE Dataset, Motion Vector Flow Instances (MVFI), Mathematics); 4. Results; 5. Conclusions; 6. Further Research
3–4. Automatically determining human actions and gestures from videos or real-time cameras. A new spatio-temporal representation: the Motion Vector Flow Instance (MVFI). We compare it with two other motion templates, silhouettes and the Motion History Instance (MHI). All representations use a canonical transformation with PCA and LDA.
5. Database of video scenes: 6 human actions, 5 video sequences for each action, 12 different human subjects, a sampling rate of 25 frames/second, a static camera. Videos were saved as AVI (MPEG).
6. Walking, Exaggerated Walking, Jogging, Bending Over, Lying Down, Falling
7. Exaggerated Walking
8. Bending Over
9. Falling
10. 394 video sequences, 6 human actions, 12 different human subjects
11. Tools: C++, OpenCV (Open Source Computer Vision), Python, FFmpeg, Octave, Ubuntu
12. Training / Testing
13. M1: Silhouettes (foreground object)
14. M2: Motion History Instance (MHI)
15. M3: Motion Vector Flow Instances (MVFI) (a toy sketch of the three template types appears after the transcript)
16. Video sequences with the 3 types of motion templates: M1, M2, M3
17. https://www.youtube.com/watch?v=sWY_mQ2_Gco
18. Training set: $X = \{x_{i,j}\}$, where $x_{i,j}$ is the image pertaining to the $i$-th class and having the $j$-th frame within the sequence.
19. The PCA canonical space is constructed from the orthogonal vectors that possess the most variance between all the pixels of the image sequence. Thus, with the covariance matrix $\Sigma = \frac{1}{N}\sum_{i,j}(x_{i,j}-\bar{x})(x_{i,j}-\bar{x})^{T}$, we have the eigenvalue problem $\Sigma e_k = \lambda_k e_k$.
20. This method projects the original images of the sequences onto the new multidimensional space: $y_{i,j} = E^{T}(x_{i,j}-\bar{x})$, with $E = [e_1, \ldots, e_d]$ the matrix of leading eigenvectors (a NumPy sketch of this step appears after the transcript).
21. We apply a Fisher criterion that maximizes the between-class variance and minimizes the within-class variance. The Fisher linear discriminant function, $J(W)$, is given by the ratio $J(W) = \frac{|W^{T} S_B W|}{|W^{T} S_W W|}$.
22. We can write the corresponding eigenvalue equation: $S_B w_k = \lambda_k S_W w_k$. The new orthogonal basis $W = [w_1, \ldots, w_{c-1}]$ takes the points of the PCA space and transforms them to a new space, which we call the LDA space (a sketch of this step also appears after the transcript).
23. The points of the PCA space are projected onto the new multidimensional space: $z_{i,j} = W^{T} y_{i,j}$.
24. In this paper we performed an N-fold cross-validation training process, where we constructed all possible binary and multiclass combinations in our dataset. Training: Testing:
25. Two methods are used for determining the class of an unknown test sequence: a KNN classifier and an SVM classifier (a cross-validation sketch appears after the transcript).
26. Summary of some representative training statistics, showing the average number of images, the average training times, and the total run times.
27. Normalized histograms of the recognition rates obtained from the three different spatio-temporal motion templates.
28. Comparison of recognition rates for different multiclass training: the number of actions as a function of the number of persons included in training.
29. Comparison of recognition rates for the different motion templates as a function of incrementally including more people in the training set.
30. Comparison of motion templates for binary classification of actions.
31. The results of the PCA and LDA training space with the six different action classes of this study.
32. We performed a 10-fold cross-validation for all six action classes and all the training sequences of our database. The Motion Vector Flow Instance outperforms all other motion templates.
33. This paper has compared two existing motion templates with a new spatio-temporal motion template, the MVFI. The MVFI outperforms the other methods for detecting actions characterized by large velocities. This work suggests that it is important to preserve velocity information in each image sequence. The MVFI works well in all situations for action recognition: different people and different clothing types. Future studies shall consider both different camera angles and distances.
34. http://www.youtube.com/watch?v=bNBMRNjIu_g&feature=player_embedded
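For slides 13–15: the transcript only names the three motion templates, so what follows is a minimal Python/OpenCV sketch, not the paper's implementation. It assumes simple frame differencing for an M1-style silhouette, a linearly decaying motion buffer for an M2-style MHI, and Farneback dense optical flow as a stand-in for the velocity field an MVFI-style template encodes; the function name, threshold, and decay constant are illustrative.

```python
# Minimal motion-template sketch (illustrative, not the paper's code).
import cv2
import numpy as np

TAU = 0.05  # MHI decay per frame; assumed value, not from the slides

def motion_templates(path):
    """Yield (silhouette, mhi, flow) for each frame of the video at `path`."""
    cap = cv2.VideoCapture(path)
    ok, frame = cap.read()
    if not ok:
        return
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    mhi = np.zeros(prev.shape, np.float32)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # M1-style silhouette: thresholded frame difference as a toy foreground mask
        moving = cv2.absdiff(gray, prev) > 25
        # M2-style motion history: fresh motion is bright, older motion decays
        mhi = np.where(moving, 1.0, np.maximum(mhi - TAU, 0.0)).astype(np.float32)
        # M3-style velocity field: dense optical flow between consecutive frames
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        yield moving.astype(np.uint8) * 255, (mhi * 255).astype(np.uint8), flow
        prev = gray
    cap.release()
```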
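For slides 18–20: a minimal NumPy sketch of the PCA canonical space as written above, with flattened template images stacked as the rows of X. The names X, d, and pca_canonical_space are mine, not the paper's.

```python
# PCA canonical space: Sigma e_k = lambda_k e_k, then y = E^T (x - x_bar).
import numpy as np

def pca_canonical_space(X, d):
    """X: (n_images, n_pixels) array of flattened motion templates."""
    x_bar = X.mean(axis=0)
    Xc = X - x_bar
    # Eigen-decomposition of the pixel covariance. For n_pixels >> n_images
    # the snapshot trick on Xc @ Xc.T is far cheaper; kept direct for clarity.
    sigma = Xc.T @ Xc / len(X)
    lam, E = np.linalg.eigh(sigma)   # eigenvalues in ascending order
    E = E[:, ::-1][:, :d]            # keep the top-d directions of variance
    Y = Xc @ E                       # row-wise y_ij = E^T (x_ij - x_bar)
    return x_bar, E, Y
```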
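For slides 21–23: a sketch of the Fisher/LDA step on the PCA-projected points Y, solving the generalized eigenvalue problem $S_B w_k = \lambda_k S_W w_k$ with SciPy. It assumes $S_W$ is nonsingular, which the preceding PCA reduction is meant to guarantee.

```python
# Fisher/LDA: build between- and within-class scatter, solve S_B w = lambda S_W w.
import numpy as np
from scipy.linalg import eigh

def lda_space(Y, labels):
    """Y: (n_images, d) PCA projections; labels: class index per row."""
    classes = np.unique(labels)
    mu = Y.mean(axis=0)
    Sw = np.zeros((Y.shape[1], Y.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in classes:
        Yc = Y[labels == c]
        mu_c = Yc.mean(axis=0)
        Sw += (Yc - mu_c).T @ (Yc - mu_c)     # within-class scatter
        diff = (mu_c - mu)[:, None]
        Sb += len(Yc) * (diff @ diff.T)       # between-class scatter
    # Generalized symmetric-definite eigenproblem; requires S_W nonsingular.
    lam, W = eigh(Sb, Sw)
    W = W[:, ::-1][:, :len(classes) - 1]      # LDA yields at most C-1 directions
    return Y @ W                              # row-wise z_ij = W^T y_ij
```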
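For slides 24–25 and 32: a sketch of the 10-fold cross-validated KNN and SVM evaluation with scikit-learn. The hyperparameters (k = 5, RBF kernel) are assumptions, since the slides do not state the settings used, and each row of Z is treated as an independent sample, whereas the paper classifies whole sequences.

```python
# 10-fold cross-validation of KNN and SVM classifiers on LDA-space features Z.
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def evaluate(Z, y):
    """Z: (n_samples, n_features) LDA projections; y: action label per row."""
    for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                      ("SVM", SVC(kernel="rbf", C=1.0))]:
        scores = cross_val_score(clf, Z, y, cv=10)
        print(f"{name}: mean recognition rate {scores.mean():.3f}")
```

Called as evaluate(Z, labels) after the LDA projection, this prints one mean recognition rate per classifier, mirroring the recognition-rate comparisons shown in the results slides.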