Contents lists available at SciVerse ScienceDirect
Signal Processing
Signal Processing 92 (2012) 780–792
ISSN 0165-1684. doi:10.1016/j.sigpro.2011.09.022
journal homepage: www.elsevier.com/locate/sigpro
Gait recognition using Pose Kinematics and Pose Energy Image
Aditi Roy a,n, Shamik Sural a, Jayanta Mukherjee b
a School of Information Technology, Indian Institute of Technology Kharagpur, India
b Department of CSE, Indian Institute of Technology Kharagpur, India
a r t i c l e i n f o
Article history:
Received 21 March 2011
Received in revised form
30 June 2011
Accepted 22 September 2011
Available online 29 September 2011
Keywords:
Gait recognition
Pose Kinematics
Pose Energy Image
Dynamic programming
Gait Energy Image
a b s t r a c t
Many of the existing gait recognition approaches represent a gait cycle using a single 2D
image called Gait Energy Image (GEI) or its variants. Since these methods suffer from
lack of dynamic information, we model a gait cycle using a chain of key poses and
extract a novel feature called Pose Energy Image (PEI). PEI is the average image of all the
silhouettes in a key pose state of a gait cycle. By increasing the resolution of gait
representation, more detailed dynamic information can be captured. However, processing speed and space requirement are higher for PEI than the conventional GEI methods.
To overcome this shortcoming, another novel feature named as Pose Kinematics is
introduced, which represents the percentage of time spent in each key pose state over a
gait cycle. Although the Pose Kinematics based method is fast, its accuracy is not very
high. A hierarchical method for combining these two features is, therefore, proposed. At
first, Pose Kinematics is applied to select a set of most probable classes. Then, PEI is used
on these selected classes to get the final classification. Experimental results on CMU’s
Mobo and USF’s HumanID data set show that the proposed approach outperforms
existing approaches.
& 2011 Elsevier B.V. All rights reserved.
1. Introduction
Over the last decade, gait recognition has become a promising research topic in biometric recognition since it provides many unique advantages, such as being non-contact and non-invasive, compared to traditional biometric features like face, iris, and fingerprint. Gait recognition refers to verifying and/or identifying persons using their walking style. A gait recognition method combined with other biometric based human recognition methods holds promise as a tool in visual surveillance systems, tracking, monitoring, forensics, etc., since such methods provide reliable and efficient means of identity verification.
Gait recognition approaches are mainly classified into three types [1], namely, model based approaches, appearance
based approaches and spatiotemporal approaches. While model based methods are generally view and scale invariant, their use is still limited due to currently imperfect vision techniques (e.g., tracking and localizing the human body accurately in 2D or 3D space has long been a challenging and unsolved problem), the requirement of good quality silhouettes, and high computational cost. Most of the current approaches are appearance based [3–8], which directly use the silhouettes of gait sequences for feature extraction without developing any model. These approaches are suitable for practical applications since they operate on binary silhouettes, which could be of low quality, and also because no color or texture information is needed. While appearance based approaches deal with spatial information alone, spatiotemporal approaches deal with both spatial and temporal domain information [11].
Among the appearance-based approaches, temporal template based gait features have received significant attention due to their simple, robust representation and good recognition accuracy. This type of approach compresses
a gait cycle into one static image. The ability to reduce the effect of random noise in silhouettes by averaging makes it robust. Since human motion is represented by a single 2D image, storage space and computational cost are also reduced. Thus, the temporal template is considered to be an effective way for gait representation. Bobick et al. [2] first proposed the Motion Energy Image (MEI) and Motion History Image (MHI) for robust static representation of human motion. Han and Bhanu [3] later extended this idea to gait representation and developed a new feature named Gait Energy Image (GEI). GEI reflects the major shapes of silhouettes and their changes over a gait cycle. Thus it keeps both static information and motion information. GEI is obtained by averaging silhouettes over a gait cycle. A higher intensity pixel in GEI indicates a higher frequency of occurrence of the corresponding body part at that position. For uncorrelated and identically distributed noise, GEI achieved encouraging performance in gait recognition. However, since GEI represents a gait sequence by a single image, it loses the intrinsic dynamic characteristics and detailed information of the gait pattern.
To capture the motion information more accurately, the Gait History Image (GHI) [4] was proposed. GHI preserves temporal information besides spatial information. Later, the Gait Moment Image (GMI) [5] was developed, which is the gait probability image at each key moment of all gait cycles. The gait images over all possible gait cycles corresponding to a key moment are averaged as the GEI of this key moment. The number of key moments is typically a few dozen. However, it is not easy to select the key moments from gait cycles with different periods. In the search for a better representation of the dynamic information, some other methods were proposed by directly modifying the basic GEI, namely Enhanced GEI (EGEI) [6], Frame Difference Energy Image (FDEI) [7], and Active Energy Image [8]. Better performance was reported than with the conventional GEI method.
The above-mentioned methods are constrained by factors like clothing, carried objects, etc., and there is still scope for improvement. This motivated us to investigate how the dynamic information can be represented in a better way. To capture motion information at higher resolution, we introduce a novel gait representation called Pose Energy Image (PEI). As a result of the increased resolution of gait representation, this feature based recognition approach becomes slow. To increase the processing speed, we propose another new feature named Pose Kinematics, which captures pure dynamic information without considering the silhouette shape. This second feature is used to reduce the search space of the PEI based method by selecting the most probable P classes based only on dynamics. The PEI based approach is then applied on these selected classes to get the end result. The rest of the paper is organized as follows. Section 2 introduces the proposed approach. Section 3 presents and analyzes the experimental results, and Section 4 concludes the paper.
2. Proposed approach
The motivation behind this work is to better capture the temporal variation in gait feature representation.
Since in the original GEI representation intrinsic dynamic information is not preserved properly, it is found to be less discriminative. Here we represent a gait cycle by a series of K key poses. A Pose Energy Image (PEI), as proposed in this paper, is the average image of all the silhouettes in a gait cycle which belong to a particular pose state. Thus for K key pose states, we get K PEIs. PEI is designed to increase the resolution of GEI. Hence, it is able to capture detailed motion information. The intensity variation in a PEI is low since each PEI represents centralized motion. Thus, the series of PEIs representing a gait cycle is able to represent how the shape changes in different phases of the gait cycle and the amount of change in each gait phase. So, increasing the resolution helps to capture minute dynamic information. But this representation requires higher computational complexity and storage space than the conventional GEI. As a result, processing speed gets slower. To alleviate this problem, we introduce another novel feature, named Pose Kinematics, which captures pure dynamic information. It is defined as the percentage of time spent in each of the K key pose states. For different subjects the percentage of time spent in each key pose state is different, which determines their unique dynamic characteristic. Thus, these two features separately capture two key components of gait, namely shape and dynamics. Pose Kinematics can be computed very fast, but the discriminative power of the dynamics based feature alone is found to be limited. Shape plays an important role in gait feature representation, as has been reported in the literature [1]. Keeping this in mind, we combine Pose Kinematics and PEI in a way such that both accuracy and efficiency are high.
We present Algorithm 1 (Complete Gait Recognition Algorithm), which describes the overall method for gait recognition using Pose Kinematics and PEI. The proposed approach can be divided into two broad parts. First is key pose estimation and silhouette classification into the key pose classes (step 1 in Algorithm 1). The second part consists of gait feature vector computation using PK and PEI, feature space reduction, and classification (steps 2 and 3 in Algorithm 1). The first part is described in detail in Sections 2.1 and 2.2. Fig. 1 shows the corresponding block diagram. In the training stage, the training silhouette sequence is first projected into the eigenspace. Then K-means clustering is applied on these transformed silhouette images to determine the key poses. During testing, the test silhouettes are first projected into the eigenspace to get the projected feature vectors. Then similarity values between each test silhouette and the key poses are determined. Finally, a dynamic programming based longest path search method is applied to compute the key pose label of each test silhouette. The second part is covered in Sections 2.3–2.5. The flow chart of Fig. 2 shows the steps followed in this stage. From the labeled training sequence, we first compute the PK and PEI features. Next, a subspace method is applied on the PEI feature vector to reduce the feature dimension. During the test phase, PK features are extracted from the labeled test silhouette sequence. Then classification is performed based on these features only if the similarity value is high enough. Otherwise, PK is used to
Fig. 2. Flow chart of human recognition method using proposed PEI and PK features.

Fig. 1. Block diagram of key pose estimation and silhouette classification into the estimated key pose classes.
select a set of most probable classes. Then PEI features are extracted from the test silhouettes and dimension reduction is performed using the transformation matrix obtained from the training data. Finally, classification is done by comparing the test PEI feature vector with the PEI feature vectors of the selected classes.
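The two-stage decision just described can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the array names (pk_gallery, pei_gallery, labels) are hypothetical, and plain Euclidean nearest-neighbour matching is assumed at both stages.

```python
import numpy as np

def hierarchical_classify(pk_probe, pei_probe, pk_gallery, pei_gallery, labels, P=5):
    """Two-stage gait classification sketch: Pose Kinematics (PK) first prunes
    the gallery to the P most similar subject classes, then the Pose Energy
    Image (PEI) feature decides among those classes only.

    pk_gallery:  (n, K) PK vectors of gallery sequences
    pei_gallery: (n, d) projected PEI vectors of the same sequences
    labels:      (n,)   subject class label of each gallery sequence
    """
    # Stage 1: rank gallery sequences by PK distance, keep the top-P classes.
    pk_dist = np.linalg.norm(pk_gallery - pk_probe, axis=1)
    ranked = labels[np.argsort(pk_dist)]
    candidates = []
    for c in ranked:                      # preserve rank order, unique classes
        if c not in candidates:
            candidates.append(c)
        if len(candidates) == P:
            break
    # Stage 2: nearest neighbour on PEI, restricted to the candidate classes.
    mask = np.isin(labels, candidates)
    pei_dist = np.linalg.norm(pei_gallery[mask] - pei_probe, axis=1)
    return labels[mask][np.argmin(pei_dist)]
```

In this sketch the PK stage only shrinks the search space; the final decision always rests on the shape-bearing PEI feature, matching the accuracy/speed trade-off argued above.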
Algorithm 1. Complete algorithm.

Step 1: Key Pose Estimation and Silhouette Classification into Key Poses
Input: Training silhouettes (I_1, …, I_N), eigenvector count (K), key pose count (K), test silhouettes (F_1, …, F_T)
Output: Key poses (P_1, …, P_K), silhouette class labels
Step 1.1: Apply eigenspace projection to the input silhouettes (I_i, i ∈ N) to obtain K EigenSilhouettes (u_i, i ∈ K)
Step 1.2: Apply K-means clustering on the EigenSilhouettes to get K key poses representing a gait cycle (P_1, …, P_K)
Step 1.3: Project the test silhouettes (F_t, t ∈ T) into the eigenspace
Step 1.4: Compute match scores between all projected test silhouettes and the key poses P_1, …, P_K
Step 1.5: Apply graph based path searching to find the key pose class labels of the test silhouette sequence

Step 2: Gait Feature Extraction and Dimension Reduction
Input: Test silhouettes (F_1, …, F_T) with pose class labels, gait cycle length (GC), key pose count (K)
Output: Pose Kinematics (PK), projected PEI
Step 2.1: Compute the ith (i ∈ K) element of PK as
PK_i = (1/GC) Σ_{t=1}^{GC} 1 if F_t ∈ P_i
Step 2.2: Compute the ith Pose Energy Image (PEI) as
PEI_i(x, y) = (1/(GC · PK_i)) Σ_{t=1}^{GC} F_t(x, y) if F_t ∈ P_i
Step 2.3: Apply subspace algorithms to PEI to get the projected low dimensional PEI

Step 3: Human Recognition
Input: PK and projected PEI
Output: Most probable subject class label
Step 3.1: Apply the PK based nearest neighbor classifier to select the P highest ranked classes
Step 3.2: Apply the PEI based classifier on the selected P subject classes and compute the most probable class
2.1. Key pose estimation
Before computing the proposed features, it is first required to define the key poses and classify the input silhouette sequence into these key pose classes. Since no standard way of determining the number of key poses and their characteristics is available, unsupervised learning, namely constrained K-means clustering, is applied to choose
the key poses. Instead of applying K-means clustering in the silhouette space directly, PCA is first applied on the silhouette image sequences. PCA computes a smaller set of orthogonal vectors which preserves the observed total variance better than the original feature space. Then, K-means clustering is applied on the PCA transformed feature vectors. These steps are described in detail in the next subsections.
2.1.1. Eigenspace projection
In this section we discuss step 1.1 of Algorithm 1 in detail. At first, eigenspace projection is applied to find the principal components or eigenvectors of the silhouette image set, which are termed EigenSilhouettes due to their silhouette-like appearance. Since only a subset of these EigenSilhouettes is important in encoding the variation in silhouette images, we select the most significant K EigenSilhouettes.
Let there be N training silhouette images I_1, I_2, …, I_N of size S = W × H, where W is the width and H the height of a silhouette image. Each image I_i ∈ I is represented as a column vector Γ_i of size S × 1, where I represents the training image set. The mean silhouette vector Ψ is computed as follows:

Ψ = (1/N) Σ_{i=1}^{N} Γ_i   (1)
We then find the covariance matrix C as follows:

C = (1/N) Σ_{n=1}^{N} (Γ_n − Ψ)(Γ_n − Ψ)^T = (1/N) Σ_{n=1}^{N} Φ_n Φ_n^T = A A^T   (2)

where A = [Φ_1, Φ_2, …, Φ_N]. Computing the eigenvectors u_i of C (size S × S) is intractable for typical image sizes [18]. To handle this problem, we first compute the eigenvectors of the much smaller A^T A matrix of size N × N and take linear combinations of the silhouette images Φ_i. Thus we get N eigenvectors (v_i, i = 1, 2, …, N), each of dimension N × 1. Now, from matrix properties, we compute the eigenvectors (u_i, i = 1, 2, …, N) of the covariance matrix C = A A^T as u_i = A v_i [18]. Since the dimension of A is S × N, the dimension of u_i becomes S × 1. Each u_i is normalized such that ||u_i|| = 1. Thus, N eigenvectors of C are obtained.
Since the number of eigenvectors is still large and not all of them contain significant information, we select the K most significant eigenvectors. The eigenvalues are sorted in decreasing order and the K eigenvectors that account for more than 90% of the variance are selected.
Once the eigenvectors are computed, we find the weight vectors, also known as silhouette space images, as follows:

Ω_i = u^T Φ_i,  i = 1, 2, …, N   (3)

where u = [u_1, u_2, …, u_K], K ≤ N, and the silhouette space image Ω_i = [w_1, w_2, …, w_K]^T. Now each mean-subtracted silhouette Φ_i in the training set can be represented as a linear combination of the EigenSilhouettes u_j as follows:

Φ_i = Σ_{j=1}^{K} w_j u_j   (4)
2.1.2. K-means clustering
In this stage we apply constrained K-means clustering to determine the key poses (each cluster represents a key pose class). This section describes step 1.2 of Algorithm 1. The inherent sequential nature of the key poses in a gait cycle makes the clusters formed by K-means clustering temporally adjacent. Let the feature vectors of a gait cycle of the nth subject be Ω^n = {ω^n_1, ω^n_2, …, ω^n_p}, where p is the number of silhouettes in a gait cycle. Initially the clusters are formed by equally partitioning each gait cycle into K segments. The jth frame is assigned to cluster i = 1 + ⌊(jK)/p⌋, where i ∈ K. Thus, all the frames in the ith segment of each gait cycle of all the subjects are grouped under the ith cluster. Let the initial set of clusters be S^0 = {S^0_1, S^0_2, …, S^0_K}, and the corresponding centroids be P^0 = {P^0_1, P^0_2, …, P^0_K}, each of which represents a key pose. Then, constrained K-means clustering is applied to iteratively refine the clusters. The first constraint is that the only allowable transitions are from the ith cluster to the (i−1)th, ith or (i+1)th cluster. The second constraint is that, after performing cluster assignment under the first constraint, the transition order of each frame is checked; if it is not ordered properly, those frames are reassigned such that the previous frame's cluster index is lower than or equal to the current frame's. Lastly, every cluster must have at least one frame from each gait cycle of each subject. After initialization, the algorithm proceeds by alternating between the following two steps:
Update step: calculate the centroid of each cluster:

P_i^(t) = (1/|S_i^(t)|) Σ_{ω_j ∈ S_i^(t)} ω_j   (5)

Assignment step: reassign each frame to the allowable cluster with the closest mean:

S_i^(t+1) = {ω_j : ||ω_j − P_i^(t)|| ≤ ||ω_j − P_k^(t)|| for k = i−1, i, i+1}   (6)
The algorithm terminates when the assignments no longer change.
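A single update/assignment pass of this constrained K-means can be sketched as below. Zero-based cluster indices are assumed, as is the paper's third constraint that no cluster becomes empty; the function name is hypothetical.

```python
import numpy as np

def constrained_kmeans_step(feats, assign, K):
    """One update + assignment pass of the constrained K-means sketch.

    feats:  (p, d) eigenspace feature vectors of one gait cycle
    assign: (p,)   current cluster index of each frame, values in [0, K)
    A frame in cluster i may only move to cluster i-1, i or i+1 (first
    constraint), and labels along the cycle are forced to be non-decreasing
    afterwards (second, ordering constraint).
    """
    # Update step, Eq. (5): recompute centroids (assumes no empty cluster).
    cents = np.stack([feats[assign == k].mean(axis=0) for k in range(K)])
    # Assignment step, Eq. (6): closest centroid among allowed neighbours.
    new = assign.copy()
    for j, f in enumerate(feats):
        allowed = [k for k in (assign[j] - 1, assign[j], assign[j] + 1)
                   if 0 <= k < K]
        new[j] = min(allowed, key=lambda k: np.linalg.norm(f - cents[k]))
    # Ordering constraint: cluster labels must not decrease along the cycle.
    for j in range(1, len(new)):
        new[j] = max(new[j], new[j - 1])
    return new, cents
```

Iterating this step until the assignment stops changing reproduces the alternation described above.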
To decide the optimum number of key poses (representing one complete gait cycle) formed by K-means clustering, we consider the rate–distortion curve as used in [9]. It plots the average distortion as a function of the number of clusters, an example of which is shown in Fig. 3. It can be observed from the plot that beyond 16 clusters the average distortion does not decrease significantly. Thus, we choose 16 clusters, which gives a set of 16 key poses. Fig. 4 shows the 16 key poses corresponding to the 16 cluster centroids P1–P16 over one gait cycle.
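The "no significant decrease" criterion on the rate–distortion curve can be automated with a simple relative-drop rule; the 5% threshold below is an illustrative choice of ours, not taken from the paper.

```python
def pick_cluster_count(distortions, ks, min_rel_drop=0.05):
    """Pick the smallest k beyond which the average distortion stops
    dropping by more than min_rel_drop (rate-distortion 'elbow').

    distortions: average distortion for each candidate k in ks
    ks:          candidate cluster counts, in increasing order
    """
    for i in range(1, len(ks)):
        rel_drop = (distortions[i - 1] - distortions[i]) / distortions[i - 1]
        if rel_drop < min_rel_drop:     # curve has flattened out
            return ks[i - 1]
    return ks[-1]
```

On a distortion curve shaped like Fig. 3, this rule selects 16 clusters.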
2.2. Silhouette classification
The next stage is silhouette classification into key poses. Given the input test sequence (F_1, …, F_T), each silhouette is first linearly projected into the eigenspace to get the weight vectors (see step 1.3 in Algorithm 1). Let the mean weight vectors corresponding to the key poses be (P_1, P_2, …, P_K). Given an unknown test silhouette F_i represented as a column vector Γ″_i, we first find the mean-subtracted vector as Φ″_i = Γ″_i − Ψ. Then, it is projected into the eigenspace and the weight vector is determined
as follows:
Ω″_i = u^T Φ″_i   (7)

After the feature vector (weight vector) for the test silhouette is computed, the match scores of the probe silhouette against all of the key pose weight vectors (P_1, …, P_K) are determined (see step 1.4 in Algorithm 1). To do this, we use a Euclidean distance based measure, MatchVal(Ω″_i, P_j) = 1 − D(P_j, Ω″_i), for j = 1, 2, …, K. If there are K key poses and T frames in a sequence, a [K × T] array of match scores is obtained. From these match scores, the input silhouette could be classified into one of the key pose classes directly by choosing the best matched key pose. But this simplified approach overlooks the temporal context of the key pose sequence. As a result, two consecutive frames may be classified into two temporally non-adjacent classes, which is clearly wrong. The problem may be caused by distorted silhouettes obtained from imperfect background segmentation. Even when perfectly clean silhouettes are available, incorrect detection can occur because different key poses may generate similar silhouettes (like the left foot forward and right foot forward positions). So, to mitigate these factors of unreliable observations and robustly classify an input sequence into a sequence of known key poses, we take advantage of the temporal constraints imposed by a state transition model.
Fig. 4. Reconstructed key poses obtained from K-means clustering in eigenspace (CMU MoBo database).

Fig. 3. Rate–distortion plot.

2.2.1. State transition model

In our state transition model, one complete gait cycle is modeled as a chain of estimated key poses. Thus, if the number of key poses forming a gait cycle is K, the number of states in the transition diagram will also be K. Fig. 5 shows an example state transition diagram having five states in a gait cycle. One key pose is associated with each state. Since in our case we consider 16 key poses to represent a gait cycle, the corresponding state transition model also has 16 states, unlike the example model shown in Fig. 5 with five states. This state transition model provides contextual and temporal constraints, where the links specify the temporal order of key poses.

We construct a directed graph from this state transition model, where vertices are the key pose states and edges are the allowable state transitions. The key pose finding problem is formulated as the most likely path finding problem in the directed graph, which is solved using dynamic programming.

Fig. 5. Proposed state transition diagram considering five states (S1–S5) corresponding to five key poses (P1–P5).

2.2.2. Graph-based path searching

Let the input sequence of frames be F = F_1, …, F_T and the possible set of states in the ith frame be S^i = {S^i_1, …, S^i_K}, where the states S_1 to S_K represent the key poses P_1, P_2, …, P_K. The set of vertices V of the graph G corresponds to the key pose states S^i for i = 1 to T. An edge e^i_{kp} : S^i_k → S^{i+1}_p is added to the graph G only if the transition from S_k to S_p is allowed by the state transition diagram. The graph thus constructed is a directed acyclic graph. Fig. 6 shows an example of a graph constructed from the state transition model shown in Fig. 5 for five consecutive frames. For each frame there are five states (S1–S5). An edge between the nodes in frame F_i and F_{i+1} is added if that transition is allowed by the state transition diagram in Fig. 5.

Fig. 6. Directed acyclic graph constructed from the state transition diagram of Fig. 5 over five frames. For each frame there are five states (S1–S5). An edge between the nodes in frame F_i and F_{i+1} is added if that transition is allowed by the state transition diagram. The first value shown in each node represents MatchVal(), the second value represents PathVal(), and the third value represents PrevNode(). The bold edges show the longest path found by dynamic programming. The pose assignment obtained for each frame is: S1–S2–S3–S4–S5.
The most likely sequence of key poses for a silhouette sequence will be the longest path (the path having maximum weight) belonging to the set of all admissible paths in this graph. Dynamic programming [12] is used to find the longest path. The complete algorithm for finding the most probable path is described in Algorithm 2.
Algorithm 2. Silhouette classification algorithm.

Parameters:
T = frame count in a sequence
F = frame sequence F_1, F_2, …, F_T
G_F(V, E) = directed acyclic graph
K = number of nodes in each frame (same as the number of states in the state transition model)
t_ij = jth state in frame F_i
E(t_ij, t_kl) = edge joining node t_ij with node t_kl
MatchVal(t_ij) = probability of frame F_i being in the jth state
PathVal(t_ij) = weight of the path up to F_i in state j
PrevNode(t_ij) = state of the previous frame F_{i−1} that maximizes PathVal(t_ij)
Input: G_F(V, E)
Output: MaximumWeightedPath from F_1 to F_T, BestState_i, i ∈ T
Initialization:
PathVal(t_1j) = MatchVal(t_1j); PathVal(t_ij) = 0 for all i > 1, j
PrevNode(t_ij) = 0 for all i, j
Iteration:
For each state (j ∈ K) of each frame F_i, i ∈ T, compute:
PathVal(t_ij) = max(PathVal(t_kl) + MatchVal(t_ij)), over all nodes in the previous frame F_k such that k = i − 1
PrevNode(t_ij) = argmax(PathVal(t_kl) + MatchVal(t_ij))
Termination:
MaximumWeightedPath = max(PathVal(t_Tj)), over all j in frame F_T
BestState_T = argmax(PathVal(t_Tj))
Path Backtracking:
BestState_i = PrevNode(t_{(i+1)(BestState_{i+1})}), i = T−1, T−2, …, 1
End
At each time step, we compute three values for each node: the match score (MatchVal(t_ij) in Algorithm 2 (Silhouette Classification Algorithm), the first value shown in each node of Fig. 6) between each state j in the graph and the input silhouette image of frame F_i; the best path score (PathVal(t_ij) in Algorithm 2, the second value shown in each node of Fig. 6) along a path up to the current node t_ij; and the previous node along the longest path up to the current node (PrevNode(t_ij) in Algorithm 2, the third value shown in each node of Fig. 6). The match score MatchVal(Ω″_i, P_j) (represented as MatchVal(t_ij) in Algorithm 2) represents to what extent the silhouette of the current input frame F_i matches the key pose (P_j) corresponding to state j. The procedure for computing this value is described in detail in Section 2.2 (step 1.4 in Algorithm 1).

During initialization, PathVal() is set to zero for all frames except the first, where PathVal() is set equal to MatchVal(). Similarly, PrevNode() is initially set to zero for all frames. Then, during the iteration, at every frame F_i, node t_ij searches all possible previous nodes that link to the current node in the graph and chooses the one with the maximum best path score. The path score of the current node t_ij is updated accordingly and the selected previous node is recorded. When the last frame is reached, the node with the maximum path score is selected as the pose class of the final frame and backtracking is performed to recover the longest path (shown in bold in Fig. 6).
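Algorithm 2 can be sketched in a few lines of Python. The trellis encoding and predecessor lists below are our own illustrative choices; the score updates follow the PathVal/PrevNode recurrences described above.

```python
import numpy as np

def best_pose_path(match, preds):
    """Maximum-weight path through the frame-state trellis (Algorithm 2 sketch).

    match: (T, K) array; match[i, j] = MatchVal of frame i in state j.
    preds: preds[j] = list of states allowed to precede state j
           (taken from the state transition diagram, e.g. [j, (j-1) % K]).
    Returns the best total score and the state label of every frame.
    """
    T, K = match.shape
    path_val = np.full((T, K), -np.inf)
    prev = np.zeros((T, K), dtype=int)
    path_val[0] = match[0]                       # initialisation
    for i in range(1, T):                        # iteration
        for j in range(K):
            best = max(preds[j], key=lambda k: path_val[i - 1, k])
            path_val[i, j] = path_val[i - 1, best] + match[i, j]
            prev[i, j] = best
    states = [int(np.argmax(path_val[T - 1]))]   # termination
    for i in range(T - 1, 0, -1):                # path backtracking
        states.append(int(prev[i, states[-1]]))
    return float(path_val[T - 1].max()), states[::-1]
```

Because each state has only a few allowed predecessors, the inner maximisation is over a constant-size list, giving the O(KT) behaviour noted below.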
For a fully ergodic graph, the algorithmic complexity is O(K²T), where K is the number of states and T is the number of frames. But here, since the average in-degree of each node is small, the overall complexity becomes O(KT).
Thus, the output at this stage is a sequence of key pose labels representing the most probable class of each silhouette frame in the input sequence. An example silhouette sequence of a subject with its key pose labels is shown in Fig. 7.
2.3. Pose Kinematics and Pose Energy Image (PEI) computation
Once the key pose class of each frame in a silhouette sequence is obtained, we compute Pose Kinematics as a K-element vector, where K is the number of key poses. The ith element (PK_i) of the vector represents the fraction of time the ith pose (P_i) occurred in a complete gait cycle:

PK_i = (1/GC) Σ_{t=1}^{GC} 1 if F_t ∈ P_i   (8)

where GC is the number of frames in the complete gait cycle(s) of a silhouette sequence, F_t is the tth frame in the sequence (moment of time) and P_i is the ith key pose. For example, to compute the first component of the PK feature vector from the silhouette sequence with key pose labels shown in Fig. 7, we first count the number of silhouettes that belong to key pose class 1. It is found to be 3 (frame numbers 12–14 in Fig. 7), and the length of the gait cycle is 36. So, the first component will be 3/36, which
Fig. 7. Example silhouette sequence for a subject of one gait cycle length. The key pose class labels of the silhouette sequence obtained from the dynamic programming based most probable path search are shown at the bottom of each silhouette. The gait cycle starts from frame no. 1 (state S13 or key pose 13) and ends at frame no. 36 (state S12 or key pose 12). Thus the gait cycle length is 36. Silhouette count for key pose classes 1–16 is [3 1 1 1 6 1 3 3 1 1 1 3 5 1 2 3].
is 0.0833. The other components of the PK feature vector can be computed following the same procedure. The complete PK feature vector for the sequence shown in Fig. 7 is {0.0833, 0.0278, 0.0278, 0.0278, 0.1667, 0.0278, 0.0833, 0.0833, 0.0278, 0.0278, 0.0278, 0.0833, 0.1389, 0.0278, 0.0556, 0.0833}. Algorithm 3 (Feature Computation Algorithm) shows the steps for PK feature computation on a silhouette sequence of one gait cycle length.
Algorithm 3. Feature computation algorithm.

Input:
GC = gait cycle length
K = number of key poses
F = silhouette sequence F_1, F_2, …, F_GC
L = corresponding key pose labels L_1, L_2, …, L_GC
Output:
PK = Pose Kinematics
PEI = Pose Energy Image
Initialization:
PK_k = 0, PEI_k = 0, for all k ∈ K
Begin
for i = 1 to GC do
  for k = 1 to K do
    if L_i == k then
      PK_k++;
      PEI_k = PEI_k + F_i;
    end if
  end for
end for
for k = 1 to K do
  PEI_k = PEI_k / PK_k;
  PK_k = PK_k / GC;
end for
End
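A direct, array-based transcription of Algorithm 3 might look like the following; zero-based pose labels are an assumption on our part.

```python
import numpy as np

def pk_and_pei(frames, labels, K):
    """Compute Pose Kinematics and Pose Energy Images (Algorithm 3 sketch).

    frames: (GC, H, W) binary silhouettes of one gait cycle
    labels: (GC,) key pose label of each frame, values in [0, K)
    Returns PK of shape (K,) and PEI of shape (K, H, W).
    """
    GC = len(frames)
    counts = np.bincount(labels, minlength=K)      # frames per key pose
    pei = np.zeros((K,) + frames.shape[1:])
    for f, l in zip(frames, labels):
        pei[l] += f                                # accumulate silhouettes
    pei /= np.maximum(counts, 1)[:, None, None]    # average within each pose
    pk = counts / GC                               # fraction of time per pose
    return pk, pei
```

For the sequence of Fig. 7 (3 of 36 frames in pose 1), this yields PK_1 = 3/36 ≈ 0.0833, matching the worked example above.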
Pose Energy Image (PEI) is used to represent the shape property of the gait signature. One PEI is computed for each key pose. Thus, instead of a single 2D image as in GEI, K PEIs are computed from one gait cycle of an image sequence. PEI is based on the basic assumptions that (i) the order of poses in human walking cycles is the same and (ii) differences exist in the phase of poses in a walking cycle. Given the preprocessed binary gait silhouette image I_t(x, y) corresponding to frame F_t at time t in a sequence, the ith gray-level Pose Energy Image (PEI_i) is defined as follows:

PEI_i(x, y) = (1/(GC · PK_i)) Σ_{t=1}^{GC} I_t(x, y) if F_t ∈ P_i   (9)

where x and y are values in the 2D image coordinate. The detailed steps for computing PEI are shown in Algorithm 3. Fig. 8(a) shows the PEIs corresponding to the silhouette sequence of Fig. 7 for a subject having 16 key poses. For example, the first PEI is obtained by taking the average of silhouettes 12–14 of Fig. 7, which belong to key pose class 1. Other PEIs are obtained similarly. It can be observed that the intensity variation within each PEI is small since it reflects unified motion. Thus, PEI reflects the major shapes of silhouettes and their changes over a gait cycle. Fig. 8(b) shows PEIs of another subject.
2.4. Feature dimension reduction
Since Pose Kinematics is a K-dimensional vector, where K is much smaller than one gait cycle length, dimension reduction is not required for it. The feature vectors of the probe sets are classified into one of the subject classes using the nearest neighbor criterion. However, for the PEI feature,
Fig. 8. PEI example images from the CMU MoBo database: (a) first two rows show 16 PEIs of a subject obtained from the silhouette sequence of Fig. 7; (b) last two rows show PEIs of another subject.
reduction of dimensionality is absolutely necessary to overcome the "curse of dimensionality". Many subspace algorithms have been applied to gait recognition in recent years, such as principal component analysis (PCA), linear discriminant analysis (LDA), PCA+LDA [3], discriminative common vectors (DCV) [6], and two-dimensional locality preserving projections (2DLPP) [8]. Advanced and complex algorithms are shown to achieve higher recognition accuracy, but at a higher computational cost. Here we apply two classical linear transformations, namely PCA and LDA, which have lower computational cost. The projected low dimensional PEI feature vector of a test sequence is categorized using the nearest neighbor criterion, as done for Pose Kinematics. With this simple projection method, the proposed features are observed to achieve higher recognition accuracy than the other existing feature representation methods, thus establishing the inherent power of our gait representation. Next we describe the two subspace methods used for feature learning in detail.
2.4.1. Learning features using PCA
By applying PCA, we obtain several principal components to represent the original PEI gait features, mapping them from a high-dimensional measurement space to a low-dimensional eigenspace. Let each series of K PEIs obtained from a gait cycle of a subject be represented by a column vector x_i of size d = W × H × K, where W is the width and H the height of a PEI, and K the number of key pose classes. Thus, the d-dimensional training PEI data set of size M is x_1, x_2, ..., x_M. The average vector m and covariance matrix S are computed as follows:

m = (1/M) Σ_{k=1}^{M} x_k    (10)
S = (1/M) Σ_{i=1}^{M} (x_i − m)(x_i − m)^T    (11)

Next, the eigenequation S e_k = λ_k e_k is solved, and the eigenvectors [e_1, e_2, ..., e_{d'}] corresponding to the d' largest eigenvalues (λ_1 ≥ λ_2 ≥ ... ≥ λ_{d'}) are selected (d' < d). Then the d'-dimensional feature vector y_k is obtained from x_k as follows:

y_k = [e_1, e_2, ..., e_{d'}]^T (x_k − m) = T_PCA (x_k − m),   k = 1, ..., M    (12)
These reduced d'-dimensional feature vectors are used for recognition.
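Eqs. (10)–(12) translate almost directly into NumPy; the following is a sketch under our own naming (`pca_project`, `d_prime`), not the authors' implementation. For PEI vectors with d on the order of 100,000, one would in practice use the Gram-matrix (M × M) trick rather than the d × d covariance, but the small-d version shows the algebra:

```python
import numpy as np

def pca_project(X, d_prime):
    """X: (M, d) matrix of row-scanned PEI vectors, one sample per row.
    Returns (Y, T_pca, m): projected features (M, d_prime), the
    transformation matrix (d, d_prime), and the mean vector, per Eqs. (10)-(12)."""
    m = X.mean(axis=0)                    # Eq. (10): average vector
    Xc = X - m
    S = (Xc.T @ Xc) / X.shape[0]          # Eq. (11): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)  # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]     # sort by decreasing eigenvalue
    T_pca = eigvecs[:, order[:d_prime]]   # top-d' eigenvectors
    Y = Xc @ T_pca                        # Eq. (12): reduced feature vectors
    return Y, T_pca, m
```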
2.4.2. Learning features using LDA
The second subspace method used is LDA. However, instead of pure LDA, we use PCA+LDA to address the singularity issue, which arises because the training data set size is smaller than the feature vector size. Consider the d-dimensional training PEI data set of size M to be x_1, x_2, ..., x_M as before, belonging to c classes. LDA computes the optimal discriminating space T by maximizing the ratio of the between-class scatter matrix S_B to the within-class scatter matrix S_W as follows:
T = arg max_W |W^T S_B W| / |W^T S_W W|    (13)
S_W is defined as

S_W = Σ_{i=1}^{c} Σ_{x∈D_i} (x − m_i)(x − m_i)^T    (14)

where m_i = (1/n_i) Σ_{x∈D_i} x, D_i is the training PEI set belonging to the ith class, and n_i is the number of PEIs in D_i. The between-class scatter matrix S_B is defined as

S_B = Σ_{i=1}^{c} n_i (m_i − m)(m_i − m)^T    (15)
where m is obtained using Eq. (10). T is the set of eigenvectors corresponding to the largest eigenvalues in

S_B w_i = γ_i S_W w_i    (16)
However, the rank of S_W is not more than (M − c), where M is the total training data set size and c is the total number of classes. S_W will be non-singular only if its size is lower than (M − c). But since the size of S_W is determined by the size d of the row-scanned PEI image (of the order of 100,000), it is much larger than (M − c) (of the order of 1000). To solve this problem, PCA is applied first to reduce the dimension of the training PEIs, keeping no more than the largest (M − c) principal components so that S_W is non-singular. So, instead of applying LDA on the original PEI data set x_1, x_2, ..., x_M, we first apply PCA to get a set of M d'-dimensional principal component vectors y_1, y_2, ..., y_M using Eqs. (10)–(12). d' is chosen such that d' < (M − c) and the observed total variance is more than 90%. Then S_B and S_W are computed using Eqs. (14) and (15) (replacing x by y), and the eigenvectors are computed from Eq. (16). A maximum of (c − 1) eigenvectors {ν_1, ν_2, ..., ν_{c−1}} are obtained, which form the transformation matrix T_LDA. Thus, the (c − 1)-dimensional gait feature vector is obtained from the d'-dimensional principal component vector y_k as follows:

z_k = [ν_1, ν_2, ..., ν_{c−1}]^T y_k = T_LDA y_k = T_LDA T_PCA (x_k − m),   k = 1, ..., M    (17)

The obtained feature vectors of dimension (c − 1) are used in the next stage for final recognition.
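The PCA+LDA pipeline of Eqs. (13)–(17) can be sketched with scikit-learn, which uses the same scatter-matrix formulation internally. This is an illustrative sketch under the stated dimension constraints, not the authors' code; the function name and `var_kept` parameter are our own:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pca_lda_fit(X, labels, var_kept=0.90):
    """X: (M, d) training PEI vectors; labels: (M,) subject ids.
    PCA keeps at most M - c components (so S_W is non-singular) and
    trims to the smallest d' whose explained variance exceeds var_kept;
    LDA then yields (c - 1)-dimensional features, as in Eq. (17)."""
    M = X.shape[0]
    c = len(set(labels))
    pca = PCA(n_components=min(M - c, X.shape[1]))
    Y = pca.fit_transform(X)
    cum = np.cumsum(pca.explained_variance_ratio_)
    d_prime = int(np.searchsorted(cum, var_kept)) + 1
    d_prime = min(max(d_prime, c - 1), Y.shape[1])  # keep LDA feasible
    Y = Y[:, :d_prime]
    lda = LinearDiscriminantAnalysis(n_components=c - 1)
    Z = lda.fit_transform(Y, labels)  # (c - 1)-dimensional gait features
    return Z, pca, lda, d_prime
```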
2.5. Human recognition by combining Pose Kinematics and PEI
Since Pose Kinematics is obtained by simply counting the number of frames belonging to each key pose state, it is quite fast, whereas PEI's discrimination power is much higher than that of Pose Kinematics. On the other hand, PEI requires more computational time and storage space than Pose Kinematics, which makes it slower. To combine the advantages of both representations, we propose a hierarchical scheme for final recognition. Since Pose Kinematics provides a fast yet comparatively weaker classifier, we apply it in the first stage. Then, in the next stage, PEI based classification is done.
Let the set of training PK feature vectors be denoted by {p}, the corresponding PEI feature vector set by {e}, and the transformation matrix computed from the training data set by T. The class centers of the training data are

m_{p_i} = (1/n_i) Σ_{p∈P_i} p,   m_{e_i} = (1/n_i) Σ_{e∈E_i} e,   i = 1, ..., c

where c is the number of classes (subjects) in the database, P_i is the set of PK feature vectors belonging to the ith class, E_i is the set of PEI feature vectors belonging to the ith class, and n_i is the number of feature vectors in P_i or E_i. Given a test silhouette sequence F, we follow the steps discussed in Sections 2.1–2.3 to compute the PK gait features P_F = {P_1, P_2, ..., P_J} and the sequence of PEI features E_F = {E_1, E_2, ..., E_J}, where J is the number of complete gait cycles extracted from the test sequence. Then, the transformed feature vector set is obtained as follows:

{Ê_F}: Ê_j = T E_j,   j = 1, ..., J    (18)
At first, the PK based classifier is applied for recognition, and the distance between the test PK feature and the training PK data is defined as

D(P_F, P_i) = (1/J) Σ_{j=1}^{J} ||P_j − m_{p_i}||,   i = 1, ..., c    (19)
Then the test sequence is classified as subject a as follows:

a = arg min_{i∈C} D(P_F, P_i)    (20)

if min_{i=1}^{c} D(P_F, P_i) ≤ θ, where θ is a threshold determined experimentally. When the nearest distance of the Pose Kinematics based approach is low enough, its output class label is considered to be correct and final, and PEI is not applied in the next stage. Otherwise, the S nearest neighbors are selected from the training sample space, denoted as P_1, ..., P_S, S ≪ c. S is the rank of recognition performance at which the average accuracy over all possible probe and gallery sets is higher than 90%. Thus, we essentially restrict the number of classes to be searched using the PEI based feature. Then on these top S subject
classes we apply the PEI feature based method to get the final classification result. For the classifier based on the PEI feature, we define

D(E_F, E_i) = (1/J) Σ_{j=1}^{J} ||Ê_j − m_{e_i}||,   i ∈ S    (21)

Then the sequence is assigned to subject class b if

b = arg min_{i∈S} D(E_F, E_i)    (22)
This hierarchical method of feature combination achieves both fast computation and a high recognition rate.
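The two-stage rule above can be sketched as follows (NumPy; the function name is our own, `theta` and `S` are the experimentally chosen threshold and pruning rank, and the distances are the mean Euclidean distances of Eqs. (19) and (21) over the J cycles):

```python
import numpy as np

def hierarchical_classify(pk_feats, pei_feats, pk_centers, pei_centers, theta, S):
    """pk_feats: (J, Kp) PK features of the test sequence's J gait cycles.
    pei_feats: (J, De) projected PEI features of the same cycles.
    pk_centers, pei_centers: (c, Kp) and (c, De) per-class training means.
    Returns the predicted class index."""
    # Stage 1: Pose Kinematics distances, Eq. (19)
    d_pk = np.linalg.norm(pk_feats[:, None, :] - pk_centers[None], axis=2).mean(axis=0)
    if d_pk.min() <= theta:            # confident: accept the PK decision, Eq. (20)
        return int(d_pk.argmin())
    # Stage 2: prune to the S nearest classes, then decide by PEI, Eqs. (21)-(22)
    candidates = np.argsort(d_pk)[:S]
    d_pei = np.linalg.norm(
        pei_feats[:, None, :] - pei_centers[candidates][None], axis=2).mean(axis=0)
    return int(candidates[d_pei.argmin()])
```

Only the pruned candidate rows of the PEI centers enter the second stage, which is where the time saving over a full PEI search comes from.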
3. Experimental results
We evaluated the proposed algorithm under varied challenging conditions, such as variation in the size of the data set, walking surface, walking speed, carrying condition, shoe type, camera angle, clothing, time, etc. The gait databases used for conducting the experiments were the CMU MoBo (CMU) database [13] and the USF HumanID gait database [10]. We carried out the experiments on a 2 GHz Intel Core2 Duo computer with 2 GB RAM, in a Matlab 7.8 environment.
3.1. CMU MoBo database
The CMU MoBo database [13] consists of indoor video sequences of 25 subjects walking on a treadmill. Videos were captured in different modes of walking, namely, slow walk, fast walk, walking on an inclined plane, and walking with a ball in two hands. Each sequence is 11.33 s long, recorded at a frame rate of 30 frames per second. All silhouettes are vertically scaled, horizontally aligned and rescaled to 132 × 192. In this paper, all four types of walking, i.e., 'slow walk' (S), 'fast walk' (F), 'ball walk' (B) and 'inclined walk' (I), are used for both gallery and probe.
Given a probe sequence, the first step is to classify each silhouette into one of the key poses. This is done by following the steps described in Sections 2.1 and 2.2 (also see Fig. 1). Once the key pose labels are known for each
silhouette, the PK and PEI features are computed as described in Section 2.3. Then dimension reduction is done using either PCA or LDA. Finally, classification is done following the steps discussed in Section 2.5. These steps are shown in detail in Fig. 2.

Table 1
Recognition results on the MoBo data set. S/F represents Gallery S and Probe F.

Experiments  CMU [14]  UMD [15]  MIT [16]  SSP [17]  FSVB [11]  Pose Kinematics  PEI  Combination method
S/S (%)      100       100       100       100       100        100              100  100
S/F (%)      76        80        64        54        82         32               100  100
S/B (%)      92        48        50        –         77         84               92   92
S/I (%)      –         –         –         –         –          68               60   60
F/S (%)      –         84        –         32        80         52               88   88
F/F (%)      –         100       –         100       100        92               100  100
F/B (%)      –         48        –         –         61         28               64   60
F/I (%)      –         –         –         –         –          64               72   72
B/S (%)      –         68        –         –         89         80               92   92
B/F (%)      –         48        –         –         73         36               84   84
B/B (%)      –         92        –         –         100        92               100  100
B/I (%)      –         –         –         –         –          60               60   76
I/S (%)      –         –         –         –         –          76               60   76
I/F (%)      –         –         –         –         –          48               80   80
I/B (%)      –         –         –         –         –          40               32   48
I/I (%)      –         –         –         –         –          100              100  100

Table 1 shows the recognition performance of our algorithm compared against five existing methods. Existing methods show high recognition rates when the gallery and probe sets are either the same or have small shape variation (train with S and test with F, or train with F and test with S). For the other experiments, relatively low recognition rates are achieved, which indicates that those algorithms are not robust enough to appearance changes. Results of experiments that were not reported have been left blank in the table. In contrast, our algorithm shows the best classification accuracy across all types of gallery/probe combinations. The third last column shows the recognition accuracy with only the Pose Kinematics feature. As expected, the recognition result is not very high. The second last column shows the recognition accuracy with only PEI followed by PCA, which is higher than any of the existing methods. The recognition accuracy after hierarchical combination of the two features is shown in the last column. In the first stage, the Pose Kinematics based method selects the top 30% of the gallery set, on which PEI is then applied. The combined method achieves slightly better overall accuracy than the PEI based method alone. This occurs in the following situation. Say the test subject is actually subject A in the training data set. During PEI-only recognition, it could be classified as subject B if both have a similar physical build. However, during the Pose Kinematics based search space pruning in the combined method, subject B will not be selected because his kinematics does not match that of the test subject. Next, when PEI based recognition is applied on the reduced search space, the test subject is classified correctly as subject A (the option of subject B will not be there).

Table 2 shows the average time requirement for classifying a subject using either Pose Kinematics or PEI or the combined method. The average accuracy in Table 2
is obtained by taking the average of all accuracies for the different types of experiments performed in Table 1. As already stated, the time requirement using Pose Kinematics is low. On the other hand, PEI requires 83% more computational time than Pose Kinematics. After hierarchical combination of the two features, the time requirement is reduced by 18% compared to the PEI method alone. Although this seems small for one subject, when multiple subjects have to be recognized, the overall time saving is considerable. This is especially true for surveillance applications where raw video durations run into hours. It can also be noted that, in spite of the time improvement, accuracy is not adversely affected. Thus the final result is not constrained by the restricted search space obtained from the Pose Kinematics feature.
3.2. USF HumanID database
We also conducted experiments on the USF HumanID outdoor gait database (Version 2.1) [10]. The database consists of 1870 sequences of 122 subjects. For each subject, there are five covariates: viewpoint (left/right), surface (grass/concrete), shoe (A/B), carrying condition (with/without briefcase), and recording time and clothing (May/November). Depending on these covariates, 12 experiments, labeled A to L, are designed for testing.
Table 3 shows the comparative recognition results of our algorithm, combining the two proposed features, against the previously proposed GEI [3], EGEI [6] and AGEI [8] methods. GEI is obtained by averaging silhouettes over a gait cycle. To compute EGEI, variation analysis is done to find the dynamic region in GEI. Based on this analysis, a dynamic weight mask is constructed to enhance the dynamic region and suppress the noise in the unimportant regions. The gait representation thus obtained is called the EGEI. In AGEI, active or dynamic regions are extracted in a different manner. From a gait silhouette sequence, the active regions are extracted by calculating the difference of two adjacent silhouette images. Then the AEI is constructed by accumulating these active regions.

Table 2
Time required for each feature and combined method.

Approaches        Time (s)  Average accuracy (%)
Pose Kinematics   35.5      66.25
PEI               64.8      80.25
Combined method   54.03     82.75

Table 3
Recognition results on the USF gait data set.

Approaches            A   B   C   D   E   F   G   H   I   J   K   L   Mean
GEI [3]        PCA   80  87  72  26  22  11  13  47  40  37   6   6  39.25
               LDA   88  89  74  25  25  15  20  52  53  56   9  18  45.93
EGEI [6]       PCA   80  87  70  24  20  11  12  48  40  40   6   3  39.30
               LDA   89  89  76  22  28  14  20  52  52  58  12   1  46.14
Active GEI [8] PCA   83  91  76  17  22   7   7  75  67  51   3   3  45.02
               LDA   89  93  82  22  26  13  14  83  71  61   9   9  51.22
Proposed       PCA   81  93  76  47  31  18  24  61  53  38   6   9  47.73
               LDA   85  94  78  49  33  22  26  71  69  47  12  12  53.11

Here we applied PCA as well as PCA+LDA dimensionality reduction techniques for performance comparison. It can be observed from the table that for probes H, I and J (carrying a briefcase), our approach gives lower performance than the AGEI method. On the other hand, AGEI performs poorly on probes D, E, F and G, where the walking surface changes. The basic difference between these two covariates is that the first one affects the appearance while the other affects the walking pattern. Our proposed PEI captures dynamics at a higher resolution than AGEI. So, the performance of our method is less affected by surface change compared to AGEI, which fails to capture dynamic variation when the walking surface changes (probe sets D–G). AGEI performs better in the briefcase-carrying situation (probe sets H–J) because, at the time of computing AGEI, the stationary regions are deleted by taking the difference between two adjacent silhouette images in a sequence. Since the briefcase regions are stationary between two adjacent images, they are not considered in AGEI. Thus, it suppresses appearance variation by concentrating only on active regions.

However, the goodness of any approach is measured by considering the performance over all the probes. According to the weighted mean recognition results over all 12 probes, our PEI and Pose Kinematics based approach outperforms all of the existing gait feature representation methods. To judge the ranking capabilities of the proposed approach, we plot the cumulative match characteristic (CMC) curves for the 12 probes, shown in Fig. 9. Here PCA is used as the dimension reduction method. The curve corresponding to each experiment represents the variation in the probability of recognizing a subject as the rank increases from 1 to 20. The weighted mean value of accuracy is also shown on the same plot. It can be observed that the weighted mean accuracy almost saturates (at 75–85%) beyond a rank value of 12.

4. Conclusions

In this paper, two new gait representation approaches called Pose Kinematics and Pose Energy Image (PEI) are proposed to capture the motion information in detail. Compared with the other GEI based methods like conventional GEI [3], EGEI [6], AGEI [8], PEI represents
Fig. 9. Cumulative match characteristic curves of all the probe sets (Recognition Rate (%) vs. Rank, for probes A–L and their weighted mean).
minute motion information by increasing the resolution of GEI. The second feature, namely Pose Kinematics, captures pure dynamic information without involving the shape of a silhouette. Since the discriminative power of pure dynamic information is not high enough, this feature is less robust. On the other hand, PEI preserves detailed dynamic information as well as shape information, which makes it more discriminative. But since the PEI feature is slower to compute, Pose Kinematics is used in the first stage to select a set of most probable classes based on dynamic information only. The PEI based approach is then applied on these selected classes to get the final classification result. Thus, the hierarchical method of feature combination proposed here achieves both accuracy and efficiency. Experimental results demonstrate that the proposed approach performs better than the other existing GEI based and non-GEI based approaches. Here we applied two classical techniques, namely PCA and LDA, for dimensionality reduction and discriminative feature extraction. In future work, we will attempt to apply other recently proposed robust dimensionality reduction methods to achieve higher accuracy.
Acknowledgment
This work is partially supported by the project grant 1(23)/2006-ME TMD, Dt. 07/03/2007, sponsored by the Ministry of Communication and Information Technology, Government of India, and also by an Alexander von Humboldt Fellowship for Experienced Researchers.
References

[1] N.V. Boulgouris, D. Hatzinakos, K.N. Plataniotis, Gait recognition: a challenging signal processing technology for biometric identification, IEEE Signal Processing Magazine 22 (6) (2005) 78–90.
[2] A.F. Bobick, J.W. Davis, The recognition of human movement using temporal templates, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (3) (2001) 257–267.
[3] J. Han, B. Bhanu, Individual recognition using gait energy image, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2) (2006) 316–322.
[4] J. Liu, N. Zheng, Gait history image: a novel temporal template for gait recognition, in: Proceedings of the IEEE Conference on Multimedia and Expo, 2007, pp. 663–666.
[5] Q. Ma, S. Wang, D. Nie, J. Qiu, Recognizing humans based on gait moment image, in: Eighth ACIS International Conference on SNPD, 2007, pp. 606–610.
[6] X. Yang, Y. Zhou, T. Zhang, G. Shu, J. Yang, Gait recognition based on dynamic region analysis, Signal Processing 88 (9) (2008) 2350–2356.
[7] C. Chen, J. Liang, H. Zhao, H. Hu, J. Tian, Frame difference energy image for gait recognition with incomplete silhouettes, Pattern Recognition Letters 30 (11) (2009) 977–984.
[8] E. Zhang, Y. Zhao, W. Xiong, Active energy image plus 2DLPP for gait recognition, Signal Processing 90 (7) (2010) 2295–2302.
[9] A. Kale, A. Sundaresan, A.N. Rajagopalan, N.P. Cuntoor, A.K. Roy-Chowdhury, V. Kruger, R. Chellappa, Identification of humans using gait, IEEE Transactions on Image Processing 13 (2004) 1163–1173.
[10] S. Sarkar, P.J. Phillips, Z. Liu, I. Robledo-Vega, P. Grother, K.W. Bowyer, The human ID gait challenge problem: data sets, performance, and analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2) (2005) 162–177.
[11] S. Lee, Y. Liu, R. Collins, Shape variation-based frieze pattern for robust gait recognition, in: Proceedings of the IEEE Conference on CVPR, 2007, pp. 1–8.
[12] L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77 (2) (1989) 257–286.
[13] R. Gross, J. Shi, The CMU Motion of Body (MoBo) Database, Technical Report CMU-RI-TR-01-18, Robotics Institute, Carnegie Mellon University, 2001.
[14] R. Collins, R. Gross, J. Shi, Silhouette-based human identification from body shape and gait, in: International Conference on Automatic Face and Gesture Recognition, 2002, pp. 351–356.
[15] A. Veeraraghavan, A.R. Chowdhury, R. Chellappa, Role of shape and kinematics in human movement analysis, in: Proceedings of the IEEE Conference on CVPR, 2004.
[16] L. Lee, W. Grimson, Gait analysis for recognition and classification, in: Proceedings of the International Conference on Automatic Face and Gesture Recognition, 2002, pp. 155–162.
[17] C. BenAbdelkader, R.G. Cutler, S. Davis, Gait recognition using image self-similarity, EURASIP Journal on Applied Signal Processing 2004 (4) (2004) 572–585.
[18] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1) (1991) 71–86.