Fusion of Random Walk and Discrete Fourier Spectrum Methods for Gait Recognition

Priyanka Chaurasia, Member, IEEE, Pratheepan Yogarajah, Member, IEEE, Joan Condell, and Girijesh Prasad, Senior Member, IEEE

Priyanka Chaurasia is a research associate at the Connected Health Innovation Centre, School of Computing and Maths, Ulster University, Jordanstown BT37 0QB, United Kingdom ([email protected]).
Pratheepan Yogarajah is a lecturer in Computing Science at the Intelligent Systems Research Centre, School of Computing & Intelligent Sys, Ulster University, Derry~Londonderry BT48 7JL, United Kingdom ([email protected]).
Joan Condell is a senior lecturer at the Intelligent Systems Research Centre, School of Computing & Intelligent Sys, Ulster University, Derry~Londonderry BT48 7JL, United Kingdom ([email protected]).
Girijesh Prasad is a professor at the Intelligent Systems Research Centre, School of Computing & Intelligent Sys, Ulster University, Derry~Londonderry BT48 7JL, United Kingdom ([email protected]).

Abstract—Gait-based person identification suffers from the problem of different covariate factors, such as clothing and carrying objects, which drastically reduce the recognition rate. Most existing methods capture dynamic and static information and remove the covariate factors without any systematic study. However, it has been reported in the literature that the head is one of the important features and that removing the head from the static information decreases the recognition rate. In our preliminary study, we developed a novel random walk (RW)-based gait extraction method that retains the head portion and removes certain static body parts to reduce the effect of covariate factors. The RW-based method should be exploited further for its discriminative power to separate different body parts efficiently. However, the dynamic part is also significant in gait information, and it is not very effectively represented in the RW-based gait extraction method. Therefore, a discrete Fourier transform (DFT)-based frequency component of the gait is considered to represent the dynamic part of gait information. Further, we propose a novel gait recognition algorithm that fuses dynamic and static information from the DFT- and RW-based representations. The proposed method systematically retains the discriminative static gait information along with the frequency attribute embedded as the dynamic gait information. Extensive experiments on the CASIA and HumanID data sets demonstrate that the proposed fused gait feature-based approach outperforms the existing methods, particularly when there are substantial appearance changes.

Index Terms—gait recognition, covariate factors, random walk, discrete Fourier transform.

I. INTRODUCTION

Person identification research is on a journey toward a robust system operating in real security and surveillance applications. The way an individual normally walks is one of those distinctive features that can be captured at a distance, without the subject's cooperation, from low-resolution surveillance videos. These key features of gait make it ideal for use in surveillance applications [1]. Recent interest in developing more robust gait-based identification systems for surveillance applications has led to significant advancements in this area.

The gait-based recognition process can be divided into three major categories: spatiotemporal based, model based, and appearance based. Spatiotemporal-based methods uncover gait shape variation information in both the spatial and temporal domains. Shape variation-based frieze features [3] and space-time interest point-based gait features [4] are examples of spatiotemporal-based methods. Model-based methods aim to model the body and shape of a person while he/she is walking [5]. Nevertheless, the associated cost of building model-based methods is relatively high, which makes them impractical in scenarios where cost is a concern [5]. Appearance-based methods are cost-effective and focus on extracting the static (e.g. head and torso) and/or dynamic (e.g. motion of each arm, hand, and leg) information of a walking human from a sequence of binary silhouettes [6, 7]. Thus, appearance-based methods are more suitable for surveillance applications.

Although appearance-based methods are cost effective and can work in low-resolution videos, they are sensitive to the variation occurring in human appearances. This is primarily due to the fact that body-related parameters (head and torso) are not very robust, as they are dependent on clothing, carrying bags, and other factors [8]. As a consequence, the performance of gait recognition systems deteriorates due to varying covariate conditions. To make appearance-based identification systems practical for surveillance applications, it is critical to solve the issues related to varying human appearances. Covariate factors can be either related to the subject itself (e.g. carrying bags and different clothing conditions) or to the environment (e.g. different walking surface) [8]. Another kind of covariate can be generated by fluctuations occurring in the phases between a matching pair of gait image sequences because of temporal variability [9]. Body-related covariate factors such as clothing and a carrying bag that are mostly attached to the upper body part cause a change in the appearance of an individual. As a consequence, the resulting gait feature contains irrelevant information mostly attached to the static part primarily consisting of the torso region. Consequently, this irrelevant information should be judiciously removed to improve the performance of gait recognition methods.

Appearance-based gait recognition methods use static features [6], dynamic features [7], or a fusion of static and dynamic features [2] for recognition. However, due to the changes in appearance caused by body-related covariate factors, the torso region is likely to include covariate factors that thicken the torso region (i.e. make the torso region bulkier). Such covariate factors have been mostly found to be associated with the static parts of the subject (i.e. the torso). Conversely, most of the dynamic parts of the subject (i.e. hands, legs, and feet) are less affected by covariate factors and produce reliable gait information that can be used for identification purposes. As a consequence, certain regions of the static parts (mostly the torso) are required to be removed. Additionally, it has been suggested in the literature that parts-based gait features have better accuracy than full body-based gait features in the case of clothing covariates [10].

With the objective of bringing gait-based identification into real security and surveillance applications, our work is focused on developing gait representation methods that are free from body-related covariate factors. These methods have been developed with the view to construct invariant gait features for the same individual even when that individual appears with different clothing and carrying bag conditions. We aim to identify and remove/reduce the effect of body-related covariate factors for improved gait recognition. In our first attempt to use the RW for segmenting gait silhouettes into different regions, a preliminary study was carried out in [11]. The RW-based method is used for image segmentation, and the segmentation problem is solved using Poisson's equation. We call the resulting feature PRWGEI, a Poisson's equation-derived RW-based gait energy image (GEI). The PRWGEI feature was developed to remove certain static body parts to reduce the effect of covariate factors. The PRWGEI gait feature has an improved recognition rate of 78.6%, which is higher than that of the other existing methods proposed in the literature for handling body-related covariate factors and tested on the Chinese Academy of Sciences, Institute of Automation (CASIA)-B data set. The RW-based thickness characteristic allows better identification of regions that are affected by covariate factors, such as carrying an object or wearing heavy clothing, and can be used to easily remove these factors from the normal body regions of a human body. The resulting PRWGEI gait feature significantly retains static information, including the head, along with a reduced effect of covariate factors in the torso. However, the regions capturing the motion are not effectively represented. Conversely, the dynamic information is also a substantial part of the gait, and Bertenthal et al. [12] identified that the frequency component is a significant property in the human perception of the gait. Having gained a significant advantage in obtaining a covariate-free static feature, in this paper we consider the dynamic regions of the gait and propose a methodology that merges covariate-free static and dynamic features of the gait to gain a higher recognition rate. Thus, to better represent the dynamic regions, we incorporate a DFT-based gait feature, the discrete Fourier energy image (DFEI), into our existing PRWGEI feature. The DFEI feature emphasizes the dynamic regions of the gait, whereas the PRWGEI represents covariate-free static regions. The novelty of our work lies in using a covariate-free static part and merging it with the dynamic part of the gait. In contrast to our approach, earlier gait recognition approaches have fused static and dynamic parts. However, obtaining a covariate-free upper body part that significantly retains static information such as the head has been ignored, which is very important for accurate recognition.

The main contributions of our paper can be summarized as follows: (1) presenting a systematic way of retaining the discriminative static and dynamic body parts, addressing the lack of a systematic study of covariate removal in the existing literature; (2) obtaining dynamic information on the lower body part using the frequency component of the gait; and (3) combining the appropriate static and dynamic parts of gait information to obtain a novel gait feature that has covariate-free static body parts along with dynamic information. The evaluation of the proposed methodology is performed on two standard gait data sets: the CASIA-B data set [13] and the HumanID Gait Challenge data set available from the University of South Florida (USF) [8]. As a benchmark, the obtained results are compared to the results of other appearance-based gait features previously proposed for these data sets. In this paper, we present an extended description of the novel gait feature extraction method, PRWGEI, developed to deal with appearance variations by retaining static parts; its extension to capture the dynamic characteristics by combining it with a frequency domain feature via the DFT; its application to a more challenging, close to real-world gait data set; and the advantages gained over existing methods considering the same challenge.

II. RELATED WORK

In appearance-based methods, the GEI [12] is a commonly used approach to represent human motion sequences. Gait recognition using the GEI feature under normal conditions achieves an improved recognition rate. However, under the effect of covariate factors, such as clothing and carrying a bag, the GEI feature does not attain a good recognition rate. This is because appearance-based methods are sensitive to the variation caused by different clothing and carrying bag conditions. Beyond the GEI, several appearance-based gait features have been proposed in the literature to reduce the effect of covariate factors. Gait feature representation methods, such as the gait entropy image (GEnI) [12], enhanced GEI (EGEI) [17], active energy image (AEI) [18], masked gait image ($MG_{ij}$) [19], and gait flow image (GFI) [20], have been reported to reduce the effect of different clothing and carrying bag covariate factors (Fig. 1). However, certain important issues are not adequately addressed. For instance, it can be seen in Fig. 1 that major parts of the bag and coat that should have been removed still remain in the resulting gait features. Additionally, in all these methods, the static parts are unsystematically removed and almost all the upper body parts are lost, including the head, which is significantly important. These methods try to remove the static parts under the assumption that covariate factors such as different clothing and carrying bags are attached to the upper body part. However, as can be seen in Fig. 1, although the static parts are removed, the resulting gait features still retain the covariates attached to the upper body part. The random removal of static information such as the head reduces the recognition rate. As a result, these methods produce lower recognition rates for different clothing covariate gait sequences and a reasonable recognition rate for carrying bag covariate gait sequences.

Fig. 1. Different gait representation methods for a particular individual for normal, carrying bag, and different clothing walking conditions from the CASIA-B gait data set.

Matovski et al. [15] reported that different clothing conditions drastically affect the performance of appearance-based methods in comparison to other covariates such as footwear, speed, and viewpoint. The experimental results detailed in [16] also report that covariate factors such as different clothing and carrying bag conditions highly reduce the gait recognition rate. The gait features used for appearance-based recognition can be broadly divided into two categories, namely, static and dynamic. In recent research, Wang et al. [5] showed that a promising recognition rate can be achieved using static gait features. On the contrary, Cutting and Proffitt [14] argued that dynamic features contribute significantly more to individual recognition than static features. Instead, Lam et al. [2] preferred to fuse both static and dynamic cues in the belief that the fusion would yield an optimal gait recognition rate. Altab Hossain et al. [10] proposed a method based on known anatomical body properties to study the influence of body parts on gait recognition. The study revealed that parts-based gait feature representation methods are more discriminative: their recognition rates under the clothing covariate are improved in comparison to full body-based gait feature representation methods. Li et al. [21] analyzed the effects of the removal of different body parts of the GEI on gait recognition. In their evaluation, they showed that the removal of certain body parts might increase or decrease the recognition rate. For example, the removal of the head decreases the recognition rate, whereas the removal of the thighs increases the recognition rate. Bertenthal et al. [22] identified that the frequency components are also an important property in the human perception of gaits. These frequency components reflect the frequency features of an individual's pose in the current frame.

The GFI was developed by Bashir et al. [20] based on optical flow fields that capture both motion intensity and motion direction. So far, the GFI has shown better recognition results for the carrying and clothing covariate factors on the CASIA-B gait covariate data set than the other methods discussed in the literature. Nevertheless, some important issues are not adequately accounted for. The static parts are removed from the GFI without any systematic study of the influence of the removal of body parts on gait recognition. Also, in the GFI, the frequency information representing the frequency features of a person's pose in the current frame is ignored. The GFI feature has an average recognition rate (76.6%) lower than that of our earlier reported PRWGEI feature (78.6%). However, the dynamic parts in our feature are not effectively represented. Therefore, to further improve the gait recognition accuracy, we have been motivated to enhance the dynamic information of the gait. Enhancing the dynamic information in our existing PRWGEI feature results in another novel feature that represents both the static and dynamic parts of the gait more efficiently.

III. GAIT FEATURE EXTRACTION

Given a human walking sequence, the silhouettes are extracted using a Gaussian model-based background estimation method [23]. The extracted silhouettes are resized to a fixed size of 128×100 pixels. The purpose of resizing is to eliminate the scaling effect. Each of the resized silhouette images is aligned horizontally with respect to its horizontal centroid. After applying the horizontal alignment, the gait period is estimated for an individual’s walking sequences.
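As an illustration, the following is a minimal sketch of this silhouette normalization step in Python with OpenCV and NumPy; the helper name and the exact cropping and shifting choices are assumptions for illustration, not the authors' implementation.

```python
import cv2
import numpy as np

def preprocess_silhouette(sil, height=128, width=100):
    # Crop to the bounding box of the foreground pixels.
    ys, xs = np.nonzero(sil)
    sil = sil[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # Resize to a fixed 128x100 frame to eliminate the scaling effect
    # (cv2.resize takes the target size as (width, height)).
    sil = cv2.resize(sil.astype(np.uint8), (width, height),
                     interpolation=cv2.INTER_NEAREST)
    # Horizontal alignment: shift the columns so that the horizontal
    # centroid of the foreground sits at the image centre.
    cols = np.nonzero(sil)[1]
    shift = width // 2 - int(round(cols.mean()))
    return np.roll(sil, shift, axis=1)
```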

A. Gait Period Estimation

A single gait cycle can be considered as the period in which an individual moves from a mid-stance (both legs are close together) position to a double-stance position (both legs are far apart), then the mid-stance position, followed by the double-stance position, and finally back to the mid-stance position and vice versa [8]. The gait period can then be estimated by calculating the number of foreground pixels in the lower one-third of a silhouette image. In the mid-stance position, the silhouette image contains the minimum number of foreground pixels [8]. In the double-stance position, the silhouette contains the maximum number of foreground pixels. The gait period is then calculated using the distance between the three consecutive minima or maxima.
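A minimal sketch of this gait period estimation follows, assuming a list of binary silhouette frames; the function name is illustrative.

```python
import numpy as np

def gait_period(silhouettes):
    h = silhouettes[0].shape[0]
    # Foreground pixel count in the lower one-third of each frame.
    counts = np.array([np.count_nonzero(s[2 * h // 3:]) for s in silhouettes])
    # Local minima of the count correspond to mid-stance frames.
    minima = [t for t in range(1, len(counts) - 1)
              if counts[t] < counts[t - 1] and counts[t] < counts[t + 1]]
    # Mid-stance occurs twice per cycle, so the span of three consecutive
    # minima covers one full gait period.
    return minima[2] - minima[0]
```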

B. Covariate-Free Static Feature Extraction

We obtain the static part for the proposed fused gait feature template from the PRWGEI gait feature described in our previous work [11]. The PRWGEI gait feature is based on using the RW [24] for image segmentation and is solved using the Poisson’s equation. The Poisson’s equation is applied to each of the gait binary silhouettes in the estimated gait cycle and is used to separate different body parts. The extracted binary silhouette has a human object, S, which is surrounded by a closed contour, ∂S, as its boundary. In the RW approach, each pixel in the silhouette is assigned a particular value. For each

Page 4: INTRODUCTIONuir.ulster.ac.uk/38099/1/PRW_DFT_paper_accpetedVersion.docx · Web viewRecent interests shown in developing more robust gait-based identification systems for surveillance

pixel (x,y) ∈ S, a mean time, U(x,y), is calculated that represents the estimated number of steps taken starting from the given pixel to hit the boundary. The value of U(x,y) is computed recursively as follows: (1) At the boundary of S, U(x,y) = 0, and (2) for all the points within S, U(x,y) is equal to the mean value of its four immediate neighbors and a constant that represents the time taken to reach the immediate neighbor and is given as

$$U(x,y)=\frac{1}{4}\left[U(x+1,y)+U(x-1,y)+U(x,y+1)+U(x,y-1)\right]+1 \qquad (1)$$

The constant 1 in (1) indicates that the speed of the RW is one pixel per time step. The RW algorithm involves a large number of random paths; therefore, the computational cost of simulating the RW algorithm is high. To solve (1), Poisson's equation is used and is given as

$$\begin{cases} \Delta U = v & \text{in } S \\ U = w & \text{on } \partial S \end{cases} \qquad (2)$$

where $\Delta U = \dfrac{\partial^2 U(x,y)}{\partial x^2} + \dfrac{\partial^2 U(x,y)}{\partial y^2}$, $v$ and $w$ are the known functions, and $\Delta$ is the 2D Laplace operator. The discrete form of the partial differential equation, $\Delta U$, can be represented using a five-point finite difference method as in [25]:

$$U(x,y)=\frac{1}{4}\left[U(x+1,y)+U(x-1,y)+U(x,y+1)+U(x,y-1)-h^{2}\,\Delta U(x,y)\right] \qquad (3)$$

where h is the distance between two pixels, and in our case, it is equal to 1 as we consider immediate neighbors. Equations (1) and (3) are used to get the solution of (2) with the Dirichlet boundary condition as in [24]:

$$\begin{cases} \Delta U(x,y) = -4 & \text{if } (x,y)\in S \\ U(x,y) = 0 & \text{if } (x,y)\in \partial S. \end{cases} \qquad (4)$$

There are different methods to solve (4), such as cyclic reduction, multigrid methods, and successive over-relaxation [25]. In our work, we used the successive over-relaxation method to solve (4). Poisson's equation can be used to extract a variety of beneficial properties of a binary silhouette. In our case, we used $U$ and the gradient of $U$ to separate different parts of the human based on their thickness characteristic. A function, $\Phi$, is defined as

$$\Phi(x,y)=U(x,y)+\left\lVert \nabla U(x,y)\right\rVert^{2}. \qquad (5)$$

The function $\Phi$ has a distinctive characteristic that separates different parts based on their thickness. However, for better thresholding, it is desirable to separate the intensity values of the different parts further. To do this, first the logarithm of $\Phi$ is taken as $\Psi = \log(\Phi)$. $\Psi$ is then scaled to make its intensity values range between 0 and 255 for grayscale image normalization. As shown in Fig. 2(c), the values of $\Psi$ are much more strongly divided into the torso part and the other parts, making it easier to remove the torso part. Following the calculation and scaling of $\Psi$, a new binary silhouette with reduced covariate effects is obtained, called the PRW silhouette (PRWsil):

$$\mathrm{PRW}_{sil}(x,y)=\begin{cases} 1 & \text{if } \Psi(x,y) < \theta \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

where θ is a threshold value selected to separate different body parts based on the intensity values.
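The pipeline from Eq. (4) to Eq. (6) can be sketched as follows in NumPy; the relaxation factor, iteration count, and epsilon guard are illustrative assumptions, and a vectorized solver would be used in practice.

```python
import numpy as np

def prw_silhouette(sil, theta=160, omega=1.8, iters=500):
    U = np.zeros(sil.shape, dtype=float)
    inside = sil > 0                      # U stays 0 on/outside the boundary
    for _ in range(iters):
        # Successive over-relaxation sweep for Delta U = -4, Eq. (4).
        for y in range(1, sil.shape[0] - 1):
            for x in range(1, sil.shape[1] - 1):
                if inside[y, x]:
                    new = 0.25 * (U[y + 1, x] + U[y - 1, x]
                                  + U[y, x + 1] + U[y, x - 1] + 4.0)
                    U[y, x] += omega * (new - U[y, x])
    gy, gx = np.gradient(U)
    phi = U + gx ** 2 + gy ** 2           # thickness characteristic, Eq. (5)
    psi = np.zeros_like(phi)
    vals = np.log(phi[inside] + 1e-9)     # Psi = log(Phi); epsilon avoids log(0)
    psi[inside] = 255 * (vals - vals.min()) / (np.ptp(vals) + 1e-9)
    # Keep the thin (non-torso) parts, Eq. (6).
    return ((psi < theta) & inside).astype(np.uint8)
```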

Fig. 2: (a) Binary silhouette. (b) U of (a). (c) Computed Ψ of (a).

Our aim is to include the head information and remove much of the torso part that contains covariate factors. The threshold value of the thickness characteristic function is preferably selected so that it separates the torso region from the rest of the subject region more accurately. The threshold value may be a predetermined value (i.e. determined before the evaluation of the thickness characteristic function) or it may be determined after the analysis of the thickness characteristic function. Based on the empirical evaluation, the results showed that θ = 160 provided better results. Fig. 3 illustrates the PRWsil obtained for different values of θ applied to a Ψ image.

Fig. 3. The leftmost image is Ψ; the other images show the PRWsil obtained for different values of θ.

The PRWsil is obtained for each frame in the estimated gait cycle of a given video sequence. The final gait feature template, PRWGEI, is a Poisson's equation-derived RW-based GEI, computed as the average of the set of PRWsil images obtained from the corresponding frames in a gait cycle of a given video sequence:

$$\mathrm{PRW}_{GEI}(x,y)=\frac{1}{N}\sum_{n=1}^{N}\mathrm{PRW}_{sil}^{\,n}(x,y) \qquad (7)$$

where $\mathrm{PRW}_{sil}^{\,n}$ is the PRWsil corresponding to the nth frame in a particular gait cycle and $N$ is the total number of frames in a gait cycle. The value of $N$ varies with the number of frames in the gait period of a given video sequence. Following this, the resulting PRWGEI is scaled to make its pixel intensity value range between 0 and 255. It is to be noted that the RW-based gait feature template, PRWGEI, is a grayscale image.
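A minimal sketch of Eq. (7) with the subsequent 0-255 rescaling; the function name is illustrative.

```python
import numpy as np

def prwgei(prw_sils):
    # Average the PRWsil frames over one gait cycle, Eq. (7).
    g = np.mean(np.stack(prw_sils).astype(float), axis=0)
    # Rescale the grayscale template to the 0-255 range.
    return np.uint8(255 * (g - g.min()) / (np.ptp(g) + 1e-9))
```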

1) Advantages of Using the RW Approach

In normal (non-covariate) conditions, the torso provides useful gait information. Removing the torso parts of the silhouettes therefore removes some useful gait identification information, but the disadvantages of this are outweighed by the advantages of removing the covariate factors. The thickness characteristic, Φ, of the RW-based approach allows a better identification of the regions of the images that are most affected by body-related covariate factors. The RW is preferred over other functions [such as the distance transform (DT) function], as it grows quickly away from the boundary of the subject region and therefore allows easier thresholding to separate thicker regions from less thick regions. This allows the identified boundary of the thick region to be closer to the actual subject region boundary, thereby allowing the removal of a greater portion of the torso region and possible covariate factors. A comparative illustration of the RW- and DT-based thickness characteristic functions is shown in Fig. 4. It can be seen that Fig. 4(b) has a steeper gradient in the torso region. This makes the thresholding step better at removing the torso and keeping the rest of the subject region. However, that is not the case in Fig. 4(d), as distinguishing the torso from the other regions using thresholding is a difficult task.

Fig. 4. Thickness characteristic function: (a) U and (c) DT and the corresponding logarithm of the thickness characteristic function, Ψ = log(Φ), in (b) and (d).

Previous attempts at segmenting the GEI have only identified different parts of the subject in a crude manner. Such methods can, for example, identify a general torso region, but the crude approximation may miss parts of the torso and will most likely also miss static parts of the covariate factors, such as the bottom of a long coat or a bag carried by the side. The RW-based approach instead analyzes each of the binary silhouettes to identify the thicker parts of the silhouettes (i.e. the parts having more bulk associated with them). Body-related covariate factors will generally combine with the torso parts, and together they form the bulkier parts of the silhouettes. Thus, removing the thicker regions removes more of the effects of the covariate factors. Furthermore, previous attempts to remove covariate factors have identified segments of the GEI. The GEI is an averaged image based on the combination of all the silhouettes in the sequence. The averaging effect of the GEI means that it is not easy to identify thick or bulky parts of the image. In contrast, the RW-based approach analyzes each of the silhouettes in the given sequence separately and calculates and applies the thickness characteristic to each silhouette to identify the thicker, more static parts of the subject. This allows better tracking and thus better removal of the covariate factors, which can often change position relative to the subject from frame to frame. It can be further argued that, if the static part of the head is significantly important, it would be simpler to just cut the head from the GEI and fuse it with the DFT feature. However, cutting the head from the GEI may not always result in a clean head, specifically in the case of heavy clothing with a hood attached near the neck. Additionally, locating the head manually would be a tedious process, and in any case, an automated approach is required for building a gait-based person identification system targeted at surveillance applications, where a large volume of data is required to be processed. Fig. 5 shows the PRWGEI features for an individual under the normal, carrying bag, and different clothing conditions from the CASIA-B database [13]. It can be seen in Fig. 5 that the RW-based feature extraction is effectively able to separate the covariate factors and retain significant static parts such as the head. However, in comparison to the head portion, the lower body part representing the dynamic part of the gait has lower pixel intensity values. As a result, the dynamic information is not efficiently represented in the PRWGEI gait feature. Therefore, to effectively represent the dynamic information of the gait, the DFT-based gait feature representation is explored next.

Fig. 5. PRWGEI for an individual from the CASIA-B gait database [13]. The columns from left to right represent PRWGEI features for the normal, different clothing, and carrying objects covariate factors.

C. DFT-Based Dynamic Feature Extraction

Following the work described in [26], we incorporate the frequency component of the gait and develop a DFT-based gait feature, DFEI. The Fourier analysis of gait signals is very appealing, as most discriminative information can be compacted into a few Fourier coefficients, resulting in a very efficient gait representation [26]. A 1D Fourier transform for each pixel of the binary silhouettes in a gait cycle is computed along the frame axis as in [26]:

$$D_k(x,y)=\sum_{n=1}^{N_{gait}} S_n(x,y)\, e^{-j\omega_0 k n} \qquad (8)$$

where $S_n$ is the silhouette of the nth frame in a gait cycle, $x$ and $y$ are the image coordinates of the nth frame, $\omega_0 = 2\pi/N_{gait}$ is the base angular frequency for the gait cycle consisting of $N_{gait}$ frames, and $D_k(x,y)$ is the DFT for $k$ times the base frequency, with $k = 0, 1, 2, 3, \ldots$. The work in [26] considers the magnitude and phase of the DFT to represent the gait features. Because in our work we aim at extracting the motion intensity of a walking human as the dynamic part, we consider the magnitude of the DFT, which is calculated as follows:

$$A_k(x,y)=\frac{1}{N_{gait}}\left| D_k(x,y)\right| \qquad (9)$$

where $A_k(x,y)$ is the magnitude spectrum of $D_k(x,y)$ normalized by the number of frames, $N_{gait}$. Then, $A_k(x,y)$ is scaled to make its values range between 0 and 255 for grayscale image normalization. The work in [26] considers the camera viewing angle as a covariate factor and aims at normalizing the effect of camera view direction using the frequency components. It is found that not all the frequency components are useful in representing the gait features. Therefore, the DFT-based gait features for the view-invariant gait recognition approach in [26] are represented using k values ranging from 1 to 5.
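A minimal NumPy sketch of Eqs. (8) and (9); the per-k rescaling follows the grayscale normalization described above, and the function name is illustrative.

```python
import numpy as np

def dft_magnitude_features(silhouettes):
    stack = np.stack(silhouettes).astype(float)   # shape (N_gait, H, W)
    n_gait = stack.shape[0]
    # Per-pixel 1D DFT along the frame axis, Eq. (8).
    D = np.fft.fft(stack, axis=0)
    # Normalized magnitude spectrum, Eq. (9); A[k] is one image per k.
    A = np.abs(D) / n_gait
    # Rescale each A_k to the 0-255 range for grayscale normalization.
    A -= A.min(axis=(1, 2), keepdims=True)
    A = 255 * A / (A.max(axis=(1, 2), keepdims=True) + 1e-9)
    return A    # A[0] corresponds to the GEI; A[2] is the DFEI template
```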

In our case, different clothing and carrying bags are considered as covariate factors. Therefore, the appropriate k value that best represents our case is chosen based on an experimental and logical evaluation of the gait feature. We take a random individual from the CASIA-B data set and extract the DFT-based gait feature, Ak(x,y), using k values ranging from 0 to 5. Fig. 6 illustrates the DFT-based gait feature obtained for the different values of k for a particular individual from the CASIA-B data set [13]. In Fig. 6, (a) to (f) show the DFT-based gait feature (using the magnitude of the DFT), Ak(x,y), for the frequency component k. The DFT-based gait feature for k = 0 represents the GEI [Fig. 6(a)]. As seen in Fig. 6, different values of k provide varying dynamic information of the gait feature. In our work, the main aim is to reduce the effect of body-related covariate factors from the given gait sequence and also retain as much dynamic information as possible. Hence, the DFT-based gait feature, A2(x,y), best represents our case, where the lower body part has higher pixel intensity values and the effect of covariate factors on the upper body part is reduced [Fig. 6(c)].

Fig. 6. (a–f) DFT-based gait features using k = 0, 1, 2, 3, 4, and 5 for a particular individual from the CASIA-B gait database, respectively. Rows from top to bottom represent the normal, carrying bag, and wearing coat conditions.

As further confirmation, an experimental evaluation is carried out to find the value of k that helps in extracting the most suitable gait feature. The DFT-based gait feature, Ak(x,y), is a 2D image and represents the gait features in the frequency domain for different values of k. To find a suitable Ak(x,y) that best represents the gait feature, the sum of Ak(x,y) is computed as

$$SOA_k=\sum_{x}\sum_{y} A_k(x,y) \qquad (10)$$

for each k value. The computed value of SOAk represents the total gait information contained in the corresponding Ak(x,y) gait feature. Fig. 7 shows SOAk plotted against k values ranging from 0 to 10 for the normal, carrying bag, and clothing conditions for a random subject picked from the CASIA-B data set. It can be seen in Fig. 7 that, for k = 0, the SOAk values are the highest for all the cases of normal, carrying bag, and clothing conditions. However, as discussed earlier, the DFT-based gait feature for k = 0 represents the GEI with all the covariate factors attached to the upper body [Fig. 6(a)]; therefore, k = 0 cannot be considered for extracting the most suitable feature. Next, for k = 1, the SOAk value is much lower compared to the other k values, indicating that not much gait information is retained in the resulting A1(x,y). In comparison to the other k values (k = 3, 4, ..., 10), k = 2 has the second highest SOAk value, as can be seen in Fig. 7. Thus, k = 2 gives the highest gait information apart from k = 0.

Fig. 7. Sum of Ak(x,y) plotted against k values ranging from 0 to 10.

Additionally, in Fig. 6(c), it can be seen that, for k = 2, the resulting gait feature, Ak(x,y), has reduced covariate factors on the upper body part and provides higher dynamic information on the lower body part. Therefore, k = 2 is chosen as the optimum frequency component for obtaining the DFT-based gait feature, Ak(x,y). Hence, for k = 2, the corresponding A2(x,y) is considered as our final DFEI gait feature template. The DFEI gait feature template represents the dynamic component of the gait with higher-intensity values along with reduced effect of covariate factors in static information. However, the static head information is significantly lost in the resulting DFEI gait feature template [Fig. 6(c)]. Therefore, to achieve the advantage of both PRWGEI and DFEI gait feature templates, our final gait feature is obtained by fusing both of them, which is described next.

D. Fusion of PRWGEI and DFEI Gait Features

Our aim is to retain the maximum gait information (static and dynamic) and reduce the effect of body-related covariate factors. The PRWGEI retains the head portion along with reduced effects of body-related covariate factors on the upper body part, whereas the DFEI represents the dynamic components with higher-intensity values in the lower body part. Therefore, it is beneficial to fuse the upper part of the PRWGEI and the lower part of the DFEI at feature level to obtain a final gait feature template, referred to as PRWDFGEI. There are various alternatives for deciding the portions of the PRWGEI and DFEI to be fused together, for example, taking two-thirds from the PRWGEI and one-third from the DFEI. In the current work, based on empirical evaluation, we combine the upper half of the PRWGEI with the lower half of the DFEI under the assumption that the given video sequence has the walking human in an upright position and the extracted silhouettes contain the head in the upper part and the legs in the lower part. Thus, the resulting PRWDFGEI has the static information (head) mostly present in the upper body part from the PRWGEI and the dynamic information (legs and feet) present in the lower body part from the DFEI. The fused gait feature template, PRWDFGEI, is illustrated in Fig. 8(c), where the upper half is taken from the PRWGEI and the lower part from the DFEI. Note that the PRWDFGEI is programmatically calculated in an automated process. For the N frames in a gait cycle, the PRWGEI and DFEI features are calculated separately, resulting in two features per gait cycle. Following this, the gait-based person identification system obtains the final PRWDFGEI gait feature template for a given video sequence by combining the upper half of the PRWGEI and the lower half of the DFEI.
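The fusion step itself reduces to a half-and-half composition; a minimal sketch, assuming both templates share the same 128×100 shape.

```python
import numpy as np

def prwdfgei(prwgei_img, dfei_img):
    h = prwgei_img.shape[0]
    fused = dfei_img.copy()
    fused[:h // 2] = prwgei_img[:h // 2]   # head/torso half from PRWGEI
    return fused                            # legs/feet half stays from DFEI
```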

Fig. 8. Fused gait feature template extraction: (a) upper half of the PRWGEI, (b) lower half of DFEI, and (c) final gait feature PRWDFGEI.

IV. EVALUATION

The given video sequence is preprocessed to extract the foreground images, followed by cropping, normalizing, and resizing the images to a fixed-size bounding box of 128×100. After resizing, the number of frames in an individual's one full-walk sequence is determined to obtain the corresponding gait cycle. The gait sequences are represented as the desired gait feature templates, and gait recognition is carried out by matching a probe gait feature template to the gallery gait feature template that has the minimal distance to the probe gait feature template. The gallery refers to the training data and the probe refers to the test data. The gallery set has C different classes representing C individuals, with M N-dimensional gait features for each individual. The extracted gait feature templates are in the form of column vectors obtained by concatenating the rows of the corresponding gait feature. In our case, the vectorization of a gait feature template of size 128×100 results in 12,800-dimensional data, which is very high. Therefore, a dimension reduction technique is required, which reduces the high-dimensional space to a space of fewer dimensions. Some of the commonly used dimension reduction techniques are principal component analysis [5], linear discriminant analysis (LDA) [28], generalized discriminant analysis (GDA) [34], and general tensor discriminant analysis [29].
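As an illustration, the vectorization and LDA projection can be sketched with scikit-learn; the library choice and helper name are assumptions for illustration only.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fit_lda(gallery_templates, labels):
    # Vectorize each 128x100 template into a 12,800-dimensional row.
    X = np.stack([t.reshape(-1) for t in gallery_templates]).astype(float)
    # LDA projects onto at most C-1 dimensions (123 for CASIA-B's 124 classes).
    lda = LinearDiscriminantAnalysis()
    Z = lda.fit_transform(X, labels)
    return lda, Z
```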

A. Classifier

For carrying out the classification of a given probe gait feature template, the 1-nearest neighbor (1-NN) classifier [30] with the Euclidean distance as the matching measure is used. The gait recognition system database consists of the gallery training set $\{x_1, \ldots, x_M\}$ of $M$ observations, where $x_i$ represents a gallery gait feature template reshaped as a column vector. The recognition is carried out by obtaining a set of gallery gait feature templates for each class, $\{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_C\}$, wherein there are $C$ classes representing $C$ individuals, and then finding their corresponding projections onto a reduced subspace, $\{\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_C\}$, using a dimension reduction technique. For a probe gait feature template, $\tilde{x}$, its projection, $\tilde{z}$, onto the reduced subspace is calculated using the same dimension reduction method as used for the gallery set. Then, the Euclidean distance between $\tilde{z}$ and each element of $\{\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_C\}$ is calculated. The recognition is carried out by assigning the probe gait feature template to the class, $i$, from the gallery gait feature templates for which the calculated Euclidean distance is minimum (i.e. the probe has the closest distance once projected onto the reduced subspace):

$$i=\arg\min_{j}\left\lVert \tilde{z}-\tilde{z}_j \right\rVert. \qquad (11)$$
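Eq. (11) then amounts to a nearest-neighbor search in the reduced subspace; a minimal sketch reusing the hypothetical fit_lda helper above.

```python
import numpy as np

def classify(lda, gallery_z, gallery_ids, probe_template):
    # Project the probe with the same reduction as the gallery set.
    z = lda.transform(probe_template.reshape(1, -1).astype(float))[0]
    # 1-NN with Euclidean distance, Eq. (11).
    d = np.linalg.norm(gallery_z - z, axis=1)
    return gallery_ids[int(np.argmin(d))]
```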

To evaluate the merits of the proposed gait feature for gait recognition, the proposed methodology is evaluated on two standard gait data sets: CASIA [13] and the HumanID Gait Challenge data set [8]. Before carrying out the evaluation of the proposed PRWDFGEI, an initial evaluation of the DFT-based gait feature, DFEI, is done separately. This helps in assessing the merits of using the frequency component of the gait and in performing a systematic study of the influence of the frequency component on gait recognition. The evaluation of the DFEI gait feature is carried out using the CASIA-B data set only.

B. Evaluation on CASIA Data Set

As our work is based on covariate factors, we used the CASIA-B for our experiments. The CASIA-B is a large covariate gait data set of 124 subjects. In our experiments, we used frontoparallel view sequences (i.e. the camera is perpendicular to the walking direction) with normal, different clothing, and carrying bag conditions. For each subject, there are 10 gait sequences consisting of 6 normal (i.e. covariate-free) gait sequences (CASIASetA), 2 carrying bag sequences (CASIASetB), and 2 wearing coat sequences (CASIASetC). The first 4 of the 6 normal gait sequences are used as the gallery set. The other 2 normal gait sequences (CASIASetA2) are considered as part of the probe set. The total probe set consists of 2 gait sequences from each of CASIASetA2, CASIASetB, and CASIASetC. As a benchmark of comparison, the proposed methodology is compared to the standard gait representation methods: GEI [12], GEnI [12], $MG_{ij}$ [19], AEI [18], and GFI [20].

1) Evaluation of DFEI

After dimensionality reduction using LDA, the probe and gallery DFEI feature vectors are represented in a (C − 1)-dimensional subspace (in the case of CASIA-B, 124 − 1). Following this, the classification of the probe DFEI is done by computing the distance of the probe DFEI feature vector to the gallery DFEI feature vectors; the probe is assigned to the class for which the computed distance is minimum. Gait recognition carried out using the DFEI gait feature template with LDA and the 1-NN classifier achieves 96% on the CASIASetA2 (normal), 69.3% on the CASIASetB (bag), and 59.7% on the CASIASetC (clothing) data sets, thus leading to 75.0% average accuracy (Table I).

2) Evaluation of PRWDFGEI

The dimension reduction of PRWDFGEI is carried out using the LDA and GDA, and the classification is done using the 1-NN classifier.

a) Results with PRWDFGEI Using LDA

Table I shows the results obtained using the PRWDFGEI gait feature template, compared as a benchmark to the other gait representation methods on the CASIA-B data set. In [31], with the GEI, the direct template matching (TM) method is used as a classifier without any dimensionality reduction, whereas, for the GEI [12], GEnI [12], $MG_{ij}$ [19], and GFI [20], dimension reduction is done using canonical discriminant analysis (CDA) and, for the AEI [18], LDA is used along with the 1-NN classifier. Note that CDA is a special case of multiclass LDA and is considered the best linear dimension reduction method for gait-based individual identification [12, 19]. It can be seen in Table I that, when the test is performed on the CASIASetB data set consisting of carrying objects gait sequences, the proposed PRWDFGEI gait feature outperforms the other methods and achieves a better recognition rate of 87.1%. At the same time, on the CASIASetC data set consisting of bulky coat gait sequences, the result for the PRWDFGEI gait feature template is comparable to the AEI. The average recognition rate of 80.6% for the PRWDFGEI gait feature template demonstrates its superiority over the other methods in handling body-related covariate factors.

TABLE I
PERFORMANCE COMPARISON OF THE PROPOSED GAIT FEATURE REPRESENTATIONS AND OTHER REPORTED METHODS TESTED ON THE CASIA-B DATA SET (%).

Methodology           CASIASetA2   CASIASetB   CASIASetC   Avg
GEI + TM [31]             97.6        52.0        32.7     60.8
GEI + CDA [12]            99.4        60.2        30.0     63.2
GEnI + CDA [12]           98.3        80.1        33.5     70.6
AEI + LDA [18]            88.7        75.0        57.3     73.7
$MG_{ij}$ + CDA [19]     100          78.3        44.0     74.1
DFEI + LDA                96.0        69.3        59.7     75.0
GFI + CDA [20]            97.5        83.6        48.8     76.6
PRWGEI + LDA [11]         98.4        93.1        44.4     78.6
PRWDFGEI + LDA            98.4        87.1        56.4     80.6
PRWDFGEI + GDA            98.4        88.7        58.9     82.0

b) Results with PRWDFGEI Using GDA

To show the enhanced robustness of the PRWDFGEI, we considered the same dimension reduction method and data set used by others in the literature. So far, the proposed gait feature templates (PRWGEI, DFEI, and PRWDFGEI) have used LDA for dimension reduction, in line with the other gait representation methods (GEI, GEnI, $MG_{ij}$, AEI, and GFI). LDA is a linear projection method, and CDA is a special case of multiclass LDA. To analyze whether the recognition rate can be improved using a nonlinear projection method, GDA is applied as a nonlinear dimension reduction method. It can be seen in Table I that the PRWDFGEI with GDA has a better recognition rate for the carrying bag and clothing conditions in comparison to the PRWDFGEI with LDA. In addition, the average recognition rate using GDA is improved in comparison to that using LDA.

3) Statistical Significance of the Proposed Methods

A two-way analysis of variance (ANOVA) is applied to find the statistically significant evidence of a difference between our methods: M1 (PRWGEI + LDA), M2 (DFEI + LDA), M3 (PRWDFGEI + LDA), and M4 (PRWDFGEI + GDA). Based on the experimental set-up of the CASIA-B data set, each of the test sets (normal, different clothing, and carrying objects) contains two gait sequences. To find significant evidence of a difference between the proposed methods, six test cases (M1 and M2), (M1 and M3), (M1 and M4), (M2 and M3), (M2 and M4), and (M3 and M4) are considered. Table II summarizes the gait recognition results obtained using the methods M1 to M4.

ANOVA was carried out to find significance between the methods and the corresponding p values obtained are 0.0109 (M1 and M2), 0.0349 (M1 and M3), 0.0036 (M1 and M4), 0.0006 (M2 and M3), 0.0002 (M2 and M4), and 0.0433 (M3 and M4). For all the six tests performed, the corresponding p values are less than 0.05. Therefore, it can be concluded that methods M1, M2, M3, and M4 are significantly different.

TABLE II
RECOGNITION RATES (%) OBTAINED USING THE PROPOSED METHODS (TWO PROBE SEQUENCES PER TEST SET).

Data set      M1          M2          M3          M4
CASIASetA2    97.5/99.2   96.0/96.0   97.5/99.2   98.4/98.4
CASIASetB     93.5/92.7   71.0/67.6   86.3/87.9   87.9/89.5
CASIASetC     42.7/46.0   58.4/61.3   56.4/56.4   58.1/59.7
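As an illustration of how such a test can be run, the sketch below fits a two-way layout for one pair of methods (M1 vs. M2) using the per-sequence rates from Table II; statsmodels is an assumed tooling choice, and the authors' exact ANOVA design may differ.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Two recognition rates per (method, test set), taken from Table II.
df = pd.DataFrame({
    "method":  ["M1"] * 6 + ["M2"] * 6,
    "dataset": ["A2", "A2", "B", "B", "C", "C"] * 2,
    "rate":    [97.5, 99.2, 93.5, 92.7, 42.7, 46.0,
                96.0, 96.0, 71.0, 67.6, 58.4, 61.3],
})
model = ols("rate ~ C(method) + C(dataset)", data=df).fit()
# The p-value on C(method) tests whether M1 and M2 differ significantly.
print(sm.stats.anova_lm(model, typ=2))
```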

4) Additional Results with PRWDFGEI

The proposed method is based on the implicit assumption that the gallery images of each person are captured under the normal condition (i.e. no bag or coat). To further test the robustness of the proposed PRWDFGEI feature, an experiment is conducted under a more realistic setting, as advocated in [19]. Instead of considering four normal video sequences from a known class, a set of four videos combining normal, carrying bag, and different clothing conditions is considered to test the robustness of the PRWDFGEI approach in discriminating covariate conditions. Fig. 9 shows normal, carrying bag, and clothing sample images of an individual and the corresponding PRWDFGEIs. As seen in Fig. 9, very similar PRWDFGEI features are obtained in all conditions. This clearly illustrates the robustness of the PRWDFGEI in distinguishing the covariate conditions from the subject available in a training data set containing a combination of normal, carrying bag, and different clothing condition videos. The proposed approach can work in uncontrolled environments, where the subjects may or may not appear with covariate factors.

Fig. 9: Images with normal, carrying bag, and different clothing conditions of an individual and the corresponding PRWDFGEIs.

To verify the gait recognition performance in an uncontrolled environment, the gallery set is selected using the first one-third of the sequences from CASIASetC, the second one-third from CASIASetB, and the last one-third from CASIASetA. The probe sets consist of the rest of the data set and are referred to as CASIASetA3, CASIASetB2, and CASIASetC2. In this scenario, both the gallery and probe videos contain covariate factors; therefore, we need to remove the covariate factors and obtain the final PRWDFGEI for all the given videos. We have tested the CASIA-B data set under this uncontrolled set-up using the PRWDFGEI with CDA and the 1-NN classifier. Table III shows the performance accuracy obtained using the PRWDFGEI and CDA approach applied to the CASIA-B data set under the uncontrolled set-up, along with a comparison to the existing approach described in [19], which also uses covariates in the gallery data set. The results shown in Table III indicate that our proposed method outperforms the approach described in [19] in dealing with covariate conditions in the gallery set.

TABLE III
PERFORMANCE (%) OF PRWDFGEI ON THE CASIA-B DATA SET UNDER UNCONTROLLED SET-UP.

Data Set      $MG_{ij}$ + CDA   PRWDFGEI + CDA
CASIASetA3        69.1              78.3
CASIASetB2        55.6              65.4
CASIASetC2        34.7              44.4

C. Evaluation on the HumanID Gait Data Set

To demonstrate the extensibility of the PRWDFGEI feature, a more realistic gait benchmark data set (HumanID data set) is used. The HumanID data set (version 2.1) consists of 122 subjects walking in an elliptical path captured under outdoor conditions. This consists of a range of covariate conditions, including carrying briefcase, surface, shoe, view, and time. For benchmarking purposes, as listed in Table IV, 12 experiment sets (A-L) have been designed for comparative performance evaluation with state-of-the-art algorithms. Thus, 1870 sequences of the 122 subjects are divided into one gallery set for training and 12 probes labeled from A to L for test. The dividing rule is based on five covariates: surface [C/G], camera position [L/R], shoe [A/B], carrying condition [NB/BF], and recording time [M/N]. The gallery for all of the experiments is (G, A, R, NB, M/N).

TABLE IV
PROBE SETS OF THE HUMANID USF DATA SET (VERSION 2.1).

Set  Probe                Subjects  Difference
A    (G, A, L, NB, M/N)      122    View
B    (G, B, R, NB, M/N)       54    Shoe
C    (G, B, L, NB, M/N)       54    Shoe, View
D    (C, A, R, NB, M/N)      121    Surface
E    (C, B, R, NB, M/N)       60    Surface, Shoe
F    (C, A, L, NB, M/N)      121    Surface, View
G    (C, B, L, NB, M/N)       60    Surface, Shoe, View
H    (G, A, R, BF, M/N)      120    Briefcase
I    (G, B, R, BF, M/N)       60    Shoe, Briefcase
J    (G, A, L, BF, M/N)      120    View, Briefcase
K    (G, A/B, R, NB, N)       33    Time, Shoe, Clothing
L    (C, A/B, R, NB, N)       33    Surface, Time, Shoe, Clothing

The data set provides the silhouettes extracted from the video sequences, which are used to compute the PRWDFGEI. The HumanID data set is captured in an elliptical view different from the CASIA-B data set, where the data were captured in the frontoparallel view. Therefore, in the resulting videos at different instances, the captured view of the walking person will be different. To calculate the final PRWDFGEI in these kinds of scenarios, we consider multiple PRWDFGEIs from all the calculated gait cycles in a given video. Following this, a final PRWDFGEI is calculated by averaging all the PRWDFGEIs obtained from the gait cycles in the given video.
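A minimal sketch of this per-video averaging, reusing the hypothetical prwdfgei helper above.

```python
import numpy as np

def video_prwdfgei(cycle_templates):
    # Average the PRWDFGEIs from all gait cycles detected in the video,
    # smoothing over the view changes along the elliptical path.
    return np.mean(np.stack(cycle_templates).astype(float), axis=0)
```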

1) Results Using PRWDFGEI on HumanID GAIT Data Set

To test the robustness of the PRWDFGEI feature against the features described in the existing literature, dimension reduction and classification approaches similar to those proposed in the other methods for this data set are used. Therefore, CDA is chosen for dimension reduction and 1-NN is selected as the classifier for the recognition process. Table V shows the performance of the PRWDFGEI feature using CDA and the 1-NN classifier on the HumanID data set, along with the percentage accuracy obtained for the other gait representation methods described in the literature. Based on the results in Table V, the PRWDFGEI feature has the best performance among all the state-of-the-art methods tested on the HumanID data set for the experiments J to L, and the second best performance for other probe sets such as H and I. Although our proposed PRWDFGEI feature does not outperform the Gabor-PDF feature [38] overall, the results obtained for the experiments J to L are quite noteworthy. These results are significant for what can be attained from the work presented in this paper. As can be seen in Table V, the PRWDFGEI feature obtains the best results, 72%, 20%, and 18%, for the experiments J, K, and L, respectively, which directly involve clothing and carrying bag covariates. For experiments B, C, E, G, I, K, and L, our method has nearly equivalent accuracy in comparison to the Gabor-PDF feature, where shoe can be identified as a common covariate factor in these experiments. Therefore, in general, our proposed method performs well for the body-related covariate factors.

TABLE V
RECOGNITION RATE (%) OF PRWDFGEI ON THE HUMANID GAIT DATA SET.

Set   Baseline [8]   MSCT + SST [2]   GEI [35]   HMM [34]   GEnI [12]   PRWDFGEI   Gabor-PDF [38]
A         73              80             89          89         89          90           90
B         78              89             87          88         89          89           91
C         48              72             78          68         80          84           85
D         32              14             36          35         30          38           53
E         22              10             38          28         38          42           52
F         17              10             20          15         20          22           32
G         17              13             28          21         22          28           28
H         61              49             62          85         82          90*          92
I         57              43             59          80         63          80*          86
J         36              30             59          58         66          72           64
K          3              39              3          17          6          20           12
L          3               9              6          15          9          18           15
Avg     40.9            38.3           50.1        53.5       53.5        56.1*        58.0

For scenarios such as airport security and other secure premises, covariates such as surface may not significantly change. However, shoe, clothing, and carrying bag may always change and cannot be controlled. This is additionally supported by the findings in [15, 16], which report that the clothing and carrying bag covariates significantly reduce prediction accuracy. In such scenarios, the obtained results for the PRWDFGEI feature are quite significant. Thus, in general, we would expect the PRWDFGEI feature to robustly handle body-related covariate factors. Additionally, the work presented in [32], which describes enhanced Gabor (EGG1, EGG2, and EGG3) features with regularized local tensor discriminant analysis of gait images, has lower accuracy for experiments J to L in comparison to our PRWDFGEI feature (Table VI).

TABLE VI

RECOGNITION RATE FOR EXPERIMENTS J TO L FROM HUMANID GAIT DATA SET USING PRWDFGEI AND EGG METHODS.

Methods          Experiment (% accuracy)
                 J     K     L
Gabor-PDF [33]   64    12    15
EGG1 [32]        63    9     14
EGG2 [32]        70    8     17
EGG3 [32]        72    11    17
PRWDFGEI         72    20    18

V. DISCUSSION

The main objective of this paper was to increase the gait recognition rate under clothing and carrying-object conditions by developing methodologies that effectively handle body-related covariate factors while retaining the significant static body parts along with the dynamic parts. The thickness characteristic of the RW-based method is discriminative in segregating the covariates attached to the subject, whereas the DFT-based approach represents the motion intensity effectively. The proposed PRWDFGEI achieves better results than the other methods developed for handling body-related covariates on both the CASIA and HumanID data sets. As discussed earlier in Section II, clothing and carrying bag significantly reduce the recognition rate, and removing such covariates is crucially important for improving recognition performance. The results obtained for the proposed PRWDFGEI are significant for video-based person recognition systems intended for security and surveillance purposes. The results for the PRWDFGEI feature on the challenging HumanID data set are comparable to other methods, considering that our aim was to reduce body-related covariate factors only. In the HumanID data set, there is a strong influence of shadow on the concrete surface; the PRWDFGEI approach successfully identified and removed the shadow area. This additional advantage makes the PRWDFGEI feature more robust in handling covariate factors and in working in more realistic environments. Compared with GEI, a commonly used appearance-based representation, the PRWDFGEI has better performance. It can be seen in Table I that GEI with CDA (a multiclass LDA) has lower average performance than the PRWDFGEI feature with LDA on the CASIA-B data set. Additionally, on the HumanID data set, where both GEI and PRWDFGEI are reduced in dimension using CDA, the PRWDFGEI gait feature has higher recognition accuracy than GEI (Table V).
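To illustrate how a frequency component can encode motion intensity, the following simplified sketch (not the exact formulation used for the PRWDFGEI) takes the temporal DFT of a silhouette sequence at each pixel; static pixels concentrate their energy in the DC term, whereas pixels covering swinging limbs respond strongly at the gait frequency.

    import numpy as np

    def dft_motion_intensity(silhouettes, k=1):
        # silhouettes: (T, H, W) array of binary silhouettes spanning
        # one gait cycle. The temporal DFT is taken independently at
        # each pixel; the magnitude of the k-th component measures how
        # strongly that pixel oscillates at k cycles per gait period.
        spectrum = np.fft.fft(silhouettes, axis=0)
        return np.abs(spectrum[k]) / silhouettes.shape[0]

    # Dummy one-cycle sequence of 30 frames at 128 x 88 resolution.
    seq = (np.random.rand(30, 128, 88) > 0.5).astype(float)
    motion_map = dft_motion_intensity(seq, k=1)
    print(motion_map.shape)  # (128, 88)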

It is possible to acquire normal video sequences for training purposes in a controlled environment. However, in an uncontrolled environment, it cannot always be assumed that normal sequences will be available. In such instances, it may be necessary to use video sequences affected by covariate factors for training. To show the enhanced robustness of the PRWDFGEI feature in an uncontrolled environment, a mixture of normal, clothing, and carrying-bag conditions has been considered for training. The results in Table III demonstrate that, in comparison to the M^G_ij feature, the PRWDFGEI feature provides superior performance in an uncontrolled environment. This additional capability of the PRWDFGEI feature makes it suitable for usage in an uncooperative environment as well.

VI. CONCLUSION AND FUTURE WORK

We have proposed a novel, robust gait feature representation (i.e., the PRWDFGEI) based on the fusion of covariate-free static and dynamic parts of gait information. It uses RW to account for changing appearances by extracting the static body parts, combined with DFT to retain the dynamic characteristics of the gait. It has been demonstrated that the effect of body-related covariate factors can be efficiently reduced using the RW-based approach, which retains the discriminative head portion of the subject, while the DFT-based component represents the dynamic information of the gait well. In conclusion, the work in this paper presents a novel gait representation method to handle varying human appearances.

Note that contextual information (such as a coat, hat, or carried object perceived as a specific behavioral activity) may also be relevant in gait recognition, as discussed in [36]. However, the PRWDFGEI aims only at removing covariate factors. In [36], contextual information such as time, location, and carrying condition is used to profile individuals and perform context-based gait recognition (CGR). However, the CGR is limited in scope, as it depends on the contextual parameters: using the CGR approach, recognizing an unknown individual carrying anything other than the items stored in the context database would be difficult. In contrast, the developed PRWDFGEI is targeted at surveillance applications in uncontrolled environments, where the subject may be at a distance and the contextual information may not always be extractable from low-resolution videos. Therefore, the PRWDFGEI feature is beneficial in removing any contextual information and extracting only the gait information that is significant for accurate recognition in noncontextual scenarios.

With growing concerns about security, such results are significant for identifying individuals who pose potential threats and disguise their identities. The PRWDFGEI feature handles the clothing covariate more effectively than the existing state-of-the-art methods. In this paper, we propose a methodology for a security- and surveillance-based system intended to be installed in security-critical areas, where the user is monitored and access is granted on correct identification. The work presented here increases the robustness of a surveillance system for scenarios such as, but not limited to, the one outlined in Fig. 10, whereby an employee in a security-critical area is identified using a behavioral biometric (i.e., a correct match between the PRWDFGEI feature of the user and that stored in the database) in order to gain access. If the system requires more information, it may ask the person to complete further steps for additional authentication.

Fig. 10: Smart surveillance door operated through behavioral biometrics.

Although the PRWDFGEI is effective in representing covariate-free static features and retaining the dynamic information of the gait, it does have limitations. In the current work, the PRWsil is extracted from the binary silhouettes using a Gaussian model-based background estimation [23]. The effectiveness of silhouette-based approaches is heavily dependent on background subtraction, and with a varying background, the Gaussian model-based background estimation is not very effective. As a consequence, the PRWDFGEI feature is limited by the background subtraction algorithm used. A novel incremental framework based on optical flow is proposed in [37] for extracting silhouettes from a noisy background; in future work, we will consider this framework for binary silhouette extraction. Additionally, the covariates handled by the PRWDFGEI feature are mostly body related, whereas external covariate factors such as walking surface and viewing angle remain. The problem of externally associated covariate factors can be considered as future work to improve gait-based person recognition. In the PRWDFGEI feature, the motion intensity is represented using the frequency component. Walking speed is addressed in [38]; it would be interesting to incorporate varying speed information into the PRWDFGEI, and we will consider this in our future work.
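As a point of reference for future implementations, the sketch below extracts silhouettes with OpenCV's Gaussian-mixture background subtractor (MOG2), a practical descendant of the adaptive mixture model of [23]; the input file name is hypothetical, and this is not the exact configuration used in our experiments.

    import cv2

    # MOG2 implements an adaptive Gaussian-mixture background model
    # in the spirit of Stauffer and Grimson [23].
    subtractor = cv2.createBackgroundSubtractorMOG2(
        history=200, varThreshold=16, detectShadows=True)

    cap = cv2.VideoCapture("walking_sequence.avi")  # hypothetical input
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        # MOG2 labels shadow pixels as 127; keeping only confident
        # foreground (255) suppresses much of the shadow area before
        # the binary silhouette is passed to the PRW stage.
        silhouette = (mask == 255).astype("uint8") * 255
    cap.release()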

REFERENCES

[1] Y. Pratheepan, P. Chaurasia, J. Condell, G. Prasad, Enhancing gait based person identification using joint sparsity model and ℓ1-norm minimization, Information Sciences 308 (2015) 3–22.

[2] T. Lam, R. Lee, D. Zhang, Human gait recognition by the fusion of motion and static spatio-temporal templates, Pattern Recognition 40 (2007) 2563–2573.

[3] Y. Liu, R.T. Collins, Y. Tsin, Gait sequence analysis using frieze patterns, in: Proceedings European Conference Computer Vision, vol. 2, 2002, pp. 657–671.

[4] W. Kusakunniran, Recognizing gaits on spatio-temporal feature domain, IEEE Transactions on Information Forensics and Security, 9 (2014) 1416–1423.

[5] L. Wang, T.N. Tan, W.M. Hu, H.Z. Ning, Automatic gait recognition based on statistical shape analysis, IEEE Transactions on Image Processing 12 (2003) 1120–1131.

[6] G.V. Veres, M.S. Nixon, J.N. Carter, Modelling the time-variant covariates for gait recognition, in: Proceedings of 5th International Conference on Audio and Video-Based Biometric Person Authentication, 2005, pp. 597–606.


[7] J.E. Cutting, D. Proffitt, Gait perception as an example of how we may perceive events, Intersensory Perception and Sensory Integration 2 (1981) 249–273.

[8] S. Sarkar, P.J. Phillips, Z. Liu, I.R. Vega, P. Grother, K.W. Bowyer, The Human ID gait challenge problem: data sets, performance, and analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2005) 162–177.

[9] M.R. Aqmar, Y. Fujihara, Y. Makihara, Y. Yagi, Gait recognition by fluctuations, Computer Vision and Image Understanding, 126 (2014) 38–52.

[10] M. Altab Hossain, Y. Makihara, J. Wang, Y. Yagi, Clothing-invariant gait identification using part-based clothing categorization and adaptive weight control, Pattern Recognition 43 (2010) 2281–2291.

[11] Y. Pratheepan, J. Condell, G. Prasad, PRWGEI: Poisson random walk based gait recognition, in: Proceedings of 7th International Symposium on Image and Signal Processing and Analysis (ISPA), 2011, pp. 662–667.

[12] K. Bashir, T. Xiang, S. Gong, Gait recognition using gait entropy image, in: Proceedings of 3rd International Conference on Crime Detection and Prevention, 2009, pp. 1–6.

[13] CASIA Gait Database. http://www.cbsr.ia.ac.cn/english/GaitDatabases.asp, online, September 2009.

[14] J.E. Cutting, D. Proffitt, Gait perception as an example of how we may perceive events, Intersensory Perception and Sensory Integration 2 (1981) 249–273.

[15] D. Matovski, M. Nixon, S. Mahmoodi, J. Carter, The effect of time on gait recognition performance, IEEE Transactions on Information Forensics and Security 7(2) (2012) 543–552.

[16] I. Bouchrika, M. Nixon, Exploratory factor analysis of gait recognition, in: 8th IEEE International Conference on Automatic Face and Gesture Recognition, 2008, pp. 1–6.

[17] X. Yang, Y. Zhou, T. Zhang, G. Shu, J. Yang, Gait recognition based on dynamic region analysis, Signal Processing 88 (2008) 2350–2356.

[18] E. Zhang, Y. Zhao, W. Xiong, Active energy image plus 2dlpp for gait recognition, Signal Processing 90(7) (2010) 2295–2302.

[19] K. Bashir, T. Xiang, S. Gong, Gait recognition without subject cooperation, Pattern Recognition Letters 31(13) (2010) 2052–2060.

[20] K. Bashir, T. Xiang, S. Gong, Gait representation using flow fields, in: Proceedings of the British Machine Vision Conference, 2009, pp. 1–11.

[21] X. Li, S.J. Maybank, S. Yan, D. Tao, D. Xu, Gait components and their application to gender recognition, IEEE Transactions on Systems, Man, and Cybernetics Part C: Applications and Reviews 38(2) (2008) 145–155.

[22] B.I. Bertenthal, J. Pinto, Complementary processes in the perception and production of human movements, in: L.B. Smith, E. Thelen (Eds.), A Dynamic Systems Approach to Development: Applications, MIT Press, Cambridge, MA, 1993, pp. 209–239.

[23] C. Stauffer, W.E.L. Grimson, Adaptive background mixture models for real-time tracking, in: Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 1999, pp. 246–252.

[24] L. Gorelick, M. Galun, E. Sharon, R. Basri, A. Brandt, Shape representation and classification using the Poisson equation, IEEE Transactions on Pattern Analysis and Machine Intelligence 28(12) (2006) 1991–2005.

[25] K.W. Morton, D.F. Mayers, Numerical Solution of Partial Differential Equations, 2nd ed., Cambridge University Press, Cambridge, 2005.

[26] Y. Makihara, R. Sagawa, Y. Mukaigawa, T. Echigo, Y. Yagi, Gait recognition using a view transformation model in the frequency domain, in: Proceedings of the 9th European Conference on Computer Vision, vol. 3, 2006, pp. 151–163.

[27] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed., Academic Press, 1990.

[28] G. Baudat, F. Anouar, Generalized discriminant analysis using a kernel approach, Neural Computation 12 (2000) 2385–2404.

[29] D. Tao, X. Li, X. Wu, S. Maybank, General tensor discriminant analysis and Gabor features for gait recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 29(10) (2007) 1700–1715.

[30] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3(1) (1991) 71–86.

[31] S. Yu, D. Tan, T. Tan, A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition, in: Proceedings of ICPR, 2006, pp. 441–444.

[32] H. Hu, Enhanced Gabor feature based classification using a regularized locally tensor discriminant model for multiview gait recognition, IEEE Transactions on Circuits and Systems for Video Technology 23(7) (2013) 1274–1286.

[33] D. Xu, Y. Huang, Z. Zeng, X. Xu, Human gait recognition using patch distribution feature and locality-constrained group sparse representation, IEEE Transactions on Image Processing 21 (2012) 316–326.

[34] A. Kale, A. Sundaresan, A.N. Rajagopalan, N.P. Cuntoor, A.K.R. Chowdhury, V. Kruger, R. Chellappa, Identification of humans using gait, IEEE Transactions on Image Processing 13(9) (2004) 1163–1173.

[35] J. Han, B. Bhanu, Individual recognition using gait energy image, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2006) 316–322.

[36] S. Bazazian, M. Gavrilova, Context based gait recognition, in: SPIE Defense, Security, and Sensing, International Society for Optics and Photonics, (2012) pp. 84070J-84070J.

[37] M. Hu, Y. Wang, Z. Zhang, D. Zhang, J. Little, Incremental learning for video-based gait recognition with LBP flow, IEEE Transactions on Cybernetics 43(1) (2013) 77–89.

[38] W. Kusakunniran, Q. Wu, J. Zhang, H. Li, Gait recognition across various walking speeds using higher order shape configuration based on a differential composition model, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 42(6) (2012) 1654–1668.

Priyanka Chaurasia (M’16) received a BTech degree in information technology from the Harcourt Butler Technical University, Kanpur, India, in 2006 and a Ph.D. in computing and information engineering from Ulster University, Coleraine, UK, in 2013. Currently, she is a research associate in computer science at the Connected Health Innovation Centre, School of Computing & Maths, Ulster University, Jordanstown, UK. Before her Ph.D., she was with IBM India Software Labs, Bangalore, India, for 3 years as a software engineer. She has patents granted by the U.S. Patent and Trademark Office in the area of signature verification and is a recipient of the IBM Invention Achievement Award in 2008. Her research interests are assistive technology, activity recognition, biometric security, and data analytics.

Pratheepan Yogarajah (M’10) obtained a first-class honours degree in computer science from the University of Jaffna, Sri Lanka, in 2001, and an MPhil degree in computer vision from Oxford Brookes University, UK, in 2006. He obtained his Ph.D. from Ulster University, UK, in 2015. Currently, he is a lecturer in computer science at the School of Computing & Intelligent Systems, Ulster University, Derry~Londonderry, UK. He is a recipient of the Oxford Brookes University HMGCC Scholarship Award in 2005. He is also a co-recipient of a Proof of Principle Award from Ulster University in 2012 and a Proof of Concept award from Invest Northern Ireland (Invest NI) in 2013. His research interests include biometrics, computer vision, image processing, steganography and digital watermarking, robotics, and machine learning.

Joan Condell received her Ph.D. in computing and mathematics from Ulster University, Northern Ireland, in 2002, with a research theme of motion tracking in digital images. She is currently a senior lecturer in the School of Computing & Intelligent Systems, Ulster University, Derry~Londonderry, UK. Her achievements include winning and managing European Framework projects and commercialization projects. She has patents filed and has published more than 130 papers in international conferences, books, and journals. Her research interests include steganography, image processing, biometrics and vision for robotics, and multimedia.

Girijesh Prasad (M’98-SM’07) received a BTech degree in electrical engineering from REC (now NIT) Calicut, India, in 1987, an MTech degree in computer science & technology from the University of Roorkee (now IIT Roorkee), India, in 1992, and a Ph.D. from Queen’s University Belfast, UK, in 1997. Currently, he is a professor of intelligent systems at the School of Computing & Intelligent Systems, Ulster University, Derry~Londonderry, UK. As an executive member of the Intelligent Systems Research Centre at Ulster, he leads the Neural Systems and Neuro-technology team. He is the Director of the Northern Ireland Functional Brain Mapping facility for MEG studies. His research interests are in computational intelligence, brain-computer interfaces and neuro-rehabilitation, assistive technology, and intelligent surveillance systems.