
Int. J. Nonlinear Anal. Appl. 13 (2022) 1, 1225-1237
ISSN: 2008-6822 (electronic)
http://dx.doi.org/10.22075/ijnaa.2022.5673

Hybrid deep learning framework for human activity recognition

S. Pushpalatha^a, Shrishail Math^b

a Research Scholar, Dept. of ISE, Dr. Ambedkar Institute of Technology, Bengaluru, India
b Professor, Dept. of CSE, Shri Krishna Institute of Technology, Bengaluru, India

(Communicated by Madjid Eshaghi Gordji)

Abstract

The aim of human activity recognition is to recognize the actions of individuals using a set of observations and their environmental conditions. Over the last two decades, research on Human Activity Recognition (HAR) has captured the attention of several computer science communities because of its strength in supporting different applications and its connection to different fields of study such as human-computer interaction, healthcare, monitoring, entertainment and education. Many existing methods, including deep learning, have been developed to recognize different human activities, but they cannot identify sudden changes of activity. This paper presents a deep learning method that can recognize specific activities and identify a change from one activity to another, for healthcare applications. In this method, a deep convolutional neural network is built to extract features from the data collected by the sensors. A Gated Recurrent Unit (GRU) then captures the long-term dependencies between the different actions, which helps to improve the recognition rate of the HAR. From the CNN and GRU, a wearable-sensor model is proposed which can identify changes of activity and accurately recognize these activities. Experiments have been conducted using the open-source University of California, Irvine (UCI) HAR dataset, which is composed of six different activities: lying, standing, sitting, walking downstairs, walking upstairs and walking. The CNN-based model achieves a detection accuracy of 95.99%, whereas the CNN-GRU model achieves a detection accuracy of 96.79%, which is better than most existing HAR methods.

Keywords: Deep learning; Convolutional neural networks; Activity recognition; Gated recurrent unit

Email addresses: [email protected] (S. Pushpalatha), [email protected] (Shrishail Math)

Received: June 2021 Accepted: September 2021


1. Introduction

Human Activity Recognition (HAR) detects, interprets and recognizes human behaviors, which can be used in the field of smart healthcare to help doctors or patients according to their different requirements. HAR is used in many fields through different kinds of applications, such as monitoring systems in home automation, different kinds of sensors in sports to detect the movements of players, and gyro sensors to identify various movements of the body. HAR has a huge impact in the research field [17] and helps to make day-to-day life easier, safer and more convenient.

Presently, data about human activity can be extracted in two ways. The first relies on the field of computer vision, and the second obtains the data using different sensors [16]. The study of human behavior recognition in the computer vision field has a long history, and each study takes a different approach with its own model. However, computer vision methods face many restrictions during modelling. Studies on sensors that help to recognize human behavior are ongoing and allow sensor technologies to detect the behavior of the human body. Different sensors serve different purposes: gyroscopes detect angular velocity, accelerometers measure acceleration, magnetometers calculate magnetic field intensity, and barometer sensors detect atmospheric pressure. All of these sensors can be used in mobile phones, smart watches and even clothes. Wearable sensors have resolved the problem of electromagnetic shielding, as in [?], which evaluates the velocity and acceleration of motion sensors accurately in real time even in the presence of electromagnetic shielding. Wearable sensors are small, highly sensitive, and able to reduce electronic interference during radio transmissions, so they can be used efficiently in day-to-day life without any harm to the environment. Furthermore, human behavior recognition using sensor technology is now used everywhere, helping to detect human activities more efficiently. Hence, the study of different HAR applications using sensors has great importance and significance.

There are mainly two types of actions in HAR: the different actions themselves and the changes between actions. As changes in human activities are unpredictable, there is very little research on changes in the movement of human activities, such as suddenly changing from a standing position to a sitting position, or from walking to standing. Research on such changes is a very significant aspect of HAR, and evaluating changes in movement can help to improve the accuracy rate of human activity recognition. Changes in action can be identified using the basic actions, and the recognition rate can be increased by dividing the changes in the actions in the streaming data to some limit, giving improved accuracy. Traditional methods have some limitations because the behavioral features are extracted manually.

Given the variety of applications and the developments in the field of deep learning, a deep learning-based model has an advantage in HAR. This model uses both accelerometer and gyro sensor data in a CNN-GRU model to identify the changes in action. The Convolutional Neural Network (CNN) is a neural network used to extract features from the data taken by the sensors. The changes are categorized using the local requirement, so it performs best during feature extraction. The movement of the human body is composed of complex actions and movements that change over time; due to this complexity, the CNN method alone does not extract the features properly over time. Hence, the Gated Recurrent Unit (GRU) is used to handle the time-dependent sequence problem. Thus, the combined CNN-GRU can identify the actions and the changes in the actions accurately in time-series data. The proposed CNN-GRU model can be summarized as follows:

• The model consists of a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU), which learns the different actions, identifies the change in action using the extracted features, and models the time-dependence of the features in a noisy environment.

• The CNN-GRU based model can be used to solve both classification and regression problems. However, here the model is tested on classification problems.

• The impact of the important parameters in a deep learning-based model is discussed and the best parameters for our CNN-GRU model are determined.

• The results were evaluated and compared against different models that used the same dataset. The evaluation shows that the CNN-GRU model is more efficient than the existing models.

The paper is arranged in the following way: Section 2 reviews the literature on the various HAR models. Section 3 presents the HAR model, explaining the recording of the motion signals of human behavior, signal normalization, measurement of the data, its format, and finally the proposed CNN-GRU model. In Section 4 the results are compared with existing systems, and in the last section, Section 5, the conclusion and future work are given.

2. Literature Survey

With the fast development of computer vision and embedded systems, HAR using wearable devices with sensors has become essential in day-to-day life and is currently used in various fields such as action recognition, health management, remote control, medical monitoring and rehabilitation activities [5, 23, 3, 12, 14, 18, 10, 27, 2, 25]. Various wearable devices embedded with sensors are being developed for use in day-to-day life and in sports activities. The benefit of these sensors is that they provide motion monitoring and recognition without the use of any external sensor such as a camera, radar or infrared sensor [29, 9, 28]. As the sensors are small, lightweight, inexpensive and consume very little power, they can be used in sports activities for the recognition of the motion of players. Kinetic energy harvesting (KEH) can help to resolve the issue of battery power consumption in wearable devices; using KEH, the power consumption of the battery in wearable HAR devices can be reduced by 79% [14]. Previously, the data of common activities were recorded using professional equipment, which is expensive and inconvenient [17].

Nowadays, mobile devices have made a huge impact on people's day-to-day life. Many mobile devices offer functionalities beyond basic telephony, with different sensors serving these functionalities. Many efforts have been made to use mobile devices for HAR [21, 6]. Hence, a lot of studies have been conducted mainly on mobile phone sensors to obtain data regarding human activities. A dataset of human activities was classified using mobile devices [20]. A method was proposed using histogram gradients and Fourier forms for the extraction of human behaviour, evaluated on the UCI human activity recognition database; it was built using supervised machine learning algorithms, i.e., SVM and KNN, for classification [11], and gives higher accuracy throughout feature extraction. A model was proposed for military wearable devices using a multilayer perceptron to classify the different activities of an individual; its design consumed less time and power during execution, and it used the UCI HAR dataset for development and verification [7]. A recognition model was proposed using the accelerometer and the gyro sensor to recognize activities: a three-layer LSTM whose accuracy reached 93.7% on the UCI HAR dataset and 97.4% with an additional dataset [26]. For activity recognition, a method was proposed using the accelerometer in mobile phones. In this model, the data was extracted using the accelerometer from the original input signal; PCA was used to minimize the features and extract the required features of the different human activities in a given time and frequency for the categorization of the six activities. This method used 70% of the data for training, with 6.11% accuracy, and 30% of the data for testing, with 92.10% accuracy [24]. During pre-processing, feature extraction required special methods to retrieve the data from the different databases; with small changes in the deep learning method the features can still be retrieved, but with a reduction in accuracy. Another work used the UCI human interaction recognition database to verify its proposed model and to classify the dataset.

Many studies are currently being done on deep learning in HAR. Whereas machine learning algorithms require manual extraction of features from the dataset, deep learning algorithms can extract the features automatically. Hence, data were extracted using the sensors, the outcomes were evaluated, and an efficient HAR model was developed. A system using an LSTM model was developed to detect and identify different activities from the dataset; the number of dimensions was reduced using PCA, achieving an accuracy of 97.64% [1]. In another method, three-axis acceleration sensor data from mobile devices was collected and a 1D CNN model was proposed; the three-axis accelerometer data was used as input to the neural network, achieving an accuracy rate of 92.7% [15]. Another method took the values of the gyro sensor and accelerometer as input for automatic feature extraction through a CNN; the model was compared against an SVM model and a multilayer perceptron, and the CNN model achieved higher accuracy and computational complexity [31]. A method that extracted data from the raw 3D accelerometer using a CNN algorithm without data pre-processing achieved an accuracy of 91.91%, a 9% increase over the traditional SVM model of HAR [34]. A method using a CNN model on the HAR dataset extracted and evaluated the data and achieved an accuracy of 95.4% [33]. Another deep CNN model was used for the classification of basic human activities over an extended time; it uses the raw gyroscope and accelerometer data from the wearable device as input and achieved an accuracy of 96.4% [30]. A deep neural network with convolutional layers and LSTM was developed; it extracted features automatically and classified the extracted features of three datasets (OPPORTUNITY, UCI and WISDM), achieving accuracies of 92.63%, 95.78% and 95.85%, respectively [13]. A model was developed using a dataset of seven individuals who performed daily activities like walking, sitting and standing, achieving an accuracy rate of 82% [4].

From these studies, we can conclude that with deep learning algorithms the neural networks extract the features automatically. Since this is the case, the CNN-GRU method is proposed: the CNN extracts the features easily and automatically, while the GRU identifies and detects the changes between different activities, from one activity to another.


Figure 1: Architecture of the Noise Measurement Aware Human Activity Recognition model using the Hybrid Deep Learning Framework.

3. Noise Measurement Aware Human Activity Recognition Model using Hybrid Framework

This section presents a hybrid framework for human activity recognition. First, the system model is discussed. The architecture of the noise measurement aware human activity recognition model using the hybrid deep learning framework is shown in Figure 1.

3.1. System Model

In the CNN, the layers are divided into a solitary cumulated convolution subnetwork (SCCSN) and distinct convolution subnetworks (DCSN). In the DCSN, each incoming sensor tensor is given as X^k, and the SCCSN takes as its input the outputs of the distinct convolution subnetworks.

As the design of the DCSN is the same for all sensors, the main focus is on one DCSN with input tensor X^k, where X^k is a d_k × 2f × T tensor: d_k is the dimension of the sensor attributes, f is the frequency attribute, and T is the session period.

In each session period t, the slice X^k_{..t} is processed by the convolutional architecture, which has three layers in this model. There are two kinds of elements/associations within X^k_{..t} that need to be extracted: the associations across frequencies and the associations across the sensor-attribute measurements. The frequencies show more local patterns within neighbouring frequency areas, while the interactions between the sensor measurements involve all the attributes. Hence, 2D filters of shape (d_k, cov1) are first applied to X^k_{..t} to learn the interactions between the sensor-attribute measurements and the local patterns in the frequencies, producing the output X^{(k,1)}_{..t}. Then 1D filters of shapes (1, cov2) and (1, cov3) are applied successively to learn the higher-level associations, X^{(k,2)}_{..t} and X^{(k,3)}_{..t}.

The matrix X^{(k,3)}_{..t} is flattened into the vector x^{(k,3)}_{..t}, and all K vectors {x^{(k,3)}_{..t}} are concatenated into the K-tuple X^3_{..t}, which is the input of the merged convolutional subnetwork. The framework of the merged convolutional subnetwork is the same as the DCSN. Here, 2D filters of shape (K, cov4) are applied first to learn the connections between all K sensors, with output X^4_{..t}. Then 1D filters of shapes (1, cov5) and (1, cov6) are applied successively to learn the higher-level associations X^5_{..t} and X^6_{..t}.

In every CNN layer, our CNN-GRU model learns 64 filters and uses the ReLU activation function. Each layer applies batch normalization [20] to minimize covariate shift. A residual network structure [28] is not used, since we wanted to simplify the network for mobile applications. The final output X^6_{..t} is flattened into the vector x^{(f)}_{..t}; x^{(f)}_{..t} and the session period width [τ] are concatenated into x^{(c)}_t, which is the input to the recurrent layer.
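As a concrete illustration, the following is a minimal sketch of one per-sensor convolution subnetwork, written with TensorFlow/Keras (the paper only states a Python framework). The 64-filter width, ReLU activation and batch normalization follow the text; the filter widths cov1–cov3 and all tensor dimensions below are placeholder assumptions.

```python
# Minimal sketch of a per-sensor DCSN, assuming TensorFlow/Keras.
import tensorflow as tf
from tensorflow.keras import layers

d_k, two_f = 3, 20          # placeholder slice dimensions (d_k x 2f)
cov1, cov2, cov3 = 3, 3, 3  # placeholder filter widths

# One session-period slice X^k_{..t} enters as a (d_k, 2f, 1) "image".
inp = layers.Input(shape=(d_k, two_f, 1))

# Layer 1: 2D filters of shape (d_k, cov1) couple all sensor attributes
# with local frequency patterns, producing X^(k,1)_{..t}.
x = layers.Conv2D(64, (d_k, cov1), activation="relu")(inp)
x = layers.BatchNormalization()(x)

# Layers 2-3: 1D filters (1, cov2) and (1, cov3) learn the higher-level
# associations X^(k,2)_{..t} and X^(k,3)_{..t}.
x = layers.Conv2D(64, (1, cov2), activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(64, (1, cov3), activation="relu")(x)
x = layers.BatchNormalization()(x)

# Flatten X^(k,3)_{..t} into the vector x^(k,3)_{..t} so the K sensor
# vectors can be concatenated into X^3_{..t} for the merged subnetwork.
out = layers.Flatten()(x)
dcsn = tf.keras.Model(inp, out, name="dcsn_single_sensor")
dcsn.summary()
```

The merged subnetwork would mirror this structure with filter shapes (K, cov4), (1, cov5) and (1, cov6), as described above.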

3.2. Recurrent Layers

RNNs provide a powerful framework that can be trained on features for the given classification tasks and can estimate the target function. Among RNNs, two models were considered for our model: the GRU model [8] and the LSTM model [22]. The Gated Recurrent Unit is used because it shows similar properties and performance to the Long Short-Term Memory model [8], while having a more compact expression, which minimizes the complexity of the network for mobile applications. The HAR architecture uses a GRU model with two layers. A bi-directional Gated Recurrent Unit [32] makes two passes over the session periods, one running from the start to the end and one running back from the end to the start; this process repeats whenever there is a new session period, giving faster processing of the required stream data. However, since the bi-directional Gated Recurrent Unit cannot run until all the information of the session period is ready, it is unfeasible for mobile applications, so a unidirectional stacked GRU is used. The inputs {x^(c)_t} for t = 1, ..., T from the earlier CNN layers are fed to the Gated Recurrent Unit to obtain the outputs {x^(r)_t} for t = 1, ..., T, which are the inputs to the final output layer.
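As an illustration, here is a minimal sketch of the two-layer unidirectional GRU, assuming the per-period features x^(c)_t have already been computed; the layer width of 128 units and the feature sizes are placeholder assumptions, not values from the paper.

```python
# Minimal sketch of the two stacked (unidirectional) GRU layers.
import tensorflow as tf
from tensorflow.keras import layers

T = 8            # placeholder number of session periods
feat = 896 + 1   # placeholder: flattened CNN features x^(f)_{..t} plus width [tau]

inp = layers.Input(shape=(T, feat))   # the sequence {x^(c)_t}, t = 1..T

# Two stacked GRU layers; return_sequences=True keeps one output per
# session period, i.e. the sequence {x^(r)_t}.
x = layers.GRU(128, return_sequences=True)(inp)
x = layers.GRU(128, return_sequences=True)(x)

rnn = tf.keras.Model(inp, x, name="two_layer_gru")
```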

3.3. Output Layer

The output of the RNN layer is a sequence of vectors {x^(r)_t} for t = 1, ..., T.

In a regression-oriented task, the value of each element of x^(r)_t lies within ±1, and x^(r)_t encodes the output at the end of session period t. In the output layer we want to learn a matrix W_out with a bias term b_out to interpret x^(r)_t as y_t, such that y_t = W_out · x^(r)_t + b_out; the layer is therefore a fully connected layer on top of each session period with shared parameters W_out and b_out.

In a classification task, x^(r)_t is the feature vector at session period t. The output layer first needs to compose the {x^(r)_t} into a fixed-length vector for further processing; the weights of this composition are learnt by the network. The features are averaged to generate the final feature,

x^(r) = (1/T) ∑_{t=1}^{T} x^(r)_t.

Finally, x^(r) is fed into the SoftMax layer for the prediction of the class probability y.
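A minimal sketch of this classification output layer, again assuming Keras: the per-period GRU outputs are averaged over t and passed through a softmax. The six classes match the UCI HAR activities; the feature sizes are placeholders.

```python
# Minimal sketch of the output layer: temporal average + softmax.
import tensorflow as tf
from tensorflow.keras import layers

T, rnn_feat, n_classes = 8, 128, 6            # placeholders; 6 UCI activities
inp = layers.Input(shape=(T, rnn_feat))       # the sequence {x^(r)_t}

# x^(r) = (1/T) * sum_t x^(r)_t -- temporal average of the GRU outputs.
avg = layers.GlobalAveragePooling1D()(inp)

# Softmax layer predicting the class probability y.
y = layers.Dense(n_classes, activation="softmax")(avg)
head = tf.keras.Model(inp, y, name="output_layer")
```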

3.4. General customization process

In this section, some parameters of the HAR model, which are used for computing different tasks and for mobile sensing, are customized. The customization process is as follows:

1. Identify the data from the number of sensors, K. Pre-process the sensor data and form the set X = {X^k} as the input data.

2. Identify the given task and classify what kind of application it is (classification-oriented or regression). Select the output layer according to the given task.


3. Design a modified cost function, or use the default cost function.

Hence, for the HAR model configuration, the number of inputs is set using K, then the pre-processing of the input sensor measurements is done, and finally the kind of task being performed is identified. Using the Fourier transform on each sensor, the frequencies can be stacked into a d_k × 2f × T tensor X^k, where d_k is the dimension of the sensor attribute, f is the frequency, and T is the session period. For the identification of the data from a number of sensors K, K is set to the number of different sensing modalities. If more than two sensors share the same sensing modality, they are considered as one multi-dimensional sensor whose d_k is set according to the measurement dimension. For the best cost function, we can generate our own function rather than using the default one. The deep learning framework for the HAR model is defined by a function F(·), trained with sample pairs (X, y). The cost function can be given as follows:

L = l(F(X), y) + ∑_j λ_j P_j    (3.1)

Here l(·) denotes the loss function, P_j is the j-th regularization or penalty term, and λ_j weights the importance of that regularization or penalty term.
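To make the tensor construction above concrete, the following is a hedged sketch of stacking per-period Fourier transforms of one sensor into the d_k × 2f × T input X^k. Interpreting the 2f dimension as stacked magnitude and phase is our assumption (the paper does not spell it out), and all shapes are placeholders.

```python
# Sketch: build X^k (d_k x 2f x T) from raw sensor measurements via FFT.
import numpy as np

def to_freq_tensor(sensor, T, f):
    """sensor: (d_k, T * n) raw measurements, split into T session periods."""
    d_k = sensor.shape[0]
    periods = np.split(sensor, T, axis=1)          # T equal-width chunks
    X = np.zeros((d_k, 2 * f, T))
    for t, chunk in enumerate(periods):
        spec = np.fft.rfft(chunk, axis=1)[:, :f]   # first f frequency bins
        X[:, :f, t] = np.abs(spec)                 # magnitude half (assumption)
        X[:, f:, t] = np.angle(spec)               # phase half (assumption)
    return X

X_k = to_freq_tensor(np.random.randn(3, 8 * 64), T=8, f=10)
print(X_k.shape)                                   # -> (3, 20, 8), i.e. d_k x 2f x T
```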

3.5. Human activity recognition

In this section, the cross-validation of HAR using the measurements of the accelerometer and gyro sensor is evaluated. According to the general customization method, human activity recognition is a classification-oriented problem with K = 2 (gyroscope and accelerometer). For the training objective, the cross-entropy cost function is used, which is given by

L = H(y, F(X))    (3.2)

Here H represents the cross-entropy between the two distributions. The HAR method achieves higher accuracy, as shown and proven in the simulation study.
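As a sketch of the resulting training objective, the function below combines the cross-entropy loss of Eq. (3.2) with optional penalty terms as in Eq. (3.1); the choice of an L2 penalty as P_j and the λ value are illustrative assumptions.

```python
# Sketch: cross-entropy objective (Eq. 3.2) plus weighted penalties (Eq. 3.1).
import tensorflow as tf

def total_cost(model, x_batch, y_batch, lam=1e-4):
    """Task loss l(F(X), y) plus an L2 penalty over the model weights."""
    y_pred = model(x_batch, training=True)               # F(X)
    task_loss = tf.reduce_mean(
        tf.keras.losses.categorical_crossentropy(y_batch, y_pred))  # H(y, F(X))
    # One penalty term P_j: sum of squared weights, weighted by lambda_j.
    penalty = tf.add_n([tf.nn.l2_loss(w) for w in model.trainable_weights])
    return task_loss + lam * penalty
```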

4. Simulation study

Here, experiments are conducted to validate the outcome achieved using the hybrid deep learning model for human activity recognition against the existing CNN-based HAR model. The HAR model is implemented using a Python framework.

4.1. UCI HAR Dataset Description

The dataset was extracted from the UCI Machine Learning Repository. In this HAR dataset, data was collected from 30 contributors, giving a total of 10,299 entries, which were divided into training and test sets. 70% of the contributors, with 7,352 entries, were used for training, and 20% of the training entries (1,470) were held out to validate the accuracy of the model; the remaining 30% of the contributors, with 2,947 entries, were used for testing. Several frames were used for each contributor: each frame was 128 readings wide, sampled in fixed-width sliding windows of 2.56 s with 50% overlap. The accelerometer and gyro sensor readings for the three axes were taken every 0.02 s.
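For illustration, here is a minimal sketch of this fixed-width sliding-window segmentation (2.56 s windows at 50 Hz, i.e. 128 readings, with 50% overlap); the (samples, channels) array layout is an assumption.

```python
# Sketch: segment a multi-channel signal into 50%-overlapping frames.
import numpy as np

def sliding_windows(signal, width=128, overlap=0.5):
    """Split a (n_samples, n_channels) signal into overlapping frames."""
    step = int(width * (1 - overlap))          # 64 samples for 50% overlap
    frames = [signal[s:s + width]
              for s in range(0, len(signal) - width + 1, step)]
    return np.stack(frames)                    # (n_frames, width, n_channels)

# Example: 10 s of 6-channel accelerometer + gyroscope data at 50 Hz.
raw = np.random.randn(500, 6)
print(sliding_windows(raw).shape)              # -> (6, 128, 6)
```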


Figure 2: Graphical representation of data provided by different users.

Table 1: Results achieved for the open dataset.

Model            Accuracy   Precision   Recall   F1-Score
Yen, 2020 [32]   95.99      96.04       95.98    96.01
CNN-GRU          96.79      96.93       96.78    97.82

Figure 2 describes the different activities recorded by each user, as collected in the UCI HAR data. From the data we can see that each user provides all kinds of activity; thus, there is no issue of data imbalance. Figure 3 describes the number of data points available for each activity. Figure 4 shows the relationship between stationary and moving activities; from the figure it can be seen that for stationary activities the frequencies are higher, whereas in motion the frequencies tend to be very small. Similarly, Figure 5 shows the relationship between stationary (zoomed) and moving activities. Figure 6 shows the acceleration magnitude information of the different activities, and Figures 7 and 8 show the angle X and angle Y gravity information, respectively. Figure 9 shows the training and validation loss achieved using the CNN-GRU-HAR model. Figure 10 shows the training and validation accuracies achieved using the CNN-GRU-HAR model. The confusion matrix obtained for human activity recognition using the CNN-GRU model is shown in Figure 11. Table 1 shows the outcome achieved by the proposed CNN-GRU based HAR model over the existing CNN-based HAR model. From the results it can be seen that the CNN-GRU based HAR model achieves better precision, recall, F1-score, and accuracy when compared with the CNN-based HAR model.
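For reference, a small sketch (using scikit-learn, which is our assumption, since the paper only states a Python framework) of how the Table 1 metrics can be computed from the test labels and model predictions; the macro averaging mode is also an assumption.

```python
# Sketch: compute accuracy, precision, recall, F1 and the confusion matrix.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

def report(y_true, y_pred):
    """Return the Table 1 style metrics plus the confusion matrix."""
    metrics = {
        "Accuracy":  accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred, average="macro"),
        "Recall":    recall_score(y_true, y_pred, average="macro"),
        "F1-Score":  f1_score(y_true, y_pred, average="macro"),
    }
    return metrics, confusion_matrix(y_true, y_pred)
```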

5. Conclusion

Experiments were conducted to validate the outcome achieved using the hybrid framework for human activity recognition with respect to the existing HAR model. This model proposes a classification method for activities using the CNN and GRU, and comprises a recognition algorithm for the various human activities. This algorithm takes the data from the sensors using the motion signals of the three-axis acceleration and angular velocity.


Figure 3: Graphical representation of data provided for each activity.

Figure 4: Graphical representation of relationship between stationary and moving activities.

Figure 5: Graphical representation of relationship between stationary (zoomed) and moving activities.


Figure 6: Graphical representation of acceleration magnitude of different activities.

Figure 7: Graphical representation of angle X gravity of different activities.


Figure 8: Graphical representation of angle Y gravity of different activities.

Figure 9: Graphical representation of training loss and validation loss outcome in performing human activity recognition.

Figure 10: Graphical representation of training accuracies and validation accuracies outcome in performing human activity recognition.


Figure 11: Graphical representation of the confusion matrix outcome in performing human activity recognition.

All the signals were normalized, and then, using the CNN-GRU model, all the features were extracted automatically to detect the six day-to-day activities (walking, sitting, standing, walking downstairs, walking upstairs and lying) from the dataset. The overall accuracy on the open UCI dataset achieved by the existing CNN-based HAR model is 95.99%, whereas our CNN-GRU HAR model achieves an accuracy of 96.79%. The final results showed that the CNN-GRU HAR model can be used to detect human activities using the sensors. This model can be used for the estimation of the rehab exercises of individuals who have limited mobility, such as patients who undergo dialysis. The proposed model and algorithm work efficiently and are feasible. Future work will consider improving the performance of the proposed HAR detection model with a more diverse dataset, and validating the CNN-GRU model for solving regression problems.

References

[1] A.A. Aljarrah and A.H. Ali, Human activity recognition using PCA and BiLSTM recurrent neural networks, in Proc. 2nd Int. Conf. Eng. Technol. its Appl. (IICETA), Al-Najef, Iraq (2019) 156–160.

[2] L. Cantelli, G. Muscato, M. Nunnari and D. Spina, A joint angle estimation method for industrial manipulators using inertial sensors, IEEE/ASME Trans. Mechatron. 20(5) (2015) 2486–2495.

[3] Y. Chen and C. Shen, Performance analysis of smartphone-sensor behavior for human activity recognition, IEEE Access 5 (2017) 3095–3110.

[4] J. Chung, C. Gulcehre, K. Cho and Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modelling, arXiv (2014).

[5] M. Cornacchia, K. Ozcan, Y. Zheng and S. Velipasalar, A survey on activity detection and classification using wearable sensors, IEEE Sensors J. 17(2) (2017) 386–403.

[6] N. Davies, D.P. Siewiorek and R. Sukthankar, Activity-based computing, IEEE Pervas. Comput. 7(2) (2008) 58–61.

[7] N.B. Gaikwad, V. Tiwari, A. Keskar and N.C. Shivaprakash, Efficient FPGA implementation of multilayer perceptron for real-time human activity classification, IEEE Access 7 (2019) 26696–26706.

[8] K. Greff, R.K. Srivastava, J. Koutník, B.R. Steunebrink and J. Schmidhuber, LSTM: A search space odyssey, arXiv (2015).

[9] I.C. Gyllensten and A.G. Bonomi, Identifying types of physical activity with a single accelerometer: Evaluating laboratory-trained algorithms in daily life, IEEE Trans. Biomed. Eng. 58(9) (2011) 2656–2663.

[10] Y.-L. Hsu, J.-S. Wang and C.-W. Chang, A wearable inertial pedestrian navigation system with quaternion-based extended Kalman filter for pedestrian localization, IEEE Sensors J. 17(10) (2017) 3193–3206.

[11] A. Jain and V. Kanhangad, Human activity classification in smartphones using accelerometer and gyroscope sensors, IEEE Sensors J. 18(3) (2018) 1169–1177.

[12] M. Janidarmian, A. Roshan Fekr, K. Radecka and Z. Zilic, A comprehensive analysis on wearable acceleration sensors in human activity recognition, Sensors 17(3) (2017) 529.

[13] E. Kantoch, Human activity recognition for physical rehabilitation using wearable sensors fusion and artificial neural networks, in Proc. Comput. Cardiol. Conf. (CinC), Rennes, France (2017) 1–4.

[14] S. Khalifa, G. Lan, M. Hassan, A. Seneviratne and S.K. Das, HARKE: Human activity recognition from kinetic energy harvesting data in wearable devices, IEEE Trans. Mobile Comput. 17(6) (2018) 1353–1368.

[15] S.-M. Lee, S.M. Yoon and H. Cho, Human activity recognition from accelerometer data using convolutional neural network, in Proc. IEEE Int. Conf. Big Data Smart Comput. (BigComp), Jeju Island (2017) 131–134.

[16] Y. Liu, L. Nie, L. Liu and D.S. Rosenblum, From action to activity: Sensor-based activity recognition, Neurocomput. 181 (2016) 108–115.

[17] M.I.H. Lopez-Nava and M.M. Angelica, Wearable inertial sensors for human motion analysis: A review, IEEE Sensors J. 16(15) (2016).

[18] J. Margarito, R. Helaoui, A.M. Bianchi, F. Sartor and A.G. Bonomi, User-independent recognition of sports activities from a single wrist-worn accelerometer: A template-matching-based approach, IEEE Trans. Biomed. Eng. 63(4) (2016) 788–796.

[19] R. Poppe, Vision-based human motion analysis: An overview, Comput. Vis. Image Understand. 108(1–2) (2007) 4–18.

[20] N. Ravi, N. Dandekar, P. Mysore and M.L. Littman, Activity recognition from accelerometer data, in Proc. AAAI 5 (2005) 1541–1546.

[21] J.-L. Reyes-Ortiz, L. Oneto, A. Ghio, A. Sama, D. Anguita and X. Parra, Human activity recognition on smartphones with awareness of basic activities and postural transitions, in Artificial Neural Networks and Machine Learning – ICANN, Cham, Switzerland: Springer (2014) 177–184.

[22] M. Schuster and K.K. Paliwal, Bidirectional recurrent neural networks, IEEE Trans. Signal Process. (1997).

[23] M. Seiffert, F. Holstein, R. Schlosser and J. Schiller, Next generation cooperative wearables: Generalized activity assessment computed fully distributed within a wireless body area network, IEEE Access 5 (2017) 16793–16807.

[24] A.S.A. Sukor, A. Zakaria and N.A. Rahim, Activity recognition using accelerometer sensor and machine learning classifiers, in Proc. IEEE 14th Int. Colloq. Signal Process. Appl. (CSPA), Batu Feringghi (2018) 233–238.

[25] N. Tufek and O. Ozkaya, A comparative research on human activity recognition using deep learning, in Proc. 27th Signal Process. Commun. Appl. Conf. (SIU), Sivas, Turkey (2019) 1–4.

[26] N. Tufek, M. Yalcin, M. Altintas, F. Kalaoglu, Y. Li and S.K. Bahadir, Human action recognition using deep learning methods on limited sensory data, IEEE Sensors J. 20(6) (2020) 3101–3112.

[27] M. Ueda, H. Negoro, Y. Kurihara and K. Watanabe, Measurement of angular motion in golf swing by a local sensor at the grip end of a golf club, IEEE Trans. Human-Machine Syst. 43(4) (2013) 398–404.

[28] A. Wang, G. Chen, J. Yang, S. Zhao and C.-Y. Chang, A comparative study on human activity recognition using inertial sensors in a smartphone, IEEE Sensors J. 16(11) (2016) 4566–4578.

[29] J.-S. Wang, Y.-L. Hsu and J.-N. Liu, An inertial-measurement-unit-based pen with a trajectory reconstruction algorithm and its applications, IEEE Trans. Ind. Electron. 57(10) (2010) 3508–3521.

[30] K. Xia, J. Huang and H. Wang, LSTM-CNN architecture for human activity recognition, IEEE Access 8 (2020) 56855–56866.

[31] W. Xu, Y. Pang, Y. Yang and Y. Liu, Human activity recognition based on convolutional neural network, in Proc. 24th Int. Conf. Pattern Recognit. (ICPR), Beijing, China (2018) 165–170.

[32] C.-T. Yen, J.-X. Liao and Y.-K. Huang, Human daily activity recognition performed using wearable inertial sensors combined with deep learning algorithms, IEEE Access 8 (2020) 174105–174114.

[33] T. Zebin, P.J. Scully, N. Peek, A.J. Casson and K.B. Ozanyan, Design and implementation of a convolutional neural network on an edge computing smartphone for human activity recognition, IEEE Access 7 (2019) 133509–133520.

[34] H. Zhang, Z. Xiao, J. Wang, F. Li and E. Szczerbicki, A novel IoT-perceptive human activity recognition (HAR) approach using multihead convolutional attention, IEEE Internet Things J. 7(2) (2020) 1072–1080.