RECOGNIZING SHORT DURATION HAND MOVEMENTS FROM ACCELEROMETER DATA
Narayanan C. Krishnan, Gaurav N. Pradhan, Sethuraman Panchanathan
Center for Cognitive Ubiquitous Computing, Department of Computer Science and Engineering,
Arizona State University, Tempe, AZ 85281
ABSTRACT
Processing of accelerometer data for recognizing short duration
hand movements is a challenging problem. This paper
focuses on characterization of acceleration data corresponding
to hand movements (lift to mouth, scoop, stir, pour, unscrew cap)
using aggregate statistical features and histograms
computed from the raw acceleration data and its derivative.
Data collected from an accelerometer placed on
the wrist of subjects was used to perform the analysis. Supplementing
the statistical features with raw acceleration histograms
had only a marginal effect on the classification performance.
However, the addition of derivative histograms
improved the classification accuracy by nearly 8%. The effect of the
number of bins in the derivative histograms was also studied. It was
observed that having a small number of bins decreased the classification
accuracy by 3%. We thus show that adding features that capture the
distribution of the changes in the acceleration data improves the
classification performance.
Index Terms— feature extraction, histograms, accelerom-
eter.
1. INTRODUCTION AND MOTIVATION
Various approaches for activity recognition using wearable
pervasive sensors have been proposed depending on the target
application. For example, traditional accelerometer based ac-
tivity recognition systems have been used to monitor the over-
all physical activity [6] to provide general information about
behavior or energy expenditure of the subjects. On the other
hand, systems serving as rehabilitative and assistive technolo-
gies for providing cues to patients with Alzheimer’s disease
to complete the activities of daily life require recognition to
be performed at a finer level. For example, while performing
the activity of making a hot drink, the patient might leave the
kettle on the stove and become confused about the next task
to continue/complete the activity or get carried away with another
activity. The information about the overall activity, as
determined in [5], is insufficient to take remedial measures.
The work in this paper is motivated by this need for detecting
the finer tasks constituting an activity.
This work focuses primarily on characteristic hand move-
ments that are associated with instrumental activities of daily
life (IADLs), in particular making a drink and drinking.
These activities are divided into five basic actions: Pour,
Scoop, Unscrew Cap, Stir, and Lift to Mouth. Each of these
individual actions is also part of many other IADLs. For
example, the action lift to mouth, which indicates a movement
of the hand towards the mouth, is a component of the
IADLs eating, drinking, and taking medication, and the action
scoop is a part of other IADLs like eating and cooking.
Thus, a system trained to recognize actions as a part of one
set of activities can be easily adapted for other activities. The
actions that have been considered in this work are character-
ized by short duration movements of the hand and thus pose
a challenging recognition problem compared to movements
like walking, running and sitting that are characterized either
by a steady or repetitive movement. This paper explores
the multimedia aspects of accelerometer data processing for
recognizing these short duration hand movements. We extend
the traditional feature space employed for characterizing
the acceleration data with raw acceleration and derivative
histograms. While only a marginal improvement was recorded
with raw acceleration histograms, the derivative histograms
yield a significant improvement.
1.1. Related Work
Much of the ongoing research focuses on prototyping wear-
able systems using a variety of sensors (accelerometers, mi-
crophones, pressure sensors, etc.) to recognize human ac-
tivities [2, 3, 4]. Miniaturized accelerometers have received
attention from the bioengineering community as an effective
tool to monitor the physical activity of an individual. An
exhaustive survey of the literature on accelerometer
based human activity recognition is provided in [7]. Our prior work [7] on
using accelerometer data for hand movement recognition used
elementary statistical and spectral features. In [5], the same
features are extracted for analyzing hand movements belong-
ing to other activities of daily life (such as dusting, ironing,
978-1-4244-4291-1/09/$25.00 ©2009 IEEE ICME 2009
brooming, etc). These features resulted in an accuracy of
70%, which is consistent with the performance recorded in [7]
with semi-naturalistic data. In [2], the authors use data from
accelerometers and gyroscopes for spotting hand movements
(such as eating, turning a switch on and off, etc). The deriva-
tive of the linear and angular acceleration values was used to
train an HMM. In [9], raw acceleration values are used to train
HMMs to distinguish between hand movements in
the context of operating a car, yielding a classification accuracy
of 88%. However, when the same procedure was adopted
with the data for the hand movements considered as a part of
this study, the classification accuracy stood at 70%, clearly
indicating the need for more discriminative features.
The rest of the paper is organized as follows. Section 2 de-
scribes the experimental setup adopted in this work for gath-
ering the hand movement data along with a discussion on the
typical features extracted from accelerometer data. The anal-
ysis of the proposed histogram based features is presented in
Section 3. Section 4 summarizes the work and provides future
directions.
2. EXPERIMENTAL SETUP
2.1. Data Acquisition
In this work, we focused on five hand movements that consti-
tute the activities of making a powdered drink and drinking.
These actions are lift to mouth, unscrew the lid, pour, stir,
and scoop. Ideally, the data would be acquired from the subjects as
they go about performing the activity of making a glass of
powdered drink and drinking it. Instead, a data capture session was
devised during which the subjects enacted the same movements
with mock objects a number of times, thereby providing
sufficient data samples for training. For each of the five
actions, we designed alternate scenarios representing the actual
actions needed to perform the activity. More details regarding
the data capture can be found in [7]. All the subjects
in our experiments were right-handed college students aged 22-28 years.
asked to perform each of the actions 20 times. The subject
started in a “rest” state, where the hands rested on the table,
and the subject began the action after receiving a cue from
the experimenter. While three accelerometers, placed on the
dominant wrist and elbow and on the non-dominant wrist, were used
for data collection in our prior paper, only the data from the right
(dominant) wrist accelerometer was considered for the experiments
conducted in this paper.
2.2. Feature Space Analysis
Figure 1 illustrates the samples collected from a single subject
using the mock setup. These signals were obtained from the
accelerometer placed on the right wrist. The most evident observations
are that samples are of varying length and that each
action can be distinguished by observing the acceleration patterns.
For example, unscrewing the cap is characterized by
a number of rapid repetitive movements, while stir is represented
by slower repetitive movements. A dip in the z axis
acceleration appears for the actions scoop and lift to mouth,
but the y axis values increase for scoop and fall significantly
for lift to mouth. Similar observations can be made for other
actions, leading to the conclusion that it is possible to differentiate
these actions using the accelerometer data.
Fig. 1. A sample from each action belonging to a subject. The
red, green and blue lines correspond to acceleration along the X, Y
and Z axes.
Aggregate statistical and spectral features are typically
used for processing the accelerometer data corresponding to
ambulatory movements as elaborated in [1, 8, 6]. The effectiveness
of these features for characterizing the hand movements
considered in this paper has been explored in [7].
These features capture high level trends in the data, each
summarizing an entire action sample through a single parameter.
For example, for an action sample lasting nearly 3 seconds
(approximately 300 acceleration samples), the mean, variance and
correlation each reduce the contents of the data to a single value. While
these are distinctive features, they do not capture the actual
distribution of the acceleration values in the action sample.
Thus, complementing this set with a feature that captures
the information content of the entire sample, through a
distribution of the acceleration values, or even a distribution
of changes in the acceleration values, computed through histograms,
is expected to enhance the overall discriminatory power.
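The argument above can be illustrated with a small sketch. The two single-axis signals below are hypothetical toy data, not recordings from this study, and the function names are chosen only for illustration. Both signals share the same mean, the same variance and the same histogram of raw values, yet their derivative histograms differ. The sketch uses one bin per distinct value and covers a single axis only, so the inter-axis correlation is not shown.

```python
from collections import Counter
from statistics import mean, pvariance


def first_difference(samples):
    """Discrete derivative: change between consecutive acceleration samples."""
    return [b - a for a, b in zip(samples, samples[1:])]


def normalized_histogram(values):
    """Fraction of samples at each distinct value (one bin per value)."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in sorted(counts.items())}


# Hypothetical toy signals (assumption: not data from the paper).
oscillating = [0, 1, 0, 1, 0, 1, 0, 1]   # rapid repetitive movement
step = [0, 0, 0, 0, 1, 1, 1, 1]          # single slow transition

# Aggregate statistics are identical for the two signals.
same_stats = (mean(oscillating), pvariance(oscillating)) == (
    mean(step), pvariance(step))

# The histograms of the raw values are identical as well.
same_raw_hist = normalized_histogram(oscillating) == normalized_histogram(step)

# The histograms of the first differences are not.
osc_dhist = normalized_histogram(first_difference(oscillating))
step_dhist = normalized_histogram(first_difference(step))

print(same_stats, same_raw_hist, osc_dhist != step_dhist)
```

In this toy case only the distribution of changes separates the rapid repetitive signal from the single transition, which is the kind of information the derivative histograms are meant to add.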
Visual analysis of the data after combining the statistical
features with raw acceleration and derivative histograms was
performed using principal component analysis (PCA). Figure
2 illustrates the distribution of the samples characterized by
the combination of the statistical features and derivative histograms,
reduced to a 3-dimensional space. It can be noticed that
samples belonging to the different classes form well separated
clusters. The samples corresponding to the action stir have
the widest spread, overlapping with the samples of scoop.
Further analysis of the performance of combining these features
for classification was conducted and is discussed in the
following section.
Fig. 2. Projection of the samples characterized by the combination
of the statistical features and derivative histograms onto
a 3-dimensional space through PCA.
3. RESULTS
The effect of adding histogram based features was studied by
analyzing the classification performance of the system. AdaBoost
yielded the best performance in our prior work [7].
Hence, it was used as the base classifier for testing the
performance of the different feature spaces. AdaBoost combines
simple, weak learners like decision stumps into a powerful
classifier. A weak learner is trained and weighted at
every iteration, and the weighted combination of the learners from
all iterations is used as the final classifier. A separate AdaBoost
classifier was trained for each class, considering the
data belonging to the class as positive samples and the data from all
other classes as negative samples. Each such classifier thus learns the
boundary that separates the samples of its class from the rest. Each
classifier was trained for a maximum of 100 iterations. Further, all the
results presented in this section were either averaged or aggregated
using a leave-one-out classification scheme. Thus,
these results correspond to the most generalized subject
independent scenario.
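The classification scheme described above can be sketched as follows. This is a minimal, unoptimized illustration of discrete AdaBoost over decision stumps with one classifier per class, not the implementation used in the experiments. The toy feature vectors, the function names, and the rule of picking the class with the highest ensemble score are assumptions made for the example, since the text does not state how the per-class outputs are combined.

```python
import math


def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def train_adaboost(X, y, max_rounds=100):
    """Discrete AdaBoost; y holds +1 (positive class) or -1 (all others)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(max_rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)        # guard against log(1/0)
        if err >= 0.5:                    # no stump better than chance
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        if stump[0] == 0:                 # perfect stump, nothing to reweight
            break
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def score(ensemble, row):
    return sum(alpha * stump_predict(s, row) for alpha, s in ensemble)


def train_one_vs_rest(X, labels, max_rounds=100):
    """One AdaBoost classifier per class: that class positive, rest negative."""
    return {c: train_adaboost(X, [1 if l == c else -1 for l in labels],
                              max_rounds)
            for c in sorted(set(labels))}


def predict(models, row):
    # Assumption: choose the class whose classifier gives the highest score.
    return max(models, key=lambda c: score(models[c], row))


# Hypothetical 2-dimensional feature vectors for three classes.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5], [0, 9], [1, 9], [0, 10]]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
models = train_one_vs_rest(X, labels)
print([predict(models, row) for row in X])
```

In a subject independent evaluation, the training call would be repeated with the samples of one subject held out each time and the held-out samples used only for prediction.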
All the comparisons are made with respect to a base per-
formance obtained by considering only the statistical features.
Figure 3 illustrates as a box plot the classification accuracy obtained
with the different combinations of features. The red line corresponds
to the median accuracy obtained from the 5 fold leave-one-out
classification strategy. The first observation
is that the accuracies for both the raw acceleration histograms
(B) and the derivative histograms (D) are relatively low compared
to the statistical features (A). In fact, raw acceleration
histograms yield a significantly lower performance. However,
combining the statistical features with these histograms
yielded a marginal improvement in the case of raw acceleration
histograms (C) and a significantly higher performance in the
case of the derivative histograms (E).
Another important observation from Figure 3 is the
change in the extreme accuracy values indicated by the vertical
whiskers. In scenario A, one fold of the classification
resulted in nearly 54% accuracy, while the rest of the values
were within the spread of the accuracies indicated by the
blue rectangle. However, in both C and E, it can be noticed
that the whiskers are shorter, indicating that the
extreme performance values are within reasonable bounds, especially
in the case of E. The superior performance obtained
under scenario E indicates that the derivative histograms
capture additional information about the data that is not described
by the statistical features.
Fig. 3. Box plot illustrating the overall AdaBoost
classification accuracy using A) statistical and spectral features
only, B) normalized histogram of the acceleration data,
C) combination of the statistical features and normalized histogram
of the acceleration data, D) normalized histogram of the
first derivative of the acceleration data, E) combination of the
statistical features and the normalized histogram of the first
derivative of the acceleration data.
               Lift to Mouth  Pour    Scoop   Unscrew Cap  Stir
Lift to Mouth  100(100)       0       0       0            0
Pour           1(2)           95(80)  4(18)   0            0
Scoop          0              0       94(86)  0(6)         13(15)
Unscrew Cap    0(2)           0       0       90(87)       7(8)
Stir           0              0       0(12)   15           60(48)
Table 1. Aggregate confusion matrix obtained by using a
combination of derivative histograms and the statistical features
(values inside the parentheses correspond to the confusion
matrix values obtained with only the statistical features).
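Class-wise precision and recall follow directly from a confusion matrix such as Table 1. The sketch below uses the Table 1 counts for the combination of derivative histograms and statistical features, and assumes that rows are the true actions and columns the predicted actions, an orientation that is not stated in the table but that reproduces the reported precision and recall values. The variable and function names are illustrative only.

```python
ACTIONS = ["Lift to Mouth", "Pour", "Scoop", "Unscrew Cap", "Stir"]

# Table 1 counts for statistical features + derivative histograms.
# Assumption: rows are true actions, columns are predicted actions.
CONFUSION_E = [
    [100, 0, 0, 0, 0],
    [1, 95, 4, 0, 0],
    [0, 0, 94, 0, 13],
    [0, 0, 0, 90, 7],
    [0, 0, 0, 15, 60],
]


def precision_recall(matrix):
    """Return a (precision, recall) pair for every class of a square matrix."""
    result = []
    for k in range(len(matrix)):
        true_pos = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)   # column sum
        actual_k = sum(matrix[k])                     # row sum
        result.append((true_pos / predicted_k, true_pos / actual_k))
    return result


for action, (p, r) in zip(ACTIONS, precision_recall(CONFUSION_E)):
    print(f"{action:14s} precision={p:.4f} recall={r:.4f}")
```

Replacing the counts with the parenthesized values of Table 1 gives the corresponding figures for the statistical features alone.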
Further analysis was performed to understand the effect
of the new features on individual class performances using a
combination of the derivative histograms and statistical features.
Table 1 presents the aggregate confusion matrix and
Table 2 lists the class-wise precision and recall values. The
average improvement in the precision and recall values between the two
scenarios was around 7% and 8% respectively. However, for
the action scoop, there was an improvement of about 21% in the
precision value, while the action stir demonstrated a 16% improvement
in the recall rate. Thus the combination of the features
has a good impact on the performance of the two actions that
were poorly characterized by the statistical features.
               Precision         Recall
               A       E         A       E
Lift to Mouth  0.9615  0.9901    1       1
Pour           1       1         0.8     0.95
Scoop          0.7414  0.9592    0.8037  0.8785
Unscrew Cap    0.8056  0.8571    0.8969  0.9278
Stir           0.6761  0.75      0.64    0.8
Table 2. Precision and recall values for each action
(A: statistical features only, E: combination of statistical features
and derivative histograms).
It was observed that the derivative values varied between -50 and 50,
peaking around 0. Typical approaches using histograms
quantize the values by using a smaller number of bins. The process
of quantization is quite tricky, in the sense that bin centers
suited to one action will not be ideal for another. The AdaBoost
classification performance was recorded for histograms with
different bin centers and intervals. The effect of the number of bins on
classification accuracy is depicted in Figure 4. A marginal
improvement was observed with varying bin sizes, with the
maximum accuracy being achieved when the number of bins
corresponded to the number of distinct derivative values.
Fig. 4. Change in the classification accuracy with respect to
varying number of bins in the histogram.
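The effect of quantization can be sketched with a fixed-range, equal-width histogram. The derivative values below are hypothetical, and the sketch assumes integer-valued derivatives in the range -50 to 50, so that 101 unit-width bins give one bin per distinct value. The bin placement actually used in the experiments is not specified here and may differ.

```python
def binned_histogram(values, n_bins, lo=-50.5, hi=50.5):
    """Normalized histogram with n_bins equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        clipped = min(max(v, lo), hi)                 # keep outliers in range
        index = min(int((clipped - lo) / width), n_bins - 1)
        counts[index] += 1
    total = len(values)
    return [c / total for c in counts]


# Hypothetical derivative values concentrated around 0.
derivatives = [-2, -1, -1, 0, 0, 0, 0, 1, 1, 3]

coarse = binned_histogram(derivatives, n_bins=5)     # bins about 20 wide
fine = binned_histogram(derivatives, n_bins=101)     # one bin per integer

print(coarse)
print([round(f, 1) for f in fine[48:54]])
```

With five bins every value falls into the central bin and the shape of the distribution is lost, while the fine histogram keeps the individual values apart. This is consistent with the drop in accuracy observed for a small number of bins.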
4. SUMMARY
Recognizing hand movements involved in complex activities
of daily life is a challenging problem. Characterization of the
accelerometer data belonging to these movements is essen-
tial for achieving reliable classification. While the traditional
statistical features capture the content of the raw data to a
certain extent, supplementing it with features that represent
the distribution of the data such as histograms of the raw ac-
celeration and derivative of the acceleration data enrich the
feature space. The classification experiments using the Ad-
aBoost framework, conducted in this paper validate this hy-
pothesis. Adding the derivative histogram features results in
91.5% accuracy ( an improvement of 8%). We plan to explore
the frequency representations of the acceleration data such as
FFT and wavelet coefficients. As a part of our future work,
we plan to conduct classification analysis of the feature space
with a larger number of actions.
5. REFERENCES
[1] Bao L. and Intille S.S., Activity recognition from user-annotated
acceleration data, In the Proceedings of International
Conference on Pervasive Computing, Vol. 3001,
2004, pp. 1–17.
[2] Holger Junker, Oliver Amft, et al., Gesture spotting with
body worn inertial sensors to detect user activities, In
Pattern Recognition, Vol. 41(6), 2008, pp. 2010–2024.
[3] Jamie A. Ward, Paul Lukowicz, et al., Activity recognition
of assembly tasks using body worn microphones and
accelerometers, IEEE Transactions on Pattern Analysis
and Machine Intelligence, Vol. 28(10), 2006, pp. 1553–1567.
[4] Jonathan Lester, Tanzeem Choudhury, and Gaetano
Borriello, A practical approach to recognizing physical
activities, In the Proceedings of International Conference
on Pervasive Computing, Vol. 3968, 2006, pp. 1–16.
[5] Maja Stikic, Tam Huynh, et al., ADL recognition based
on combination of RFID and accelerometer sensing, In
the Proceedings of International Conference on Pervasive
Health Care, 2008.
[6] Narayanan C. Krishnan and Sethuraman Panchanathan,
Analysis of low resolution accelerometer data for continuous
human activity recognition, In the Proceedings of
International Conference on Acoustics, Speech and Signal
Processing, 2008, pp. 3337–3340.
[7] Narayanan C. Krishnan, Dirk Colbry, et al., Recognition
of hand movements using wearable accelerometers, accepted
in Journal of Ambient Intelligence and Smart Environments,
special issue on Wearable Computing, 2009.
[8] Tam Huynh and Bernt Schiele, Analyzing features for
activity recognition, In the Proceedings of Joint Conference
on Smart Objects and Ambient Intelligence: Innovative
Context-Aware Services: Usages and Technologies,
2005, pp. 159–163.
[9] Zinnen A. et al., Toward recognition of short and
non-repetitive activities from wearable sensors, In Ambient
Intelligence, Lecture Notes in Computer Science, 2007, pp.
142–158.