
PBN: Towards Practical Activity Recognition Using Smartphone-Based Body Sensor Networks*

Matthew Keally, Dept. of Computer Science, College of William and Mary, Williamsburg, VA 23187, USA, [email protected]

Gang Zhou, Dept. of Computer Science, College of William and Mary, Williamsburg, VA 23187, USA, [email protected]

Guoliang Xing, Dept. of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA, [email protected]

Jianxin Wu, School of Computer Engineering, Nanyang Technological University, Singapore 639798, [email protected]

Andrew Pyles, Dept. of Computer Science, College of William and Mary, Williamsburg, VA 23187, USA, [email protected]

Abstract

The vast array of small wireless sensors is a boon to body sensor network applications, especially in the context awareness and activity recognition arena. However, most activity recognition deployments and applications are challenged to provide personal control and practical functionality for everyday use. We argue that activity recognition for mobile devices must meet several goals in order to provide a practical solution: user friendly hardware and software, accurate and efficient classification, and reduced reliance on ground truth. To meet these challenges, we present PBN: Practical Body Networking. Through the unification of TinyOS motes and Android smartphones, we combine the sensing power of on-body wireless sensors with the additional sensing power, computational resources, and user-friendly interface of an Android smartphone. We provide an accurate and efficient classification approach through the use of ensemble learning. We explore the properties of different sensors and sensor data to further improve classification efficiency and reduce reliance on user annotated ground truth. We evaluate our PBN system with multiple subjects over a two week period and demonstrate that the system is easy to use, accurate, and appropriate for mobile devices.

Categories and Subject Descriptors

C.3 [Special-Purpose and Application-Based Systems]: Real-time and embedded systems

General Terms

Algorithms, Design, Experimentation, Performance

Keywords

Body Sensor Networks, Mobile Phones, Motes, Machine Learning, Sensing, Activity Recognition

*This work is supported in part by NSF grants ECCS-0901437, CNS-0916994, and CNS-0954039.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SenSys'11, November 1-4, 2011, Seattle, WA, USA.
Copyright 2011 ACM 978-1-4503-0718-5/11/11 ...$10.00

1 Introduction

The low cost and wide availability of wireless sensors makes body sensor networks (BSNs) an increasingly attractive solution for a wide range of applications such as personal health care monitoring [3], physical fitness assessment [1], and context awareness [19] [4]. In all of these applications, and especially in the context awareness and activity recognition domain, the need exists to provide personal control and direct application feedback to the user. For example, a sensing application for activity recognition allows user configuration of the number and placement of sensor nodes as well as runtime notification as to the current activity. Complete control over a body sensor network allows a user to tailor the application to his or her needs as well as exercise discretion over the use and dissemination of activity inferences.

For context aware and activity recognition applications, the sensing capability of multiple on-body sensor nodes is difficult to capture through the use of smartphones alone [17]. However, the portability, computational power, and interface of a mobile phone can provide the user personal control and feedback over the sensing application. We envision a practical solution to activity recognition for mobile devices which is entirely portable, under direct control of the user, computationally lightweight, and accurate.

In this paper, we focus on four major challenges for providing practical activity recognition with mobile devices. First, the hardware and software platform must be user-friendly. The hardware must be portable and easily configurable, while the software must provide an intuitive control interface with adequate system feedback. Second, classification must be performed accurately, handling both easy and difficult to classify activities, environmental changes, and noisy data. Accurate classification must also allow context switching between activities without extensive parameter tuning and different, complex classifiers for each activity. Third, since mobile hardware is often constrained in terms of computation power and energy, classification must be performed efficiently and redundant sensing resources must be turned off in a timely manner. Lastly, the system must have a reduced reliance on ground truth, for regular collection and labeling of sensor data can be invasive to the end user.

While there is significant existing work in the mobile activity recognition domain, most approaches do not offer practical solutions that address the above challenges. Some approaches [9] [15] [28] provide multiple on-body sensor nodes but do not provide a portable interface, such as a mobile phone, for the end user to control sampling, configure hardware, or receive real time feedback. Other approaches [18] [19] rely on computationally expensive learning algorithms and a back end server. Still more works [16] [17] rely on specific sensing models for each sensor modality, making sensor collaboration difficult. Lastly, other works [27] [23] do not provide online training for adaptation to body sensor network (BSN) dynamics. Compared to static sensor networks, body sensor network dynamics include the changing geographical location of the user, user biomechanics and variable sensor orientation, as well as background noise.

Towards addressing these challenges of activity recognition, we propose PBN: Practical Body Networking. PBN consolidates the flexibility and sensing capability of TinyOS-based motes with the additional sensing power, computational resources, and user-friendly interface of an Android smartphone. Our solution can also be extended beyond TinyOS motes to combine Android with a wide variety of USB devices and wireless sensors. Through the use of ensemble learning, which automates parameter tuning for BSN dynamics, PBN provides a capable, yet lightweight activity recognition solution that does not require a backend server.

With PBN, we provide an online training solution which detects when retraining is needed by analyzing the information divergence between training and runtime data distributions and integrating this analysis with the ensemble classifier. In this way, PBN determines when retraining is needed without the need to request ground truth from the user. Furthermore, we investigate the properties of sensors and sensor data to identify sensors which are accurate and have diverse classification results. From this analysis, we are able to prevent the ensemble classifier from needlessly consuming computational overhead by using redundant sensors in the online training process. Our main contributions are:

• We combine the sensing capability of on-body TinyOS-based motes with the sensors, computational power, portability, and user interface of an Android smartphone.

• An activity recognition approach appropriate for low-power wireless motes and mobile phones that does not rely on a backend server. Our approach handles BSN dynamics without sophisticated parameter tuning and also accurately classifies difficult tasks.

• We provide retraining detection without requesting ground truth from the user, reducing the invasiveness of the system.

• We reduce online training costs by detecting redundant sensors and excluding them from the ensemble classifier.

• With two weeks of data from two subjects, we demonstrate that we can detect even the most difficult activities with nearly 90% accuracy. We identify 40% of sensors as redundant, excluding them from online training.

The rest of this paper is organized as follows: Section 2 presents related work, Section 3 provides an overview of our PBN system design, and Section 4 describes the ensemble classifier. We describe our retraining detection method in Section 5, provide details on how to collect new training data in Section 6, and present a sensor selection method in Section 7. In Section 8, we evaluate our PBN platform, and we present conclusions and future work in Section 9.

2 Related Work

Many methods perform classification with multiple on-body sensor nodes but have no mobile, on-body aggregator for sensor control and activity recognition feedback. Some works [9] [15] [28] [22] use multiple on-body sensor motes to detect user activities, body posture, or medical conditions, but such motes are only used to collect and store data, with analysis performed offline. Such approaches limit user mobility due to periodic communication with a fixed base station and also lack real time analysis and feedback to the user.

Other approaches use mobile phones for sensing and classification, but require the use of backend servers or offline analysis to train or update classification models. The authors of [18] provide model training with a short amount of initial data, but model updating is performed using a backend server. In [19], model training and classification is split between lightweight classifiers on a mobile phone and more powerful classifiers on a server. The authors of [21] present a sensor mote with an SD card attachment for interfacing with a mobile phone but do not provide an application which uses such a device. Mobile phone like hardware is used in [7] to perform pothole detection, but data is analyzed offline.

Some works use a limited set of sensor modalities or use a separate classifier for each sensing modality, making classification difficult for deployments with a large number of heterogeneous sensors. Some works [2] [12] focus extensively on energy aware sensing models specifically for localization or location tracking. In [16], while the authors provide an adaptive classification approach, they only make use of a mobile phone microphone to recognize user context and activities, thus eliminating a wide range of user activities that are not sound dependent. The authors of [17] provide a component-based approach to mobile phone based classification for different sensors and different applications, but each sensor component requires a separate classifier implementation. One approach uses both motes and mobile phones to perform fitness measurement [6], but simple sensing models are used that will not work for more general activity recognition applications. Activity recognition and energy expenditure is calculated in [1] using an accelerometer-specific sensing model for on-body wireless nodes.


Lastly, several activity recognition methods do not provide online training to account for environmental dynamics or poor initial training data. The authors of [14] use AdaBoost to perform activity recognition but use custom hardware with very high sampling rates. In [20], the authors also use AdaBoost for activity recognition with mobile phones, but focus mainly on ground truth labeling inconsistencies and do not provide a practical system for long term use. The authors of [27] focus on duty cycling mobile phone sensors to save energy, but provide a rigid rule-based recognition model that must be defined before runtime. A speaker and emotion recognition system using mobile phones is presented in [23] which implements adaptive sampling rates, but the classification models used are trained offline.

3 System Overview and Architecture

In this section, we first present the application requirements. We then present our PBN hardware and application, which unifies TinyOS and Android. Next, we describe our PBN architecture and finally describe our PBN experimental setup, which we refer to for the remainder of the paper.

3.1 Application Requirements

Our PBN system design is motivated by the requirements of a practical body networking application for activity recognition. Data from multiple on-body sensors is reported to a mobile aggregator which makes classification decisions in real time. The system must be able to accurately and efficiently classify typical daily activities, postures, and environmental contexts, which we present in Table 1. Despite these categories being common for many individuals, previous work [14] has identified some of them, such as cleaning, as especially difficult to classify.

Table 1: PBN Classification Categories.

Environment: Indoors, Outdoors
Posture: Cycling, Lying Down, Sitting, Standing, Walking
Activity: Cleaning, Cycling, Driving, Eating, Meeting, Reading, Walking, Watching TV, Working

From the table, we break down our target classifications into three categories: Environment, Posture, and Activity. With the Environment and Posture categories, we can provide insight into the physical state of the user for personal health and physical fitness monitoring applications. We also wish to measure typical activities in which a user engages, such as watching TV, driving, meeting with colleagues, and cleaning. Here, we consider cycling and walking to be both postures and activities. Such activity recognition is quite useful for participatory sensing and social networking applications since it eliminates the need for a user to manually update his or her activities online. The requirements to provide such a practical activity recognition system are:

• User-friendly. Different on-body commercial off-the-shelf (COTS) sensor nodes must seamlessly interact with a mobile phone aggregator for simple user configuration and operation. The hardware must be portable and easy to wear, while the software must provide an intuitive interface for adding, removing, and configuring different sensors geared to detect the user's intended activities. Labeling training data should also be a simple and non-invasive task, facilitated by the mobile phone.

• Accurate classification. The system must accurately handle both easy and difficult to detect activities as well as noisy data and environmental dynamics. The system must also account for the changing geographic location of the user as well as the variable orientation of the individual on-body sensors.

• Efficient classification. A body sensor network for activity recognition may often use low power hardware; therefore, the classification algorithm should be computationally efficient in addressing the BSN dynamics of geographic location, user biomechanics, and environmental noise. The system must also be energy efficient: by quantifying the contribution of each sensor towards accurately classifying activities, sensors with minimal contribution can be powered down. Furthermore, the system must avoid extensive parameter tuning as well as avoid unique, complex sensing models for each activity.

• Less reliance on ground truth. Activity recognition systems are often deployed with minimal labeled training data, thus the need exists to train a system in an online manner, requesting ground truth labels only when absolutely necessary. A reduced need for ground truth reduces the burden on the user to label training data.

3.2 PBN Hardware and Application

To achieve accurate and efficient activity recognition for mobile phones and on-body sensors, we provide an extensive hardware and software support system, which we describe in this section. In Section 3.2.1, we describe the implementation of USB host mode in the Android kernel to allow communication between a base station mote and an Android phone. In Section 3.2.2, we describe our Android application for configuration and activity recognition.

3.2.1 Unification of Android and TinyOS

Our PBN system consists of Crossbow IRIS on-body sensor motes and a TelosB base station connected to an Android HTC G1 smartphone via USB. While previous work has connected TinyOS motes and mobile phones, such efforts either use energy demanding Bluetooth [6], do not provide mobile phone support for TinyOS packets [24], or use special hardware [29]. Instead, we provide a seamless and efficient integration of TinyOS motes and Android smartphones. Our solution can be easily extended beyond the research-based TinyOS devices to work with a wide variety of commercial and more ergonomic USB and wireless sensors. Our challenges with such integration lie in 4 aspects: TinyOS sensing support, Android OS kernel support, Android hardware support, and TinyOS Java library support.

TinyOS Sensing Support. Illustrated in Figure 3, we implement a sensing application in TinyOS 2.x for Crossbow IRIS motes with attached MTS310 sensorboards. The sensor node application allows for runtime configuration of active sensors, sampling rates, and local aggregation methods. We develop a communication scheme to pass control and data packets through a TelosB base station connected to an Android smartphone.

Android OS Kernel Support. To prepare the phone to support external USB devices, a kernel EHCI or host controller driver is required. However, the USB host controller hardware documentation of the Google Developer phone is not publicly available. We incorporate the suggestions from [5] and modify the Freescale MPC5121 driver to work with the Qualcomm MSM7201A chipset on the Android phone. With these modifications, the host controller is able to recognize USB devices with a caveat: enabling the host controller disables the USB client mode for PC SD card access.

Hardware Support. The EHCI controller of the phone does not provide any power to external devices, such as the sensor motes in our case. To solve this limitation, we build a special USB cable that includes an external 5V battery pack with a peak load of 0.5A. The positive and ground cables are cut on the phone side such that only the USB device, not the phone, receives power. This also has the added benefit of not placing an extra load on the phone battery.

TinyOS Support on Android. Two challenges exist in providing Android TinyOS communication support: systems implementation and TinyOS software modifications. 1) Each mote has an FTDI USB serial chip that can be used to communicate with the host. The Linux FTDI driver creates a serial port device in the /dev directory. Android includes a minimal device manager that creates devices with insufficient privileges. This device manager is modified so that correct privileges are established. 2) TinyOS uses a binary serialization to communicate bidirectionally between the mote and host with the help of a C++ JNI interface. We modify the TinyOS JNI interface and associated Java libraries to compile and function on the Android platform. With such modifications, Android applications can send and receive TinyOS packets using the same Java interfaces available on a PC.

3.2.2 Android App

Figure 1: PBN activity status view.

Figure 2: Ground truth logging.

To provide a user-friendly front end for PBN, we implement an Android app to allow for easy configuration, runtime deployment, ground truth labeling, and data storage and upload for offline analysis. The GUI component allows for user control of both phone and remote wireless sensors. The GUI also receives feedback as to the current activity and whether or not classifier retraining is needed, and also provides ground truth labels to the classifier.

Sensor Configuration. Our PBN Android app provides an interface to allow for easy configuration of both phone sensors as well as remote TinyOS sensors. Using our software interface, a user can easily add or remove TinyOS nodes as well as phone and TinyOS-based sensors. The user is also able to adjust sampling rates as well as local data aggregation intervals to reduce the number of radio transmissions. A user's sensor and sampling configuration can be stored on the phone in an XML file so that configuration must only be performed once. Users of different phones can also exchange saved sensor configurations.

Runtime Control and Feedback. With a created sensor configuration, a PBN user is able to quickly start and stop data sampling and activity recognition. As depicted in Figure 1, the current activity is displayed during runtime along with a configurable histogram of the most recent activities. When our classifier determines that accuracy is dropping and retraining with more labeled ground truth is needed, the PBN app prompts the user to input the current activity, illustrated in Figure 2. During retraining, an indicator and button on the current activity screen appears, allowing the user to log any changes in the ground truth. Labeled training data can be stored locally for later retraining or uploaded to a server for sharing and offline analysis.

3.3 PBN Architecture

Figure 3: PBN System Architecture.

Illustrated in Figure 3, our Practical Body Networking (PBN) system resides solely on TinyOS-based motes and on an Android smartphone with no reliance on a backend server. Multiple TinyOS-based motes, each containing one or more sensors of different modalities, are attached on-body. Each mote communicates wirelessly (dotted lines) with a TinyOS mote base station, which is connected via USB (solid lines) to an Android smartphone.


Phone and mote sensors (white dotted areas) sample data at a user configured rate and aggregate data from each sensor over a larger aggregation interval. Aggregated data for each sensor is returned at each aggregation interval. For each TinyOS mote, aggregated data is returned wirelessly in a single packet. We provide a reliable communication scheme between the phone and motes and determine through our 2 week experiment that for most activities, packet loss rates are below 10%, with 99% of packets received after one retransmission, even when using the lowest transmission power.

Aggregated data from phone and mote sensors is fed into the PBN classification system (gray dotted area in Figure 3) at each aggregation interval to make a classification decision. The classifier, AdaBoost, is initially trained with the user labeling a short two minute period of each pre-defined activity (Ground Truth Management), but training can be updated online through Retraining Detection. During initial training and retraining, the Sensor Selection module reduces the training overhead incurred by AdaBoost by choosing only the most capable sensors for performing accurate classification. We now describe the core of our PBN platform:

Activity Classification. We use AdaBoost [8], an ensemble learning algorithm, as our activity recognition classifier, which resides entirely on a smartphone. AdaBoost combines an ensemble of weak classifiers together to form a single, more robust classifier. With this approach, we are able to train weak classifiers for each sensor in our deployment and combine them together to recognize activities. AdaBoost is able to maximize training accuracy by selecting only the most capable sensors for use during runtime. We improve upon AdaBoost by providing online training and retraining detection as well as improve computational overhead through sensor selection.

Retraining Detection. Since initial training data may not be sufficient to account for body sensor network dynamics, AdaBoost retraining is needed in order to ensure high accuracy throughout the deployment lifetime. We investigate the discriminative power of individual sensors, and from our analysis we are able to detect when retraining is needed during runtime without the use of labeled ground truth. For each sensor, we use Kullback-Leibler divergence [13] to determine when runtime data is sufficiently different from training data and use a consensus-based approach to initiate retraining when enough sensors indicate that retraining is needed.

Ground Truth Management. When retraining is needed, we investigate how much new data to collect and label in order to ensure BSN dynamics are captured, yet minimize the intrusiveness of the user manually annotating his or her current activities. We also determine how to maintain a balance of training data for each activity to ensure AdaBoost trains properly and provides maximum runtime accuracy.

Sensor Selection. During training, AdaBoost trains multiple weak classifiers for every sensor in the deployment, even if many sensors are never chosen by AdaBoost when training is complete. Through analysis of the theory behind ensemble learning, we identify both helpful and redundant sensors through the use of the Pearson correlation coefficient [25]. Based on our analysis, we provide a method to give AdaBoost only the sensors that provide a significant contribution towards maximizing accuracy, thus reducing online training overhead.

3.4 Experimental Setup

Figure 4: Subject 1. Figure 5: Subject 2.

For the remainder of the paper, we will refer to our activity recognition experiment, in which we collected two weeks of sensor data using two subjects, depicted in Figures 4 and 5. Each subject wore five Crossbow IRIS motes wirelessly linked to a TelosB base station and Android HTC G1 smartphone. The mote and sensor configuration for our experiment is summarized in Table 2. On the phone, which we attach to the waist, we make use of the 3-axis accelerometer as well as velocity from WiFi and GPS, with GPS active only when PBN detects the user is outdoors. On the mote, we use an MTS310 sensorboard with the following sensors: 2-axis accelerometer, microphone, light, and temperature. In addition to the sensors on the mote, the base station also collects RSSI information from each received packet, which has been previously demonstrated [22] to provide insight into body posture. During initial and online training, all sensors are active, while only sensors selected during training remain active during the remaining sampling periods.

Table 2: PBN Deployment Configuration.

Node  | ID | Location | Sensors
Phone | 0  | R. Waist | 3-Axis Acc., GPS/WiFi (velocity)
IRIS  | 1  | L. Wrist | 2-Axis Acc., Mic., Light, Temp.
IRIS  | 2  | R. Wrist | 2-Axis Acc., Mic., Light, Temp.
IRIS  | 3  | L. Ankle | 2-Axis Acc., Mic., Light, Temp.
IRIS  | 4  | R. Ankle | 2-Axis Acc., Mic., Light, Temp.
IRIS  | 5  | Head     | 2-Axis Acc., Mic., Light, Temp.

For the microphones and accelerometers, raw ADC values are sampled at 20ms intervals to ensure quick body movements can be captured, with light and temperature ADC readings sampled at 1s intervals, and GPS/WiFi sampled every 10s. To reduce communication overhead, data for each sensor is aggregated locally on each node at 10s intervals, which is well within the time granularity of the activities we classify. To reduce the complexity of local aggregation, data from each accelerometer axis is treated as a separate sensor. During local aggregation, light and temperature sensor readings are averaged since these sensor readings remain relatively stable for each activity. Except for GPS/WiFi, all other sensors compute the difference between the highest and lowest readings for each aggregation interval, since a change in readings indicates body movement or sound.
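As a concrete illustration of this aggregation rule, the following minimal Python sketch (function and variable names are illustrative, not from the PBN implementation) reduces one window of raw readings to the single value reported to the phone:

import statistics

def aggregate_interval(samples, sensor_type):
    # Light/temperature are averaged (stable per activity); other
    # sensors report max - min, which captures movement or sound.
    if sensor_type in ("light", "temperature"):
        return statistics.fmean(samples)
    return max(samples) - min(samples)

# Example: one (truncated) 10 s window of accelerometer ADC readings
# sampled every 20 ms.
acc_x_window = [512, 530, 498, 601, 455]
print(aggregate_interval(acc_x_window, "accelerometer"))  # -> 146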

During the two week period, both subjects recorded all activity ground truth in order to evaluate the accuracy of training data (training accuracy) and runtime accuracy. We recorded ground truth for the 3 classification categories illustrated in Table 1: Environment, Posture, and Activity.

4 AdaBoost Activity Recognition

The core of our activity recognition approach uses ensemble learning, specifically AdaBoost.M2 [8], which we expand and improve upon in subsequent sections. In this section, we explain how we adapt AdaBoost to run on a phone for use with a body sensor network. AdaBoost is lightweight enough for mobile phones, yet previous work [14] [20], while relying on offline processing with no feedback or user control, has demonstrated AdaBoost to be accurate for classification applications. Furthermore, without user parameter tuning, our implementation of AdaBoost is able to choose the right sensors needed to maximize training accuracy for all activities. Other classifiers commonly used for activity recognition use a combination of complex, specialized classifiers per sensor modality [17] [19] or per activity [16] which require extensive parameter tuning. Other techniques use single classifiers which are computationally demanding for mobile phones, such as GMMs [18] or HMMs [9].

Using AdaBoost, we incrementally build an ensemble of computationally inexpensive weak classifiers, each of which is trained from the labeled training observations of a single sensor. Weak classifiers need only make classification decisions that are slightly correlated with the ground truth; their capabilities are combined to form a single accurate classifier. The completed ensemble may contain multiple weak classifiers for the same sensor; some sensors may not have trained classifiers in the ensemble at all. AdaBoost incrementally creates such sensor-based weak classifiers by emphasizing the training observations misclassified by previous classifiers, thus ensuring that training accuracy is maximized.

Using Algorithm 1, we describe AdaBoost training. We define a set of activities A = {a_1, ..., a_a}, sensors S = {s_1, ..., s_m}, and observation vectors O_j for each sensor s_j ∈ S, where each sensor has n training observations. The training output is an ensemble of weak classifiers H = {h_1, ..., h_T}, where h_t ∈ H represents the weak classifier chosen in the t-th iteration. We initialize a set of equal weights D_1 for each training observation, where during the training process, greater weights for an observation represent greater classification difficulty.

During each iteration t, we train a weak classifier h_{t,j} for each sensor s_j ∈ S using observations O_j and weights D_t. We then compute the weighted classifier error ε_{t,j} for each trained sensor classifier, adding to H only the sensor classifier which has the lowest weighted error. Before the next iteration, the observation weights D_t are updated based on the current weights and the misclassifications made by the selected classifier.

Algorithm 1 AdaBoost Training

Input: Max iterations T, training obs. vector O_j for each sensor s_j ∈ S, obs. ground truth labels
Output: Set of weak classifiers H
1: Initialize observation weights D_1 to 1/n for all obs.
2: for t = 1 to T do
3:   for sensor s_j ∈ S do
4:     Train weak classifier h_{t,j} using obs. O_j, weights D_t
5:     Get weighted error ε_{t,j} for h_{t,j} using labels [8]
6:   end for
7:   Add the h_{t,j} with least error ε_t to H
8:   Set D_{t+1} using D_t, misclassifications made by h_t [8]
9: end for
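The Python sketch below illustrates the training loop of Algorithm 1. It is a simplified AdaBoost.M1-style variant: the binned weak learners return hard activity decisions rather than the probability vectors of AdaBoost.M2, and all names are illustrative rather than taken from the PBN implementation.

import numpy as np

def train_binned_weak_classifier(obs, labels, weights, n_bins=10, n_activities=9):
    # Histogram ("binned naive Bayes") weak learner for one sensor:
    # each bin predicts the activity holding the most observation weight.
    lo, hi = obs.min(), obs.max()
    scale = hi - lo + 1e-9
    bins = np.clip(((obs - lo) / scale * n_bins).astype(int), 0, n_bins - 1)
    table = np.zeros((n_bins, n_activities))
    for b, y, w in zip(bins, labels, weights):
        table[b, y] += w
    predict = table.argmax(axis=1)
    def classify(x):
        b = int(np.clip((x - lo) / scale * n_bins, 0, n_bins - 1))
        return int(predict[b])
    return classify

def adaboost_train(sensor_obs, labels, T=300):
    # sensor_obs: one observation vector O_j (numpy array of length n)
    # per sensor; labels: numpy array of activity indices.
    n = len(labels)
    D = np.full(n, 1.0 / n)
    ensemble = []  # (sensor index, classifier, alpha) triples
    for _ in range(T):
        best = None
        for j, obs in enumerate(sensor_obs):
            h = train_binned_weak_classifier(obs, labels, D)
            preds = np.array([h(x) for x in obs])
            err = float(D[preds != labels].sum())
            if best is None or err < best[3]:
                best = (j, h, preds, err)
        j, h, preds, err = best
        err = min(max(err, 1e-9), 1 - 1e-9)
        alpha = float(np.log((1 - err) / err))
        ensemble.append((j, h, alpha))
        D = D * np.exp(alpha * (preds != labels))  # emphasize mistakes
        D = D / D.sum()
    return ensemble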

Given an observation o, each weak classifier returns a probability vector [0,1]^a, with each scalar representing the probability that the current activity is a_i. To train a weak classifier h_{t,j} for each sensor s_j ∈ S, we use a naive Bayes model where training observations O_j are placed into one of 10 discrete bins. The bin interval is based on the minimum and maximum possible readings for each sensor. Each binned training observation from O_j is assigned its respective weight from D_t and ground truth label in the set of activities A. A new observation is classified by placing it into a bin, with the generated probability output vector corresponding to the weights of training observations present for each activity in the assigned bin. With a weak classifier chosen for each iteration, Equation 1 defines the output of the AdaBoost classifier for each new observation o during runtime:

h(o) = \arg\max_{a_i \in A} \sum_{t=1}^{T} \left( \log \frac{1 - \varepsilon_t}{\varepsilon_t} \right) h_t(o, a_i)   (1)

In Equation 1, the activity probability vector for each weak classifier h_t is weighted by the inverse of its error ε_t. Thus, the weak classifiers with the lowest training error have the most weight in making classification decisions. To put it another way, AdaBoost chooses the sensors with weak classifiers that minimize weighted training error, achieving maximum training accuracy for all activities.
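Continuing the sketch above, the runtime decision of Equation 1 becomes a weighted vote, again simplified to hard per-classifier decisions rather than AdaBoost.M2's probability vectors:

def adaboost_classify(ensemble, observation, n_activities=9):
    # Equation 1: each weak classifier votes for an activity with
    # weight alpha_t = log((1 - eps_t) / eps_t); the activity with the
    # largest total weight wins. `observation` holds one aggregated
    # value per sensor, indexed by sensor.
    votes = [0.0] * n_activities
    for j, h, alpha in ensemble:
        votes[h(observation[j])] += alpha
    return max(range(n_activities), key=votes.__getitem__)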

In Figure 6, we depict AdaBoost training and runtime performance for T = 1 to 300 AdaBoost iterations using data from Subject 1 and ground truth from the Activity classification category. For this paper, we use 300 iterations to achieve maximum accuracy. We also show in Figure 6 that AdaBoost does not choose all of the 34 sensors in our experiment, for even with 300 iterations it only chooses 18. During runtime, any sensor not selected by AdaBoost will be disabled to save energy and communication costs.

Lastly, in Figure 7, we demonstrate that AdaBoost can perform very accurately with a sizable amount of training data: maximum training and runtime accuracy is achieved with roughly 50 observations per activity for Subject 1.

Figure 6: Training and runtime accuracy for 1-300 AdaBoost iterations. (Plot: AdaBoost iterations vs. number of sensors chosen by AdaBoost and accuracy; series: cluster size, training accuracy, runtime accuracy.)

Figure 7: Average accuracy and standard deviation for 10-100 random training observations per activity. (Plot: training observations per activity vs. average accuracy; series: training accuracy, runtime accuracy.)

Figure 8: K-L divergence for each sensor and each activity. (Plot: per-sensor K-L divergence for the activities Cleaning, Cycling, Driving, Eating, Meeting, TV, Walking, Reading, and Working, across sensors 0,P-LOC through 5,TEMP.)

Figure 9: Sensor K-L divergence and training accuracy. (Plot: sensor K-L divergence vs. sensor accuracy, one series per activity.)

However, through retraining in Section 5, we can train AdaBoost with a very small set of initial training data and update AdaBoost during runtime.

5 Retraining Detection

In this section, we propose an online training approach to achieve high accuracy with a limited initial training data set. This approach can also be used to retrain when an existing data set is not accurate enough. First, we investigate how to quantify the discriminative power of each sensor to detect when runtime data is significantly different than training data. Second, we use the insight we gain from the discrimination analysis to detect when AdaBoost retraining is needed during runtime without retrieval of ground truth information.

5.1 Sensor Data Discrimination Analysis

We investigate how to quantify the discriminative power of each sensor and predict if it is accurate, employing the use of Kullback-Leibler divergence [13]. K-L divergence measures the expected amount of information required to transform samples from a distribution P into a second distribution Q. Using trace data from Subject 1, we demonstrate the use of K-L divergence per activity to identify which sensors are best able to distinguish between activities. We also show a clear relationship between K-L divergence per activity and training accuracy. We conclude that K-L divergence can be used to detect when retraining is needed without regular ground truth requests to compute classifier accuracy: sensors need only compare training data to current runtime data.

In Figure 8, we analyze the ability of each sensor to discriminate between different activities; we calculate K-L divergence as "one against the rest," where one data distribution is calculated for the target activity and another distribution is calculated for all other activities. For all analyses using K-L divergence, we discretize continuous-valued sensor data into 100 bins. The figure shows that some sensors, such as those on the hands (nodes 1 and 2), have poor ability to distinguish between any activity. Conversely, sensors on the feet (nodes 3 and 4) are especially good at detecting activities that involve motion, such as walking and cycling. The GPS/WiFi velocity (0-LOC) has the highest K-L divergence of all for detecting driving, with a value of nearly 14.
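A minimal sketch of this "one against the rest" computation, assuming 100 equal-width histogram bins; function names are illustrative, not from the PBN source:

import numpy as np

def kl_divergence(p_samples, q_samples, n_bins=100):
    # D_KL(P || Q) between two empirical distributions of one sensor.
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    p, _ = np.histogram(p_samples, bins=n_bins, range=(lo, hi))
    q, _ = np.histogram(q_samples, bins=n_bins, range=(lo, hi))
    # Smooth empty bins so the divergence stays finite.
    p = (p + 1e-6) / (p + 1e-6).sum()
    q = (q + 1e-6) / (q + 1e-6).sum()
    return float(np.sum(p * np.log(p / q)))

def one_vs_rest_divergence(sensor_data, labels, target_activity):
    # K-L divergence of the target activity's readings against the
    # readings of all other activities, for a single sensor.
    mask = labels == target_activity
    return kl_divergence(sensor_data[mask], sensor_data[~mask])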

In Figure 9, we compare K-L divergence per sensor and activity to individual sensor training accuracy using Nearest Centroid [11]. Nearest Centroid is a variant of k-means clustering and, unlike the AdaBoost weak classifiers, it is an inexpensive classifier that does not require a weight for each training observation. The figure shows a clear relationship between K-L divergence and runtime accuracy, indicating that for a given activity, sensors with high K-L divergence are much more likely to have high accuracy compared with sensors with low K-L divergence. As with Figure 8, activities involving motion, such as cycling, walking, and driving, are most easily distinguished and have the highest accuracy. From Figures 8 and 9, we can conclude that K-L divergence will allow sensors to tell activities apart and can even be used to tell if training data and runtime data for the same activity are sufficiently different that retraining is needed.

5.2 Consensus-based Detection

During runtime, retraining detection is performed in two steps, using the insight gained in Section 5.1. First, at each aggregation interval, each active sensor independently determines if retraining is needed. Second, for some aggregation interval, if enough sensors determine that retraining is needed, PBN prompts the user to record ground truth for a short period. During this period, all sensors are woken up and sample data, which is then labeled by the user. When the retraining period completes, a new AdaBoost model is trained using both the old and new training data.

Step 1. Individual Sensor Retraining Detection. In Section 5.1, we demonstrate that K-L divergence is a powerful tool for determining sensor classification capability. Here, we use the discriminative power of K-L divergence to determine when retraining is needed for each sensor. For a sensor to determine retraining is needed, two conditions must hold: 1) the runtime data distribution for the current activity must be significantly different from the training data distribution, and 2) the runtime data distribution must have at least as many observations as the training data distribution.

During training, each sensor chosen by AdaBoost computes K-L divergence for each activity using its labeled sensor training data. Training K-L divergence per activity is used as a baseline to compare against during runtime to determine when retraining is needed. For each activity a_i ∈ A, training sensor data distribution T_i for activity a_i, and training sensor data distribution T_o for all activities a_j ∈ A \ {a_i}, each sensor computes the K-L divergence between T_i and T_o, D_KL(T_i, T_o). Then, during runtime, for each new observation, AdaBoost classifies the observation as activity a_i ∈ A, and each active sensor adds its data to a runtime distribution R_i for activity a_i. An active sensor s_j determines retraining is needed when the runtime-training K-L divergence is greater than the training K-L divergence for the current activity a_i: D_KL(R_i, T_i) > D_KL(T_i, T_o).

During runtime, to ensure a fair comparison between training and runtime K-L divergence, each sensor does not determine if retraining is needed until the runtime distribution R_i has at least as many observations as the training data distribution T_i. We determine through evaluation that collecting fewer runtime observations per activity yields similar accuracy but more retraining instances, which imparts a greater burden on the user. Since Figure 7 demonstrates that 100 observations per activity is more than sufficient to achieve maximum runtime accuracy (this is also true for Subject 2), we limit training data to 100 observations per activity to reduce the delay before the training observations and runtime observations are compared.
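Combining the two conditions, the per-sensor test of Step 1 can be sketched as follows, reusing the kl_divergence helper above; names are illustrative, and R_i, T_i, and T_o are numpy arrays of aggregated readings for one sensor:

def sensor_needs_retraining(runtime_i, train_i, train_rest):
    # Flag retraining when the runtime distribution for the current
    # activity diverges from its training distribution more than the
    # training distribution diverges from all other activities.
    if len(runtime_i) < len(train_i):
        return False  # not enough runtime observations yet
    baseline = kl_divergence(train_i, train_rest)  # D_KL(T_i, T_o)
    drift = kl_divergence(runtime_i, train_i)      # D_KL(R_i, T_i)
    return drift > baseline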

Step 2. Consensus Detection: Combining Individual Sensor Decisions. During each runtime interval, PBN checks to see how many sensors indicate retraining is necessary. If a weighted number of sensors surpasses a threshold, new ground truth is collected and a new AdaBoost classifier is trained. We describe how this consensus threshold for retraining is obtained.

First, upon completion of AdaBoost training, we can determine the AdaBoost weight of the sensor classifiers used to make correct classification decisions. Through an extension of Equation 1, in Equation 2, we output the weight w(o, a_i) of the correct weak classifiers given an observation o and correct AdaBoost decision a_i:

w(o, a_i) = \sum_{t=1}^{T} \left( \log \frac{1 - \varepsilon_t}{\varepsilon_t} \right) h_t(o, a_i)   (2)

Next, using training data, a trained AdaBoost classifier, and Equation 2, we determine the average weight w_avg for all correct training classification decisions (o, a_i). To do this, we compute a weight w_j for each active sensor s_j, indicating how important its decisions are relative to other sensors. Depicted in Equation 3, w_j is computed as the sum of the inverse error rates for each weak classifier belonging to sensor s_j (multiple classifiers for s_j may be iteratively trained and chosen by AdaBoost). Function f(j) maps sensor s_j to each AdaBoost iteration t where AdaBoost chooses the weak classifier trained by s_j.

w_j = \sum_{\forall t \in f(j)} \log \left( \frac{1 - \varepsilon_t}{\varepsilon_t} \right)   (3)

At each runtime interval, we sum the weights w_j for each sensor s_j that indicates retraining is necessary. If the sum of the sensor weights is greater than w_avg, PBN notifies the user to collect ground truth and retrain a new AdaBoost classifier.
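A sketch of this consensus check, assuming the (sensor, classifier, alpha) ensemble format from the Section 4 sketch, where alpha_t = log((1 − ε_t)/ε_t); names are illustrative:

def sensor_weights(ensemble):
    # Equation 3: w_j is the summed alpha of every weak classifier
    # AdaBoost trained from sensor j.
    w = {}
    for j, _, alpha in ensemble:
        w[j] = w.get(j, 0.0) + alpha
    return w

def consensus_retrain(flags, ensemble, w_avg):
    # flags[j] is True when sensor j's K-L test asked for retraining;
    # retrain when the flagged sensors' total weight exceeds w_avg.
    w = sensor_weights(ensemble)
    total = sum(w.get(j, 0.0) for j, flagged in flags.items() if flagged)
    return total > w_avg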

6 Ground Truth Management

We address two issues regarding labeling and addition of new sensor data to the existing training dataset during retraining. First, we address how much new data to add to the training set. Second, we show how to maintain a balance between training data set sizes for each activity to ensure AdaBoost retrains properly and maintains high runtime accuracy.

When PBN decides retraining is required, it prompts the user to log ground truth for a window of N aggregation intervals. Through evaluation, we find that recording ground truth retroactively for the data that triggered the retraining results in no change in accuracy, since any such significant change in sensor data is persistent. During the ground truth logging period, a user only notes the current activity at the start of the logging period and any activity changes throughout the remainder of the logging period. In the evaluation, we determine that with N = 30 (5 minutes), the retraining ground truth collection window is short enough not to be intrusive to the user but long enough to capture changes in PBN dynamics. When the ground truth labeling period is complete, the new labeled sensor data is added to the existing training set and a new AdaBoost model is trained.

Since the core of AdaBoost training relies on creating a weight distribution for all training observations based on classification difficulty, each activity must be given nearly equal amounts of training data (within an order of magnitude) for AdaBoost to train properly [10]. Without a balance in training observations across all activities, AdaBoost will focus on training activities with more training observations, creating poor runtime accuracy for activities with fewer training observations. However, if we are too restrictive in enforcing such a balance, very few new training observations will be added to the training dataset during retraining, resulting in poor adaptation to the new data. In Equation 4, we ensure that each activity a_i ∈ A has no more than δ times the average number of training observations per activity, where O is the set of training observations.

\frac{|O_i| - \frac{1}{|A|} \sum_{\forall a_k \in A} |O_k|}{\frac{1}{|A|} \sum_{\forall a_k \in A} |O_k|} \le \delta   (4)


The limit imposed in Equation 4 allows AdaBoost to place equal emphasis on training weak classifiers for each activity during retraining. If, during retraining, an activity exceeds this limit, data is removed until the number of observations is under the limit. We remove observations at random to reach the activity limit since we do not know how representative each observation is of its labeled activity; some observations may contain more noise than others.
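A minimal sketch of this balance enforcement, rearranging Equation 4 into the per-activity cap |O_i| ≤ (1 + δ) times the average; the mapping layout, the δ default, and all names are illustrative assumptions:

import random

def enforce_balance(training_set, delta=1.0):
    # training_set maps each activity to its list of labeled
    # observations; trim activities exceeding the Equation 4 limit.
    avg = sum(len(obs) for obs in training_set.values()) / len(training_set)
    limit = int((1 + delta) * avg)
    for activity, obs in training_set.items():
        if len(obs) > limit:
            # Remove random observations; we cannot tell which are
            # the most representative of the activity.
            training_set[activity] = random.sample(obs, limit)
    return training_set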

7 Sensor Selection for Efficient Classification

While AdaBoost, through training, provides a measure of sensor selection by weighting the most accurate sensors, this approach can be computationally demanding: at each AdaBoost training iteration, a weak classifier is trained and evaluated for each sensor. In this section, we focus on identifying redundant sensors and excluding them from AdaBoost training to reduce the number of weak classifiers trained by AdaBoost. To achieve this goal, we first investigate why ensemble learning algorithms are successful: weak classifiers must be relatively accurate and different weak classifiers must have diverse prediction results [30]. Second, by identifying sensors that satisfy these properties, we provide a method to detect redundant sensors during runtime and exclude them from input to AdaBoost during online retraining. Our method adapts to BSN dynamics during runtime to ensure only helpful sensors are used by AdaBoost.

7.1 Identifying Sensing Redundancies

With ensemble learning and AdaBoost, each weak classifier in the ensemble must be slightly correlated with the true classification to achieve high runtime accuracy. Since we use data from a single sensor to train each weak classifier, it follows that sensors chosen by AdaBoost will be similarly correlated both in terms of raw data and in classification decisions made by each weak classifier. While the Pearson correlation coefficient [25] is also used in previous works [26] [31] to identify packet loss relationships between different wireless radio links, here we use correlation to identify sensing relationships between different sensors, identifying both sensors that are helpful as well as redundant.

Figure 10 depicts the correlation between each sensor pair using the raw sensor data collected from Subject 1. It is noted that several correlation patterns exist: accelerometers are strongly correlated, as are light and temperature sensors. We can use this information to find sensors with redundant data and remove them from the AdaBoost training process to save computational overhead as well as energy consumption.

Figure 10: Raw data correlation for Subject 1. (Heatmap: pairwise correlation, from -1 to 1, over all phone and mote sensors.)
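The correlation analysis itself is a direct application of the Pearson coefficient; a sketch, assuming an observations-by-sensors matrix of aggregated readings (the data shape, threshold, and names are illustrative):

import numpy as np

def pairwise_correlation(data):
    # Pearson correlation coefficient between every sensor pair,
    # where data is an (n_observations x n_sensors) array.
    return np.corrcoef(data, rowvar=False)

# Strongly correlated pairs (|r| near 1) are redundancy candidates;
# pairs with |r| near 0 contribute diverse information.
corr = pairwise_correlation(np.random.rand(1000, 34))  # placeholder data
redundant_pairs = np.argwhere(np.triu(np.abs(corr) > 0.9, k=1))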

Figure 11: Decision correlation between individual sensors and sensor clusters. (Plot: decision correlation vs. accuracy change ratio when a new sensor is added to a cluster.)

We next illustrate that correlation patterns exist not only between raw sensor data but also between sensor and sensor cluster classifier decisions. In Figure 11, with data from Subject 1, we show that when we add a new sensor to an existing sensor cluster, the greatest accuracy increase is achieved when the sensor and cluster have uncorrelated classification decisions. The figure shows the decision correlation and accuracy change for nearly 7,000 randomly generated clusters of size 1 through 19 using data from Subject 1, with individual sensor and sensor cluster classifiers trained using Nearest Centroid [11]. To compute the decision correlation for a classifier, each correct decision is recorded as 1 and each incorrect decision is recorded as 0. From the figure, it is clear that choosing sensors with a decision correlation close to 0 can help select sensors that will make the most contribution towards an accurate sensor cluster.
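A sketch of this decision correlation metric, assuming the 0/1 encoding described above; names are illustrative:

import numpy as np

def decision_vector(predictions, labels):
    # Encode a classifier's decisions as 1 (correct) or 0 (incorrect).
    return (np.asarray(predictions) == np.asarray(labels)).astype(float)

def decision_correlation(preds_a, preds_b, labels):
    # Correlation between two classifiers' correctness patterns;
    # values near 0 suggest the classifiers complement each other.
    a = decision_vector(preds_a, labels)
    b = decision_vector(preds_b, labels)
    return float(np.corrcoef(a, b)[0, 1])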

7.2 Adaptive Runtime Sensor Selection

We now describe how to reduce the number of sensors given as input to AdaBoost during online retraining, reducing retraining overhead while achieving high runtime accuracy. We perform sensor selection using raw sensor data correlation, also extending the concept to sensor and sensor cluster classifier decision correlation.

Sensor selection consists of two components: Threshold Adjustment and Selection. During Threshold Adjustment, using a trained AdaBoost classifier, a threshold α is computed which discriminates between sensors chosen by AdaBoost and unused sensors. During Selection, a previously computed α value is used to select a set of sensors for input to AdaBoost during retraining. Threshold Adjustment is performed during initial training while Selection is performed during subsequent retrainings. To ensure the selection threshold stays current with BSN dynamics, we update the threshold α periodically during runtime retraining using Threshold Adjustment and apply smoothing using a moving window of previous α values.

Threshold Adjustment. During initial training, we ini-tialize a sensor selection threshold α and update it duringruntime retraining to adjust to any changes in user geograph-ical location, biomechanics, or environmental noise. To de-

Page 10: PBN:TowardsPracticalActivityRecognitionUsing Smartphone ...lusu/cse726/papers/PBN Towards... · Lastly, several activity recognition methods do not pro-vide online training to account

Algorithm 2 Raw Correlation Threshold for Sensor Selection
Input: Set of sensors S selected by AdaBoost, training observations for all sensors O, multiplier n
Output: Sensor selection threshold α
1: R = ∅ // set of correlation coefficients
2: for all combinations of sensors s_i and s_j in S do
3:   Compute correlation coefficient r = |r_{O_i, O_j}|
4:   R = R ∪ {r}
5: end for
6: // compute threshold as avg + (n × std. dev.) of R
7: α = µ_R + n·σ_R

To determine the threshold α, we first train AdaBoost with all sensors as input and determine S, the set of sensors selected by AdaBoost. Using Algorithm 2, we compute the correlation coefficient r of raw sensor training data for each pair of sensors in S. With each correlation coefficient r stored in set R for all sensors selected by AdaBoost, we determine the mean correlation coefficient µ_R and standard deviation σ_R. We then set the threshold α to be n standard deviations above the mean correlation. We determine through empirical evaluation that n = 2 is sufficient to include nearly all sensors selected by AdaBoost while excluding unused sensors.
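A direct transcription of Algorithm 2 into Python might look as follows; this is our sketch, not the authors' implementation, and it assumes at least two sensors in S, each reduced to a 1-D stream of aligned training observations:

```python
import itertools
import numpy as np

def threshold_adjustment(selected, obs, n=2.0):
    """Algorithm 2: compute the sensor selection threshold alpha.

    selected: sensors chosen by AdaBoost (the set S)
    obs:      dict sensor -> 1-D numpy array of training observations
    n:        multiplier (n = 2 per the empirical evaluation)
    """
    R = [abs(np.corrcoef(obs[si], obs[sj])[0, 1])
         for si, sj in itertools.combinations(sorted(selected), 2)]
    # threshold = mean + (n * std. dev.) of the pairwise correlations
    return float(np.mean(R) + n * np.std(R))
```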

Selection. During retraining, when the threshold is not being updated, we choose a set of sensors S∗ from the set of all sensors S using the previously computed threshold α. The selected set S∗ is given as input to AdaBoost to reduce the overhead incurred by training on the entire set S. In Algorithm 3, we ensure that no two sensors in S∗ have a correlation coefficient that is greater than or equal to the threshold α, since we demonstrate in Section 7.1 that correlation closest to 0 yields the most accurate sensor clusters. To enforce the threshold, we maintain a set E, which contains sensors that have a correlation coefficient above the threshold with some sensor already in S∗. For each sensor pair in S, we determine the correlation coefficient r and add pairs to S∗ where r < α. When r ≥ α for some sensor pair, we add the less accurate sensor, as determined by Nearest Centroid [11], to E and the other to S∗ as long as it is not already in E.

Pair DC. We also propose two other correlation metrics with which to select sensors: pair decision correlation (Pair DC) and cluster decision correlation (Cluster DC). With these two metrics, we select sensors based on classification decisions made by individual sensors and sensor clusters. For a sensor or cluster classifier trained using Nearest Centroid, we can convert each training data decision into an integer value as in Section 7.1 for use in computing decision correlation. Sensor selection using sensor pair decision correlation is identical to raw data correlation, except for the use of individual sensor classifier decisions rather than raw data.

Cluster DC. For sensor cluster decision correlation, we modify Algorithms 2 and 3. Instead of iterating through all sensor pair combinations, we first find the individual sensor classifier with the highest accuracy using Nearest Centroid and add it to a trained sensor cluster C∗.

Algorithm 3 Sensor Selection Using Raw Correlation
Input: Set of all sensors S, training observations for all sensors O, threshold α
Output: Selected sensors S∗ to give as input to AdaBoost
1: S∗ = ∅
2: E = ∅ // set of sensors we exclude
3: for all combinations of sensors s_i and s_j in S do
4:   Compute correlation coefficient r = |r_{O_i, O_j}|
5:   if r < α then
6:     if s_i ∉ E then S∗ = S∗ ∪ {s_i}
7:     if s_j ∉ E then S∗ = S∗ ∪ {s_j}
8:   else if r ≥ α and acc(s_i) > acc(s_j) then
9:     // use accuracy to decide which to add to S∗
10:    if s_i ∉ E then S∗ = S∗ ∪ {s_i}
11:    E = E ∪ {s_j}; S∗ = S∗ \ {s_j}
12:  else
13:    if s_j ∉ E then S∗ = S∗ ∪ {s_j}
14:    E = E ∪ {s_i}; S∗ = S∗ \ {s_i}
15:  end if
16: end for
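Under the same assumptions and data layout as the Algorithm 2 sketch, Algorithm 3 translates roughly as below, with the dict `acc` standing in for each sensor's Nearest Centroid training accuracy:

```python
import itertools
import numpy as np

def sensor_selection(sensors, obs, acc, alpha):
    """Algorithm 3: keep sensor pairs whose raw-data correlation is
    below alpha; on a conflict, keep the more accurate sensor and
    exclude the other."""
    chosen, excluded = set(), set()
    for si, sj in itertools.combinations(sorted(sensors), 2):
        r = abs(np.corrcoef(obs[si], obs[sj])[0, 1])
        if r < alpha:
            if si not in excluded:
                chosen.add(si)
            if sj not in excluded:
                chosen.add(sj)
        else:
            # use accuracy to decide which sensor of the pair to keep
            keep, drop = (si, sj) if acc[si] > acc[sj] else (sj, si)
            if keep not in excluded:
                chosen.add(keep)
            excluded.add(drop)
            chosen.discard(drop)
    return chosen  # the reduced input set S* for AdaBoost retraining
```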

We then incrementally add sensors to C∗ as long as the sensor has a decision correlation with the cluster that is below the threshold. The final cluster is then used by AdaBoost for training.
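A greedy rendering of the Cluster DC variant is sketched below; `cluster_decisions` is a hypothetical helper that classifies the training data with the current cluster via Nearest Centroid, `decision_correlation` is the routine from the Section 7.1 sketch, and the accuracy-ordered candidate sequence is our assumption, since the text does not fix the order of incremental additions:

```python
def build_cluster(sensors, acc, decisions, truth, alpha, cluster_decisions):
    """Cluster DC: seed with the most accurate single-sensor classifier,
    then add each candidate only while its decision correlation with
    the current cluster stays below the threshold alpha.

    decisions: sensor -> per-interval training predictions
    cluster_decisions: callable(cluster) -> per-interval predictions
    """
    cluster = [max(sensors, key=lambda s: acc[s])]
    for s in sorted(set(sensors) - set(cluster), key=acc.get, reverse=True):
        dc = decision_correlation(decisions[s], cluster_decisions(cluster), truth)
        if dc < alpha:
            cluster.append(s)
    return cluster  # the final cluster handed to AdaBoost for training
```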

8 Evaluation

We evaluate our PBN activity recognition system using the configuration and data collection methods described in Section 3.4, with two weeks of activity recognition data and two subjects, one of whom is not an author. While these two subjects have substantially different routines and thus provide very different evaluation scenarios, we leave a more broadly focused evaluation with more subjects to future work. We first evaluate classification performance with a good initial training dataset in Section 8.1 and show that PBN can achieve good accuracy even for difficult to classify activities. In Section 8.2, we evaluate online training with a limited initial training dataset and compare our PBN retraining detection method to a periodic retraining approach, illustrating the benefits of PBN retraining detection. We then show the effectiveness of our sensor selection approach in Section 8.3 and evaluate our PBN application performance in terms of mobile phone hardware constraints in Section 8.4.

8.1 Classification Performance

In Figure 12, we first compare performance for both subjects in the three classification categories depicted in Table 1: Environment, Posture, and Activity, showing that PBN is able to achieve high runtime accuracy with a good initial training dataset (100 observations per activity) and no online training or sensor selection. Subject 2 does not perform the cycling, lying down, cleaning, or reading categories, hence there is no corresponding histogram bar. In addition to total accuracy, we plot per-activity accuracy, precision (true positives / (true positives + false positives)), and recall (true positives / (true positives + false negatives)).
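The per-activity precision and recall in Figure 12 follow the standard one-vs-rest counts; as a minimal worked example:

```python
def precision_recall(preds, truth, activity):
    """One-vs-rest precision and recall for a single activity label."""
    tp = sum(p == activity and t == activity for p, t in zip(preds, truth))
    fp = sum(p == activity and t != activity for p, t in zip(preds, truth))
    fn = sum(p != activity and t == activity for p, t in zip(preds, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical 5-interval run: 2 true positives, 1 false positive,
# 1 false negative for "walk" -> precision = recall = 2/3.
preds = ["walk", "sit", "walk", "walk", "sit"]
truth = ["walk", "walk", "walk", "sit", "sit"]
print(precision_recall(preds, truth, "walk"))
```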

Figure 12: Runtime performance comparison for multiple subjects and multiple classification category sets: (a)-(c) Environmental classification accuracy, precision, and recall; (d)-(f) Posture classification accuracy, precision, and recall; (g)-(i) Activity classification accuracy, precision, and recall. Subject 2 does not engage in the cycling, lying down, cleaning, and reading categories, so the corresponding bars are not presented.

Each classification category in Figure 12 has similar performance characteristics per subject: Subject 1 has total accuracies of 98%, 85%, and 90% for the Environmental, Posture, and Activity categories, while Subject 2 has respective total accuracies of 81%, 82%, and 76%. Interestingly, Subject 1 performs significantly better than Subject 2, since Subject 2 often performs each activity less cleanly, for example, working with different lights on in a room or eating in different locations, sometimes with friends. Lastly, more complex activities, which involve a combination of different physical movements as well as external sounds and light, perform reasonably well in comparison with more easily classified activities. Subject 1 has over 90% accuracy for complex activities like cleaning, eating, meeting, reading, and watching TV. Subject 2 also has over 80% accuracy for complex activities like eating, meeting, and watching TV.

We present a classification timeline for Subject 1 using the Activity category in Figure 13, comparing PBN classifications with ground truth. Here, it is apparent that activities involving movement, such as cycling, walking, and driving, are classified with few errors. More misclassifications are seen for more complex activities, such as working confused with meeting, cleaning, and watching TV; however, as illustrated in Figure 12, overall accuracy for each of these complex activities is near or above 90%.

In Figure 14, we show the normalized AdaBoost weights for each sensor for Subject 1 and the Activity classification category as calculated in Equation 3. In addition to the weights for the total of all classifications, we also compute the normalized weight of correct decisions made by each sensor for each activity using runtime data. This figure indicates that AdaBoost is able to select the right sensors to maximize training and runtime accuracy and exclude 16 unhelpful sensors (shown in black). The figure also shows heavy reliance on the light and temperature sensors to distinguish between indoor and outdoor activities, as well as reliance on the accelerometers to detect activities involving motion, such as walking, driving, and cycling.
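Equation 3 lies outside this excerpt; under the standard AdaBoost formulation, where boosting round t assigns its weak classifier a weight α_t, a per-sensor normalized weight can be obtained by summing the α_t of each sensor's weak classifiers, as in this assumed sketch:

```python
from collections import defaultdict

def normalized_sensor_weights(weak_classifiers):
    """Aggregate AdaBoost weak-classifier weights per sensor.

    weak_classifiers: list of (sensor, alpha_t) pairs, one per boosting
    round. Sensors never chosen end up with weight 0 ("unhelpful").
    A stand-in for Equation 3, which this excerpt does not reproduce.
    """
    totals = defaultdict(float)
    for sensor, alpha_t in weak_classifiers:
        totals[sensor] += alpha_t
    z = sum(totals.values()) or 1.0
    return {s: w / z for s, w in totals.items()}

rounds = [("1,LIGHT", 0.9), ("1,ACC-X", 0.7), ("1,LIGHT", 0.4)]  # hypothetical
print(normalized_sensor_weights(rounds))  # {'1,LIGHT': 0.65, '1,ACC-X': 0.35}
```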

8.2 Online Training

In this section, we demonstrate that we can use a small training dataset, perform online training through retraining detection, and achieve similar accuracy as if we had a larger initial training dataset. We focus on the Activity classification category and initialize each runtime configuration with 10 random training data samples for each activity. To limit AdaBoost training overhead, we enforce a maximum of 100 training observations per activity, since Figure 7 demonstrates that this number is more than enough to achieve maximum runtime accuracy. Since we are using random initial training data, we compute the average and standard deviation over 10 runs of each configuration.

Ground Truth Management. First, we investigate the retraining window size described in Section 6 for collecting new ground truth in Figure 15. From the figure, larger window sizes mean significantly less retraining and computational overhead but also lower runtime accuracy. We use a smaller window size of 30 new data elements per retraining to balance accuracy and computational overhead.


Figure 13: PBN decision and ground truth timeline for Subject 1 (activity vs. time interval, ×10 s).

Figure 14: AdaBoost sensor weights per activity (normalized weight for each sensor, covering the phone's P-LOC and P-ACC channels and the RSSI, ACC, MIC, LIGHT, and TEMP channels of nodes 1-5, for the total and for each activity).

We argue that roughly 20-40 retraining instances per subject over a two week period, as shown in Figure 15, is an inconsequential burden, since the subject must only interact with the phone once per instance to input his or her current activity.
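As a sketch of this mechanism (our reading of the window setting, not the authors' code), newly labeled observations accumulate in a buffer, and each retraining consumes a batch of N of them:

```python
class GroundTruthBuffer:
    """Accumulate user-labeled observations; report when N new
    elements are available for the next retraining (N = 30 here)."""
    def __init__(self, window=30):
        self.window = window
        self.pending = []

    def add(self, observation, label):
        self.pending.append((observation, label))
        return len(self.pending) >= self.window  # True: enough for retraining

    def drain(self):
        """Hand the batch of new ground truth to the trainer."""
        batch, self.pending = self.pending, []
        return batch
```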

Next, we show the ideal training data balance restriction δ, from Equation 4, in Figure 16. As δ approaches 2.0, runtime accuracy increases and the number of retraining instances decreases. We choose δ = 2.0 to achieve high accuracy and fewer retraining instances, as larger δ values do not improve accuracy or reduce the number of retraining instances. Furthermore, since we enforce a maximum number of training observations per activity, larger δ values have no effect on the balance of training data among activities.
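Equation 4 is not reproduced in this excerpt; one plausible reading of the balance restriction, combined with the 100-observation cap from above, is that no activity may hold more than δ times the observations of the least represented activity, as sketched here:

```python
def enforce_balance(per_activity, delta=2.0, cap=100):
    """Trim training data so no activity exceeds `cap` observations and
    no activity holds more than `delta` times the observations of the
    rarest activity. An assumed reading of the balance restriction;
    Equation 4 itself is not shown in this excerpt.

    per_activity: dict activity -> list of observations (newest last).
    """
    trimmed = {a: obs[-cap:] for a, obs in per_activity.items()}
    floor = min(len(obs) for obs in trimmed.values())
    limit = int(delta * floor)
    return {a: obs[-limit:] for a, obs in trimmed.items()}
```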

Comparison with Periodic Retraining. In Figure 17, we compare the best retraining performance with our retraining detection and ground truth management methods to a naive approach: periodic retraining. We implement periodic retraining for periods of 100 to 500 intervals, with each retraining adding 30 new training data elements to the training dataset. We also choose δ = 2.0 for training data balance among all activities. The figure demonstrates that PBN with retraining detection achieves high accuracy for both subjects, while periodic retraining either incurs twice as many retraining instances for the same accuracy or, especially in the case of Subject 2, yields lower accuracy for fewer retraining instances.

We also show that PBN with retraining detection achieves similar accuracy to periodic retraining while incurring fewer retraining instances. In Figure 18, we present a timeline of runtime accuracy and retraining instances for the first 2500 classification intervals, comparing PBN to periodic retraining every 100 classification intervals. Both approaches have very volatile initial accuracy but eventually converge by 2000 intervals.

Figure 15: PBN retraining instances and runtime accuracy for new data window sizes N = 10-100.

The figures also demonstrate that by reducing the number of retraining instances, PBN retraining detection consequently reduces the amount of ground truth logging required of the user.

8.3 Sensor Selection

We now evaluate our sensor selection approach from Section 7 and demonstrate that we can exclude 30-40% of all sensors from AdaBoost training, yet achieve similar runtime accuracy as using no sensor selection. Removing this many sensors from training is a significant savings in computational overhead, since these sensors no longer have classifiers trained during each of the 300 AdaBoost training iterations.

Using 10 random observations per activity as initial training data and using online training, we compare each of the correlation metrics to determine which is best in terms of accuracy and sensors excluded. In Figure 19, we depict the percentage of total sensors excluded by sensor selection alone (SS Only), the percentage of total sensors excluded by sensor selection and AdaBoost training combined (SS + AdaBoost), and the average runtime accuracy for each sensor selection method.

Figure 19 indicates that both sensor pair raw correlation and decision correlation exclude 30-40% of all sensors from training. Both of these methods also have nearly the same accuracy as without sensor selection, indicating that the right sensors are excluded. While sensor cluster sensor selection excludes 50% of sensors from training for Subject 1, its accuracy is also worse. The figure also shows that in combination with AdaBoost training, each sensor selection method excludes more than 10 percentage points more sensors than with no sensor selection, resulting in additional energy savings since unused nodes are powered down.


Figure 16: PBN retraining instances and runtime accuracy for activity training data balance restriction δ.

Figure 17: Retraining instances and runtime accuracy without retraining, with retraining detection, and with periodic retraining.

Figure 18: Runtime timeline with accuracy and retraining instances for both subjects.

Figure 19: Sensor selection comparison with online training.

Figure 20: Power consumption evaluation.

We suggest that raw correlation is the best approach to sensor selection, as it requires the least computational overhead of the three metrics, yet excludes a significant number of sensors and maintains high runtime accuracy.

8.4 Application Performance

Table 3: PBN CPU, memory, and power benchmarks.

Mode                          CPU    Memory    Power
Idle (No PBN)                 <1%    4.30 MB   360.59 mW
Sampling (WiFi)               19%    8.16 MB   517.74 mW
Sampling (GPS)                21%    8.47 MB   711.74 mW
Sampling (WiFi) + Train       100%   9.48 MB   601.02 mW
Sampling (WiFi) + Classify    21%    9.45 MB   513.57 mW

During our experiments, the wireless motes had battery life measured in days (4 or 5 days of 8 hour sessions), while the Android HTC G1 phone was unable to last more than 8 hours without recharging. Here, we focus on evaluating the phone's performance, since its battery lifetime is much shorter. In Table 3, we measure the CPU, memory, and power consumption of our PBN application to demonstrate that it is practical for mobile phones. We run each configuration for 5 minutes and show the average for system idle, sampling only with GPS or WiFi for localization, sampling with AdaBoost training, and sampling with AdaBoost classification.

Since retraining occurs infrequently, during the vast majority of system deployment (Sampling + Classify), PBN incurs roughly 20% CPU use with memory overhead under 10 MB. Most of this overhead is due to the TinyOS Java libraries sending and receiving packets to and from the sensor motes; we leave further optimization of these libraries for mobile devices to future work.

When retraining does occur, it takes between 1 and 10 minutes on the HTC G1, depending on training data size. With newer hardware (HTC Nexus One), retraining time is halved under the same conditions. Since PBN retraining runs as a background process, it can be preempted and has little impact on the performance of other applications, such as checking email or making a phone call.

As depicted in Figure 20, we evaluate PBN power consumption with the display off using a power meter from Monsoon Technologies. In Table 3, we demonstrate that during sampling and classification, PBN consumes roughly 150 mW in addition to system idle. This consumption is about one third of the additional 450 mW consumed by the display when it is active. GPS-based localization consumes an additional 200 mW; however, GPS is enabled only when WiFi localization is not possible. An additional 90 mW is consumed during online training; however, as previously mentioned, these periods are short and infrequent.
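To put Table 3 in perspective, a back-of-the-envelope lifetime estimate follows; the 1150 mAh, 3.7 V battery figures are our assumption for the HTC G1, not values from the paper:

```python
# Rough battery-life estimate from the Table 3 power numbers (display off).
battery_mwh = 1150 * 3.7  # assumed HTC G1 battery: 1150 mAh at 3.7 V
for mode, mw in [("Idle (No PBN)", 360.59),
                 ("Sampling (WiFi) + Classify", 513.57)]:
    print(f"{mode}: {battery_mwh / mw:.1f} h")
# Idle: ~11.8 h; sampling + classification: ~8.3 h, consistent with the
# observed ~8 h of phone battery life during deployment.
```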


9 Conclusion and Future Work

In this paper, we present PBN, a significant effort towards a practical solution for daily activity recognition. PBN provides a user-friendly experience for the wearer as well as strong classification performance. Through the integration of Android and TinyOS, we provide a software and hardware support system which couples the sensing power of on-body wireless sensors with an easy to use mobile phone application. Our approach is computationally appropriate for mobile phones and wireless motes and chooses the sensors that maximize accuracy. We improve AdaBoost through online training and enhanced sensor selection, analyzing the properties of sensors and sensor data to identify helpful and redundant sensors as well as indicators for when retraining is needed. We show in our evaluation that PBN performs accurately when classifying a wide array of activities and postures, even when using a limited training dataset coupled with online training. In future work, we intend to conduct an extensive usability study with a diverse array of subjects as well as improve energy use on the phone.

10 References

[1] F. Albinali, S. Intille, W. Haskell, and M. Rosenberger. Using Wearable Activity Type Detection to Improve Physical Activity Energy Expenditure Estimation. In UbiComp ’10, pages 311–320. ACM, 2010.
[2] M. Azizyan, I. Constandache, and R. Choudhury. SurroundSense: Mobile Phone Localization via Ambience Fingerprinting. In MobiCom ’09, pages 261–272. ACM, 2009.
[3] O. Chipara, C. Lu, T. Bailey, and G.-C. Roman. Reliable Clinical Monitoring using Wireless Sensor Networks: Experience in a Step-down Hospital Unit. In SenSys ’10, pages 155–168. ACM, 2010.
[4] J. Cranshaw, E. Toch, J. Hong, A. Kittur, and N. Sadeh. Bridging the Gap Between Physical Location and Online Social Networks. In UbiComp ’10, pages 119–128. ACM, 2010.
[5] A. de Quincey. HTC Hero (MSM7201) USB Host Mode, 2010. http://adq.livejournal.com/95689.html.
[6] S. B. Eisenman, E. Miluzzo, N. D. Lane, R. A. Peterson, G.-S. Ahn, and A. T. Campbell. The BikeNet Mobile Sensing System for Cyclist Experience Mapping. In SenSys ’07, pages 87–101. ACM, 2007.
[7] J. Eriksson, L. Girod, B. Hull, R. Newton, S. Madden, and H. Balakrishnan. The Pothole Patrol: Using a Mobile Sensor Network for Road Surface Monitoring. In MobiSys ’08, pages 29–39. ACM, 2008.
[8] Y. Freund and R. Schapire. A Decision-Theoretic Generalization of On-line Learning and an Application to Boosting. JCSS, 37(3):297–336, 1997.
[9] R. K. Ganti, P. Jayachandran, T. Abdelzaher, and J. Stankovic. SATIRE: A Software Architecture for Smart AtTIRE. In MobiSys ’06, pages 110–123. ACM, 2006.
[10] H. He and E. Garcia. Learning from Imbalanced Data. IEEE TKDE, 21(9):1263–1284, 2009.
[11] M. Keally, G. Zhou, G. Xing, and J. Wu. Exploiting Sensing Diversity for Confident Sensing in Wireless Sensor Networks. In INFOCOM ’11, pages 1719–1727. IEEE, 2011.
[12] D. Kim, Y. Kim, D. Estrin, and M. Srivastava. SensLoc: Sensing Everyday Places and Paths using Less Energy. In SenSys ’10, pages 43–56. ACM, 2010.
[13] S. Kullback and R. Leibler. On Information and Sufficiency. Annals of Mathematical Statistics, 22(1):79–86, 1951.
[14] J. Lester, T. Choudhury, N. Kern, G. Borriello, and B. Hannaford. A Hybrid Discriminative/Generative Approach for Modeling Activities. In IJCAI, pages 766–772, 2005.
[15] K. Lorincz, B. Chen, G. Challen, A. Chowdhury, S. Patel, P. Bonato, and M. Welsh. Mercury: A Wearable Sensor Network Platform for High-Fidelity Motion Analysis. In SenSys ’09, pages 183–196. ACM, 2009.
[16] H. Lu, W. Pan, N. Lane, T. Choudhury, and A. Campbell. SoundSense: Scalable Sound Sensing for People-Centric Sensing Applications on Mobile Phones. In MobiSys ’09, pages 165–178. ACM, 2009.
[17] H. Lu, J. Yang, Z. Liu, N. Lane, T. Choudhury, and A. Campbell. The Jigsaw Continuous Sensing Engine for Mobile Phone Applications. In SenSys ’10, pages 71–84. ACM, 2010.
[18] E. Miluzzo, C. Cornelius, A. Ramaswamy, T. Choudhury, Z. Liu, and A. Campbell. Darwin Phones: The Evolution of Sensing and Inference on Mobile Phones. In MobiSys ’10, pages 5–20. ACM, 2010.
[19] E. Miluzzo, N. Lane, K. Fodor, R. Peterson, H. Lu, and M. Musolesi. Sensing Meets Mobile Social Networks: The Design, Implementation and Evaluation of the CenceMe Application. In SenSys ’08, pages 337–350. ACM, 2008.
[20] D. Peebles, T. Choudhury, H. Lu, N. Lane, and A. Campbell. Community-Guided Learning: Exploiting Mobile Sensor Users to Model Human Behavior. In AAAI, 2010.
[21] T. Pering, P. Zhang, R. Chaudhri, Y. Anokwa, and R. Want. The PSI Board: Realizing a Phone-Centric Body Sensor Network. In BSN, pages 26–28, 2007.
[22] M. Quwaider and S. Biswas. Body Posture Identification using Hidden Markov Model with a Wearable Sensor Network. In BodyNets ’08, pages 19:1–19:8. ICST, 2008.
[23] K. Rachuri, M. Musolesi, C. Mascolo, P. Rentfrow, C. Longworth, and A. Aucinas. EmotionSense: A Mobile Phones based Adaptive Platform for Experimental Social Psychology Research. In UbiComp ’10, pages 281–290. ACM, 2010.
[24] Z. Ren, G. Zhou, A. Pyles, M. Keally, W. Mao, and H. Wang. BodyT2: Throughput and Time Delay Performance Assurance for Heterogeneous BSNs. In INFOCOM ’11, pages 2750–2758. IEEE, 2011.
[25] J. Rodgers and W. Nicewander. Thirteen Ways to Look at the Correlation Coefficient. The American Statistician, 42(1):59–66, 1988.
[26] K. Srinivasan, M. Jain, J. Choi, T. Azim, E. Kim, P. Levis, and B. Krishnamachari. The k Factor: Inferring Protocol Performance Using Inter-Link Reception Correlation. In MobiCom ’10, pages 317–328. ACM, 2010.
[27] Y. Wang, J. Lin, M. Annavaram, Q. Jacobson, J. Hong, B. Krishnamachari, and N. Sadeh. A Framework of Energy Efficient Mobile Sensing for Automatic User State Recognition. In MobiSys ’09, pages 179–192. ACM, 2009.
[28] P. Zappi, C. Lombriser, T. Stiefmeier, E. Farella, D. Roggen, L. Benini, and G. Tröster. Activity Recognition from On-Body Sensors: Accuracy-Power Trade-Off by Dynamic Sensor Selection. In EWSN ’08, pages 17–33, 2008.
[29] R. Zhou, Y. Xiong, G. Xing, L. Sun, and J. Ma. ZiFi: Wireless LAN Discovery via ZigBee Interference Signatures. In MobiCom ’10, pages 49–60. ACM, 2010.
[30] Z.-H. Zhou, J. Wu, and W. Tang. Ensembling Neural Networks: Many Could Be Better Than All. Artificial Intelligence, 137:239–263, 2002.
[31] T. Zhu, Z. Zhong, T. He, and Z. Zhang. Exploring Link Correlation for Efficient Flooding in Wireless Sensor Networks. In NSDI ’10. USENIX, 2010.