
Detecting activity patterns on accelerometer data: a Deep Learning approach using Google Cloud

Kai-Tao Yang
Data Science Team at CentERdata
GDG DevFest NL
November 18th, 2017, Amsterdam, Netherlands

Contact information:
Email: ykaitao@hotmail.com
LinkedIn: https://www.linkedin.com/in/kaitaoyang/
GitHub: https://github.com/ykaitao
Website: www.dlapplied.com

Collaborators: Lennard Kuijten, Pradeep Kumar, Marcia den Uijl, Patricia Prüfer, Eric Balster

Outline

• What is the problem?
• Why is it important?
• How was the problem solved?
• What are the future directions?

Outline

• What is the problem?
  Ø Accelerometer device
  Ø Accelerometer data
  Ø Accelerometer data in the LISS panel
• Why is it important?
• How was the problem solved?
• What are the future directions?

Accelerometer device (GENEActiv)

More information: https://www.activinsights.com/products/geneactiv/

Accelerometer recording

[Figure: an accelerometer recording annotated with locations (Home, Eindhoven station, Tilburg Universiteit station, Office) and activities: walking, cycling, and sitting on the train.]

Accelerometer data in the LISS panel

• In 2013, 805 accelerometer recordings were collected from 805 LISS panel members. Each recording is 10 to 13 days long, at a sampling rate of 60 Hz (about 3 GB per recording).

• The LISS (Longitudinal Internet Studies for the Social sciences, maintained by CentERdata) panel consists of about 5,000 households, comprising 8,000 individuals, based on a true probability sample of households drawn from the population register by Statistics Netherlands.

Accelerometer data in the LISS panel (continued)

• On a monthly basis, LISS panel members complete online questionnaires, resulting in rich data such as gender, age, income, living conditions, education level, health status, political views, etc.

• All LISS panel data are publicly available, for research purposes only, through the LISS data archive: https://www.dataarchive.lissdata.nl/.

Outline

• What is the problem?
• Why is it important?
  Ø Goals of this study
  Ø Advantages of accelerometer measurement
• How was the problem solved?
• What are the future directions?

Importance of this study

• Goals of this study:
  • Detect activity patterns from the accelerometer data of the LISS panel.
  • Find the relationship between activity patterns and background variables (e.g., the jogging pattern and health status).

• Advantages of accelerometer measurement:
  • More objective than a questionnaire.
  • Less invasive of privacy, and less expensive, than video surveillance.

Outline

• What is the problem?
• Why is it important?
• How was the problem solved?
  Ø Approaches in the literature
    § Conventional approaches
    § Deep learning approaches
  Ø Our approach
• What are the future directions?

Approaches in the literature (conventional)

Figure adapted from: Wang et al., Deep Learning for Sensor-based Activity Recognition: A Survey, Pattern Recognition Letters, 2017.

Approaches in the literature (deep learning)

Figure adapted from: Wang et al., Deep Learning for Sensor-based Activity Recognition: A Survey, Pattern Recognition Letters, 2017.

CNN: Convolutional Neural Network
DNN: Deep Neural Network
DBN: Deep Belief Network
RNN: Recurrent Neural Network
LSTM: Long Short-Term Memory
SdA: Stacked de-noising Auto-encoder

Summary of deep learning approaches

[Table comparing deep learning approaches, including CNN.]

Table adapted from: Wang et al., Deep Learning for Sensor-based Activity Recognition: A Survey, Pattern Recognition Letters, 2017.

CNNs in the computer vision community

Name      | Year | Layers | Honor               | Achievements
LeNet     | 1990 | 5      |                     | The first successful application of CNNs (to read zip codes, digits, etc.).
AlexNet   | 2012 | 8      | Winner in ILSVRC    | The first work that popularized Convolutional Networks in Computer Vision (NVIDIA GTX 580 GPUs to reduce training time). Top-5 error of 16%, compared to the runner-up's 26%.
ZFNet     | 2013 | 8      | Winner in ILSVRC    | Top-5 error rate of 11.2%.
VGGNet    | 2014 | 19     | Runner-up in ILSVRC | Top-5 error rate of 7.5% for VGGNet-19.
GoogLeNet (Inception V2, V3, V4) | 2014 | 22 | Winner in ILSVRC | Top-5 error rates of 6.67%, 5.6%, 5.0% for V2, V3, V4, respectively [1].
ResNet    | 2015 | 152    | Winner in ILSVRC    | Top-5 error rate of 3.57% for ResNet-152.

ILSVRC = ImageNet Large Scale Visual Recognition Challenge

Outline

• What is the problem?
• Why is it important?
• How was the problem solved?
  Ø Approaches in the literature
  Ø Our approach
    § Overview
      o Preparing training data
      o Building models
      o Training models
      o Testing models
    § Details
• What are the future directions?

Our approach (overview)

• Six activities:
  • cycling,
  • driving car,
  • jogging,
  • sitting on the train,
  • sleeping,
  • walking.

• We adapted our model from VGGNet-19 [1].
• Trained 3 models (with segment lengths of about 5, 10, and 20 seconds) using the GPUs of the Dutch supercomputer.
• Tested the models on 805 LISS panel recordings using Google Cloud services.

[1] https://github.com/fchollet/keras/blob/master/keras/applications/vgg19.py
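To make the "thin and deep" adaptation concrete, the sketch below tracks how a 1D segment's length shrinks through a VGGNet-19-style stack: blocks of 2, 2, 4, 4, 4 same-padded convolutions with 64–512 filters, each block followed by pooling of 2. The block layout is modeled on VGGNet-19 itself; it is an illustration, not the exact architecture we trained.

```python
# Hypothetical sketch: how a 1D segment shrinks through a VGG-style
# stack ('same'-padded convs keep the length; each pool-of-2 halves it).
# Block/filter counts below are modeled on VGGNet-19, not our exact model.

def vgg1d_shapes(seg_len, blocks=({"convs": 2, "filters": 64},
                                  {"convs": 2, "filters": 128},
                                  {"convs": 4, "filters": 256},
                                  {"convs": 4, "filters": 512},
                                  {"convs": 4, "filters": 512})):
    """Return (length, channels) after each conv block + pooling step."""
    shapes = []
    length = seg_len
    for block in blocks:
        length //= 2                  # pooling of 2 halves the length
        shapes.append((length, block["filters"]))
    return shapes

print(vgg1d_shapes(320))
# A 320-sample segment (~5 s at 60 Hz) ends as a 10-step feature map.
```

The same function run with 640 or 1280 samples shows why one trained model was needed per segment length: the final feature-map size differs.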

Adapting our model from VGGNet-19

Training using the Dutch supercomputer (Cartesius)

Job.sh:

#!/bin/bash
#SBATCH -N 1                           # number of nodes
#SBATCH -t 120:00:00                   # expected execution time (e.g., 120 hours)
#SBATCH -p gpu                         # partition (e.g., gpu)
module unload mpi                      # unload module
module load mpi/openmpi/2.0.1-cuda80   # load modules
module load cuda/8.0.44
module load cudnn/8.0-v5.1
module load python/2.7.11
srun python acc_keras_vgg19_small.py relu 5 0.1   # execute Python application

Trained model (segment length = 320 samples)

Trained model (segment length = 640 samples)

Trained model (segment length = 1280 samples)

Testing using Google Cloud services

Outline

• What is the problem?
• Why is it important?
• How was the problem solved?
  Ø Approaches in the literature
  Ø Our approach
    § Overview
    § Details
      o Neuron and Neural Network
      o Convolutional Neural Network and VGGNet
      o Adapting VGGNet-19 for our application
      o Data pre-processing
• What are the future directions?

Neuron (linear)

[Diagram: a linear neuron — input x, weight k, bias b, and a summation node Σ producing output y.]

y = xk + b

Neural network (linear)

[Diagram: input layer x1, x2, x3 fully connected to output layer y1, y2, with weights w11 … w32 and biases b1, b2.]

y = xW + b

where x = [x1, x2, x3],

    | w11  w12 |
W = | w21  w22 |
    | w31  w32 |

b = [b1, b2], and y = [y1, y2].

Neural network (linear)

[Diagram: the same input-layer-to-output-layer network.]

y = xW + b, where x is 1-by-m, y is 1-by-n, W is m-by-n, and b is 1-by-n.
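A minimal numeric sketch of y = xW + b for the 3-input, 2-output network on the slide; the weight and bias values are made-up examples:

```python
# y = xW + b for a linear layer: x is 1-by-m, W is m-by-n, b is 1-by-n.
# The numeric values below are made-up examples, not trained weights.

def linear_layer(x, W, b):
    """Row vector x times matrix W, plus bias b."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

x = [1.0, 2.0, 3.0]        # x1, x2, x3
W = [[0.1, 0.2],           # w11, w12
     [0.3, 0.4],           # w21, w22
     [0.5, 0.6]]           # w31, w32
b = [0.5, -0.5]            # b1, b2

print(linear_layer(x, W, b))   # y1 ≈ 2.7, y2 ≈ 2.3
```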

Neural network (non-linear)

[Diagram: the same network with an activation function ƒ applied at each output.]

y = ƒ(xW + b)

Activation function: sigmoid

f(x) = 1 / (1 + e^(−x))

Neural network (non-linear, multi-layer)

[Diagram: input layer x → hidden layer h → output layer y, with weights Wih, Who and biases bh, bo.]

h = ƒ(xWih + bh)
y = ƒ(hWho + bo)
y = ƒ(ƒ(xWih + bh)Who + bo)
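The two-layer forward pass y = ƒ(ƒ(xWih + bh)Who + bo), run end to end with the sigmoid as ƒ; all weights here are made-up example values:

```python
import math

# Two-layer forward pass y = f(f(x·Wih + bh)·Who + bo) with sigmoid
# activation. All weight/bias values are made-up examples.

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def dense(x, W, b, f):
    """One fully connected layer: activation f applied per output unit."""
    return [f(sum(xi * wij for xi, wij in zip(x, col)) + bj)
            for col, bj in zip(zip(*W), b)]

x = [0.5, -1.0]                   # input layer
Wih = [[0.2, 0.4], [0.6, -0.8]]   # input-to-hidden weights
bh = [0.1, 0.0]                   # hidden biases
Who = [[1.0], [-1.0]]             # hidden-to-output weights
bo = [0.0]                        # output bias

h = dense(x, Wih, bh, sigmoid)    # hidden layer
y = dense(h, Who, bo, sigmoid)    # output layer
print(y)
```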

Vanishing gradients

What is a CNN?

Figure adapted from: https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/

Convolution (2D)

Figure adapted from: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
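The same sliding-window idea applies directly to 1D accelerometer signals. A minimal valid-mode 1D convolution, with an illustrative edge-detecting kernel (not a learned one):

```python
# Minimal 1D convolution (valid mode): slide the kernel over the signal
# and sum the element-wise products at each position. The kernel values
# here are illustrative, not learned.

def conv1d_valid(signal, kernel):
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

signal = [0, 1, 2, 3, 2, 1, 0]
kernel = [1, 0, -1]              # a simple edge-detecting kernel
print(conv1d_valid(signal, kernel))
# → [-2, -2, 0, 2, 2]
```

In a trained CNN the kernel values are the learned weights; here the response flips sign exactly where the signal changes direction.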

Pooling (2D)

Input window:
0.5  0.3
0.4  0.8

Max pooling → 0.8
Average pooling → 0.5
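The same 2-by-2 window, pooled both ways in code:

```python
# Max vs. average pooling over the 2-by-2 window from the slide.

def max_pool(window):
    return max(v for row in window for v in row)

def avg_pool(window):
    vals = [v for row in window for v in row]
    return sum(vals) / len(vals)

window = [[0.5, 0.3],
          [0.4, 0.8]]
print(max_pool(window))              # → 0.8
print(round(avg_pool(window), 6))    # → 0.5
```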

VGGNet


Our approach (detailed): kernels

Thin and deep.

Code adapted from: https://github.com/fchollet/keras/blob/master/keras/applications/vgg19.py

Our approach (detailed): activation function

Sigmoid in our model.

More options for activations: https://keras.io/activations/

Our approach (detailed): activation function

Softmax: σ(y_i) = e^(y_i) / Σ_j e^(y_j)

Sigmoid: σ(y_i) = 1 / (1 + e^(−y_i))

ReLU: σ(y_i) = y_i for y_i > 0; 0 otherwise

[Plots: each activation σ(y_i) and its derivative σ′(y_i). The ReLU derivative is 1 for y_i > 0; the sigmoid saturates between 0 and 1, and its derivative peaks at 0.25.]
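Why the sigmoid invites vanishing gradients: its derivative σ′(y) = σ(y)(1 − σ(y)) never exceeds 0.25, so every sigmoid layer a gradient passes back through can shrink it by a factor of 4 or more. A quick check:

```python
import math

# Sigmoid and its derivative: sigma'(y) = sigma(y) * (1 - sigma(y)).
# The derivative peaks at 0.25 (at y = 0), so stacked sigmoid layers
# multiply backpropagated gradients by factors <= 0.25 each.

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def sigmoid_grad(y):
    s = sigmoid(y)
    return s * (1.0 - s)

print(sigmoid_grad(0.0))   # → 0.25 (the maximum)
print(max(sigmoid_grad(y / 100.0) for y in range(-500, 501)))
```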

Vanishing gradients

y = ƒ(ƒ(xWih + bh)Who + bo)

Our approach (detailed): loss function

True probabilities (one-hot labels); columns: cycling, driving car, jogging, sitting on the train, sleeping, walking:

1  0  0  0  0  0
0  0  0  0  1  0
0  0  1  0  0  0
0  0  0  0  0  1
0  0  0  1  0  0
0  1  0  0  0  0

Predicted probabilities (same columns):

0.98    0.004    0.001    0.019   0.0016  0.0114
0.0019  0.095    0.0044   0.052   0.96    0.051
0.087   0.058    0.93     0.036   0.0034  0.0047
0.028   0.004    0.003    0.037   0.054   0.97
0.012   0.00084  0.0048   0.95    0.021   0.0093
0.003   0.94     0.00025  0.0089  0.0082  0.076

categorical_crossentropy: maximizing the values in the green cells (the predicted probabilities at each segment's true class).
binary_crossentropy: maximizing the values in the green cells, and minimizing the values in the yellow cells (all other entries).

More options for loss functions: https://keras.io/losses/
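For a segment with a one-hot label, categorical cross-entropy reduces to −log of the probability predicted for the true class. A pure-Python check using the first row of the tables (true class: cycling); this mirrors the idea behind Keras's categorical_crossentropy, not its implementation:

```python
import math

# Categorical cross-entropy for one segment. With a one-hot label it
# reduces to -log(p) at the true class, so maximizing the "green cell"
# probability is the same as minimizing the loss.
# Values taken from the slide's first row.

def categorical_crossentropy(y_true, y_pred):
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

# Columns: cycling, driving car, jogging, sitting on the train,
# sleeping, walking
y_true = [1, 0, 0, 0, 0, 0]                           # cycling
y_pred = [0.98, 0.004, 0.001, 0.019, 0.0016, 0.0114]

print(categorical_crossentropy(y_true, y_pred))       # -log(0.98) ≈ 0.02
```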

Our approach (detailed): data pre-processing, channel merging

√(x² + y² + z²) − 1
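A minimal sketch of this channel merging: the magnitude of the three axes minus 1, so a device at rest (measuring only the 1 g of gravity) maps to zero:

```python
import math

# Merge the three accelerometer channels into one: the magnitude
# sqrt(x^2 + y^2 + z^2) minus 1, which removes the constant 1 g
# of gravity from a stationary device.

def merge_channels(x, y, z):
    return math.sqrt(x * x + y * y + z * z) - 1.0

print(merge_channels(0.0, 0.0, 1.0))   # device at rest → 0.0
print(merge_channels(0.3, 0.4, 1.2))   # movement → non-zero
```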

Our approach (detailed): data pre-processing, streaming

while (not terminate):
    [X, Y] = get_batch(dir_name, class_names, seg_len=seg_len)
    loss = model.train_on_batch(X, Y)
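The get_batch above streams segments from disk, so the ~3 GB recordings never have to fit in memory at once. A hypothetical pure-Python sketch of the idea (the real get_batch and its file layout are not shown in the slides; here the "recordings" are in-memory lists):

```python
import random

# Hypothetical sketch of batch streaming: draw random fixed-length
# segments from per-class sequences instead of loading everything.
# The real get_batch reads recordings from disk.

def get_batch(recordings, class_names, seg_len, batch_size=4):
    """Return (X, Y): random segments and their one-hot labels."""
    X, Y = [], []
    for _ in range(batch_size):
        label = random.randrange(len(class_names))
        rec = recordings[class_names[label]]
        start = random.randrange(len(rec) - seg_len + 1)
        X.append(rec[start:start + seg_len])          # one random segment
        Y.append([1 if i == label else 0
                  for i in range(len(class_names))])  # one-hot label
    return X, Y

classes = ["walking", "cycling"]
data = {c: [random.random() for _ in range(1000)] for c in classes}
X, Y = get_batch(data, classes, seg_len=320)
print(len(X), len(X[0]), Y[0])
```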


Our approach (detailed): data pre-processing, random shuffling

Without random shuffling | With random shuffling

Red dots represent the data points selected in each batch.

Our approach (detailed): data pre-processing, augmentation

[Figure: an original walking segment and, after augmentation, a new walking segment.]
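The slides do not specify which transforms were used; a hedged sketch of two common choices for 1D segments — small Gaussian jitter plus a circular time shift — which leave the label ("walking") unchanged:

```python
import random

# Hypothetical augmentation for a 1D accelerometer segment: add small
# Gaussian jitter and apply a circular time shift. The label (e.g.,
# "walking") stays the same; the exact transforms used in the talk
# are not specified, these are common choices.

def augment(segment, noise_std=0.02, max_shift=30):
    shift = random.randrange(-max_shift, max_shift + 1)
    shifted = segment[shift:] + segment[:shift]       # circular time shift
    return [v + random.gauss(0.0, noise_std) for v in shifted]

segment = [0.1 * (i % 10) for i in range(320)]        # toy "walking" signal
augmented = augment(segment)
print(len(augmented))  # → 320 (length is preserved)
```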

Outline

• What is the problem?
• Why is it important?
• How was the problem solved?
• What are the future directions?
  Ø Get more labeled data
  Ø Model
  Ø Learning strategies
  Ø Applications

Future directions

• Get more labeled data
  • Publicly available data.
  • Crowd-sourcing: take advantage of the crowd to annotate the unlabeled activities.
• Model
• Learning strategies
• Applications

Future directions

• Get more labeled data
• Model
  • Implement other CNN models (e.g., GoogLeNet, ResNet, Xception, and MobileNet).
  • Light-weight deep models.
  • Fine-tune hyper-parameters (e.g., number of layers, size of kernels, activation functions, loss functions).
• Learning strategies
• Applications

Future directions

• Get more labeled data
• Model
• Learning strategies
  • Active learning.
  • Incremental learning.
• Applications

Future directions

• Get more labeled data
• Model
• Learning strategies
• Applications
  • Assistant: computing systems are aware of the activities of the user, such that they can proactively assist the user.

Future directions

• Get more labeled data
  • Publicly available data.
  • Crowd-sourcing: take advantage of the crowd to annotate the unlabeled activities.

• Model
  • Implement other CNN models (e.g., GoogLeNet, ResNet, Xception, and MobileNet).
  • Light-weight deep models.
  • Fine-tune hyper-parameters (e.g., number of layers, size of kernels, activation functions, loss functions).

• Learning strategies
  • Active learning.
  • Incremental learning.

• Applications
  • Assistant: computing systems are aware of the activities of the user, such that they can proactively assist users.

Thank You!

Take-home messages:
• CNNs (e.g., LeNet, AlexNet, VGGNet, GoogLeNet, ResNet, Xception, and MobileNet) designed for 2D data can easily be modified to process 1D data.
• LISS panel data (rich information, publicly available for research purposes).
• GPU computing; the Dutch supercomputer; Google Cloud services.
• Activation functions (ReLU, Softmax, Sigmoid); loss functions (categorical_crossentropy, binary_crossentropy).
• Data pre-processing (streaming, random shuffling, augmentation).

Contact information:
Email: ykaitao@hotmail.com
LinkedIn: https://www.linkedin.com/in/kaitaoyang/
GitHub: https://github.com/ykaitao
