Deep Neural Network Pruning for Efficient Edge Computing in IoT
Rih-Teng Wu¹, Ankush Singla², Mohammad R. Jahanshahi³, Elisa Bertino⁴
¹Ph.D. Student, Lyles School of Civil Engineering, Purdue University
²Ph.D. Student, Department of Computer Science, Purdue University
³Assistant Professor, Lyles School of Civil Engineering, Purdue University
⁴Professor, Department of Computer Science, Purdue University
March 20th, 2019
Motivation – Internet of Things
Figure adapted from: https://www.axis.com/blog/secure-insights/internet-of-things-reshaping-security/
Source: https://tinyurl.com/yagpsakm
Motivation – Current Inspection in SHM
Motivation – Deep Neural Networks
Deep Convolutional Neural Network for SHM
Ø Specialized Architecture?
§ Needs a lot of data
Ø Transfer Learning?
§ Not efficient for edge computing
Network Pruning – Inspiration from Biology
Figure adapted from Hong et al. (2013), "Decreased Functional Brain Connectivity in Adolescents with Internet Addiction."
Existing Pruning Algorithms
Ø Magnitudes of filter weights
Ø Magnitudes of activation values
Ø Mutual information between activations and predictions
Ø Regularization-based approaches
Ø Taylor-expansion-based approach
Molchanov et al. (2017), "Pruning Convolutional Neural Networks for Resource Efficient Inference", arXiv:1611.06440v2.
Network Pruning with Filter Importance Ranking
Original Network → Evaluate the Importance of Neurons/Filters → Remove the Least Important Neurons/Filters → Fine-tuning → More Pruning? (Yes: evaluate again; No: Stop Pruning)
Ø Find the least important filters based on Taylor expansion (Molchanov et al., 2017).
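The ranking step can be sketched in a few lines. This is a minimal NumPy illustration of the first-order Taylor criterion from Molchanov et al. (2017) — score each filter by the absolute mean of activation times gradient — not the authors' implementation; the function names and the toy activation/gradient arrays are assumptions for illustration.

```python
import numpy as np

def taylor_importance(activation, gradient):
    """Per-filter importance from the first-order Taylor criterion:
    |mean over batch and spatial positions of activation * gradient|,
    yielding one score per filter/channel.

    activation, gradient: arrays of shape (batch, channels, H, W).
    """
    contribution = activation * gradient           # elementwise a * dC/da
    per_sample = contribution.mean(axis=(2, 3))    # average over H, W
    return np.abs(per_sample.mean(axis=0))         # average over batch, then |.|

def least_important_filters(importance, n_prune):
    """Indices of the n_prune filters with the smallest scores."""
    return np.argsort(importance)[:n_prune]

# Toy example: 8 filters; filter 0's activations are zeroed out, so its
# Taylor score is exactly 0 and it is ranked least important.
rng = np.random.default_rng(0)
act = rng.normal(size=(4, 8, 5, 5))
grad = rng.normal(size=(4, 8, 5, 5))
act[:, 0] = 0.0

scores = taylor_importance(act, grad)
pruned = least_important_filters(scores, n_prune=2)
```

In the full algorithm this scoring is embedded in the loop above: score all filters, remove the lowest-ranked ones, fine-tune briefly, and repeat until the stopping condition is met.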
Crack and Corrosion Datasets
Non-crack (training: 25,313; testing: 4,467); Crack (training: 25,048; testing: 4,420)
Non-corrosion (training: 29,026; testing: 5,122); Corrosion (training: 28,083; testing: 4,956)
Computing Units
Ø Server device (TITAN X)
Ø Edge device (TX2)
Result – Transfer Learning without Pruning
Ø VGG16 (Simonyan and Zisserman, 2014)
*Inference time: the total time required to classify 3,720 image patches of size 224x224.
Simonyan and Zisserman (2014), "Very Deep Convolutional Networks for Large-Scale Image Recognition", arXiv:1409.1556v6.
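The transfer-learning idea — keep a pre-trained backbone frozen and train only a small classifier head on the target data — can be shown in miniature. This is a toy sketch, not the talk's pipeline: a fixed random linear projection stands in for the pre-trained VGG16 features, and `frozen_backbone`, `train_head`, and the synthetic two-class data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def frozen_backbone(x, W_frozen):
    """Stand-in feature extractor: its weights are never updated,
    mimicking a pre-trained backbone whose conv layers are frozen."""
    return x @ W_frozen

def train_head(feats, labels, lr=0.3, epochs=500):
    """Train only the small classifier head (logistic regression)
    with plain gradient descent."""
    w = np.zeros(feats.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(feats @ w)))          # sigmoid
        w -= lr * feats.T @ (p - labels) / len(labels)  # logistic gradient
    return w

# Toy target task: two classes separated along the first input axis.
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(float)

W_frozen = rng.normal(size=(16, 32)) / 4.0   # fixed "pre-trained" weights
F = frozen_backbone(X, W_frozen)
w_head = train_head(F, y)

pred = (F @ w_head > 0).astype(float)
accuracy = (pred == y).mean()
```

Only `w_head` is learned; the backbone weights `W_frozen` never change, which is what makes this approach viable with limited labeled data.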
Result – VGG16 with Pruning
Ø Pruning is conducted on the server device.
Ø Accuracy remains decent after pruning followed by fine-tuning.
(Figures: Crack and Corrosion)
Distribution of Pruned Convolution Kernels
Ø Early layers are pruned less, indicating the importance of low-level features.
Ø Similar numbers of pruned kernels are observed in the layers between the pooling layers.
Sensitivity Analysis – Number of Fine-tuning Epochs
Ø The accuracy is not sensitive to the number of fine-tuning epochs used in each pruning iteration.
(Figures: Crack and Corrosion)
Pruning Time Required on the Server
Ø When using only 1 fine-tuning epoch, the total pruning time is reduced to 1.5 hr, which is approximately 4.6 times faster than using 10 fine-tuning epochs.
(Figures: Crack and Corrosion)
Result – ResNet18 (He et al., 2015) with Pruning
Ø Pruning is conducted on the server device.
Ø Accuracy remains decent after pruning followed by fine-tuning.
Ø Pruning is sensitive to the network configuration.
(Figures: Crack and Corrosion)
Inference Time Required for Pruned VGG16
*Inference time: the total time required to classify 3,720 image patches of size 224x224.
Ø Server (TITAN X): 13.1 s is reduced to 4.0 s for crack data; 13.2 s is reduced to 3.7 s for corrosion data. Reduction factor: ≈3.5.
Ø Edge (TX2): 279.7 s is reduced to 31.6 s for crack data; 275.7 s is reduced to 30.6 s for corrosion data. Reduction factor: ≈9.
(Figures: Crack and Corrosion)
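The measurement protocol — total wall-clock time over the whole patch set, not per-image latency — can be sketched with a small timing harness. The stand-in "models" below are just matrix products of different widths, a loose proxy for the reduced filter count after pruning; the function names and sizes are assumptions for illustration.

```python
import time
import numpy as np

def total_inference_time(model_fn, patches, repeats=3):
    """Wall-clock time to classify the whole patch set (the slides
    report the total over 3,720 patches, not a per-image time).
    Best of several repeats reduces timer noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for p in patches:
            model_fn(p)
        best = min(best, time.perf_counter() - start)
    return best

# Stand-in models: the "pruned" network does a much smaller matrix
# product than the "original", mimicking the reduced filter count.
rng = np.random.default_rng(2)
W_orig = rng.normal(size=(2048, 2048))
W_pruned = rng.normal(size=(2048, 256))   # most "filters" removed

patches = [rng.normal(size=2048) for _ in range(50)]
t_orig = total_inference_time(lambda p: p @ W_orig, patches)
t_pruned = total_inference_time(lambda p: p @ W_pruned, patches)
reduction_factor = t_orig / t_pruned
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic, high-resolution clock suited to interval timing.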
Inference Time on Edge Device: VGG16 vs. ResNet18
*Inference time: the total time required to classify 3,720 image patches of size 224x224.
Ø Inference time:
§ VGG16: 279.7 s to 31.6 s; reduction factor: 8.9
§ ResNet18: 36.8 s to 8.9 s; reduction factor: 4.1
Ø Memory:
§ VGG16: 525 MB to 125 MB (80% reduction)
§ ResNet18: 44 MB to 2 MB (95% reduction)
(Figures: Crack and Corrosion)
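A back-of-the-envelope calculation shows where memory savings of this kind come from: model size is roughly parameter count times 4 bytes (float32), and removing a fraction of the filters in every conv layer shrinks both that layer's output channels and the next layer's input depth, so the savings compound. The channel widths below are the published VGG16 conv configuration; the uniform keep ratio and the omission of the fully connected layers are simplifying assumptions, so the numbers are illustrative, not the slides' measurements.

```python
# VGG16 conv-stack channel widths (Simonyan & Zisserman, 2014), 3x3 kernels.
VGG16_WIDTHS = [64, 64, 128, 128, 256, 256, 256,
                512, 512, 512, 512, 512, 512]

def conv_params(widths, in_channels=3, kernel=3):
    """Total conv parameters (weights + biases) for a plain conv stack."""
    total = 0
    c_in = in_channels
    for c_out in widths:
        total += c_in * c_out * kernel * kernel + c_out
        c_in = c_out
    return total

def pruned_conv_params(widths, keep=0.5, in_channels=3, kernel=3):
    """Same stack with a uniform fraction `keep` of filters retained
    in every layer (a simplification; real pruning is non-uniform)."""
    kept = [max(1, int(round(w * keep))) for w in widths]
    return conv_params(kept, in_channels, kernel)

full = conv_params(VGG16_WIDTHS)
half = pruned_conv_params(VGG16_WIDTHS, keep=0.5)
print(f"full conv stack:   {full * 4 / 1e6:.1f} MB")
print(f"50% filters kept:  {half * 4 / 1e6:.1f} MB")
```

Because each conv layer's parameter count scales with (input channels × output channels), keeping half the filters everywhere cuts conv parameters by roughly a factor of four, not two.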
Five-fold Cross Validation Test on VGG16
Ø The mean accuracy over the 5-fold cross validation test, conducted on the server, is reported.
Ø Network fine-tuning is necessary to enhance the accuracy.
(Figures: Crack and Corrosion)
Five-fold Cross Validation Test on VGG16 (Cont.)
Ø The variance in the accuracy after fine-tuning is very small. However, when 97% of the filters are pruned, the variance increases and the accuracy after fine-tuning drops.
Ø Pruning is stopped when the accuracy after fine-tuning drops by more than 3%.
(Figures: Crack and Corrosion)
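The evaluation protocol and the 3% stopping rule can be sketched together: split the data into 5 folds, train on 4 and test on 1, rotate, and report the mean accuracy; pruning halts once fine-tuned accuracy falls more than 3% below the baseline. The `evaluate` callback here is a hypothetical stand-in for "prune, fine-tune, and test"; only the bookkeeping is shown.

```python
import numpy as np

def five_fold_indices(n, seed=0):
    """Shuffle n sample indices and split them into 5 disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, 5)

def cross_val_accuracy(evaluate, n):
    """Mean and std of accuracy over the 5 train/test rotations.
    evaluate(train_idx, test_idx) is a stand-in for the real
    prune-fine-tune-test procedure."""
    folds = five_fold_indices(n)
    accs = []
    for k in range(5):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(5) if j != k])
        accs.append(evaluate(train_idx, test_idx))
    return float(np.mean(accs)), float(np.std(accs))

def should_stop(baseline_acc, pruned_acc, tolerance=0.03):
    """Stop pruning once fine-tuned accuracy drops by more than 3%."""
    return baseline_acc - pruned_acc > tolerance

# Toy stand-in evaluator that always reports 97% accuracy.
mean_acc, std_acc = cross_val_accuracy(lambda tr, te: 0.97, n=1000)
```

For example, `should_stop(0.97, 0.93)` triggers the stop (a 4% drop), while `should_stop(0.97, 0.95)` does not (a 2% drop is within tolerance).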
Summary
Ø Network pruning combined with transfer learning can achieve efficient inference when training data and computing power are limited.
Ø With network pruning, inference on the edge device is nine and four times faster than with the original VGG16 and ResNet18, respectively. The network size is reduced by 80% and 95% for the VGG16 and ResNet18 networks, respectively.
Ø Different network configurations exhibit different behaviors with respect to pruning.
Ø Sensitivity analysis shows that pruning can be performed with a smaller number of fine-tuning epochs without losing detection performance.
Ø The computation gain on the edge device is more prominent than the gain on the server device.
Thank you