Deep Neural Network Pruning for Efficient Edge Computing in IoT
Rih-Teng Wu¹, Ankush Singla², Mohammad R. Jahanshahi³, Elisa Bertino⁴
¹Ph.D. Student, Lyles School of Civil Engineering, Purdue University
²Ph.D. Student, Department of Computer Science, Purdue University
³Assistant Professor, Lyles School of Civil Engineering, Purdue University
⁴Professor, Department of Computer Science, Purdue University
March 20th, 2019
Motivation – Internet of Things
Figure adapted from: https://www.axis.com/blog/secure-insights/internet-of-things-reshaping-security/
Source: https://tinyurl.com/yagpsakm
Motivation – Current Inspection in SHM
Motivation – Deep Neural Networks
Deep Convolutional Neural Network for SHM
Ø Specialized Architecture?
§ Needs a lot of data
Ø Transfer Learning?
§ Not efficient for edge computing
Network Pruning – Inspiration from Biology
Figure adapted from Hong et al. (2013), "Decreased Functional Brain Connectivity in Adolescents with Internet Addiction."
Existing Pruning Algorithms
Ø Magnitudes of filter weights
Ø Magnitudes of activation values
Ø Mutual information between activations and predictions
Ø Regularization-based approaches
Ø Taylor-expansion-based approach
Molchanov et al. (2017), "Pruning Convolutional Neural Networks for Resource Efficient Inference", arXiv:1611.06440v2.
Network Pruning with Filter Importance Ranking
Original Network → Evaluate the Importance of Neurons/Filters → Remove the Least Important Neurons/Filters → Fine-tuning → More Pruning? (Yes: evaluate again; No: Stop Pruning)
Ø Find the least important filters based on Taylor expansion (Molchanov et al., 2017).
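The ranking step can be sketched in a few lines. This is a minimal NumPy illustration of the first-order Taylor criterion from Molchanov et al. (2017) — score each filter by the absolute mean of activation times gradient — not the authors' implementation; the function names and the toy activation/gradient arrays are assumptions for illustration.

```python
import numpy as np

def taylor_importance(activation, gradient):
    """Per-filter importance from the first-order Taylor criterion:
    |mean over batch and spatial positions of activation * gradient|,
    yielding one score per filter/channel.

    activation, gradient: arrays of shape (batch, channels, H, W).
    """
    contribution = activation * gradient           # elementwise a * dC/da
    per_sample = contribution.mean(axis=(2, 3))    # average over H, W
    return np.abs(per_sample.mean(axis=0))         # average over batch, then |.|

def least_important_filters(importance, n_prune):
    """Indices of the n_prune filters with the smallest scores."""
    return np.argsort(importance)[:n_prune]

# Toy example: 8 filters; filter 0's activations are zeroed out, so its
# Taylor score is exactly 0 and it is ranked least important.
rng = np.random.default_rng(0)
act = rng.normal(size=(4, 8, 5, 5))
grad = rng.normal(size=(4, 8, 5, 5))
act[:, 0] = 0.0

scores = taylor_importance(act, grad)
pruned = least_important_filters(scores, n_prune=2)
```

In the full algorithm this scoring is embedded in the loop above: score all filters, remove the lowest-ranked ones, fine-tune briefly, and repeat until the stopping condition is met.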
Crack and Corrosion Datasets
Non-crack (training: 25,313; testing: 4,467); Crack (training: 25,048; testing: 4,420)
Non-corrosion (training: 29,026; testing: 5,122); Corrosion (training: 28,083; testing: 4,956)
Computing Units
Ø Server device (TITAN X)
Ø Edge device (TX2)
Result – Transfer Learning without Pruning
Ø VGG16 (Simonyan and Zisserman, 2014)
*Inference time: the total time required to classify 3,720 image patches of size 224x224.
Simonyan and Zisserman (2014), "Very Deep Convolutional Networks for Large-Scale Image Recognition", arXiv:1409.1556v6.
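The transfer-learning idea — keep a pre-trained backbone frozen and train only a small classifier head on the target data — can be shown in miniature. This is a toy sketch, not the talk's pipeline: a fixed random linear projection stands in for the pre-trained VGG16 features, and `frozen_backbone`, `train_head`, and the synthetic two-class data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def frozen_backbone(x, W_frozen):
    """Stand-in feature extractor: its weights are never updated,
    mimicking a pre-trained backbone whose conv layers are frozen."""
    return x @ W_frozen

def train_head(feats, labels, lr=0.3, epochs=500):
    """Train only the small classifier head (logistic regression)
    with plain gradient descent."""
    w = np.zeros(feats.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(feats @ w)))          # sigmoid
        w -= lr * feats.T @ (p - labels) / len(labels)  # logistic gradient
    return w

# Toy target task: two classes separated along the first input axis.
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(float)

W_frozen = rng.normal(size=(16, 32)) / 4.0   # fixed "pre-trained" weights
F = frozen_backbone(X, W_frozen)
w_head = train_head(F, y)

pred = (F @ w_head > 0).astype(float)
accuracy = (pred == y).mean()
```

Only `w_head` is learned; the backbone weights `W_frozen` never change, which is what makes this approach viable with limited labeled data.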
Result – VGG16 with Pruning
Ø Pruning is conducted on the server device.
Ø Accuracy remains decent after pruning followed by fine-tuning.
(Figures: Crack and Corrosion)
Distribution of Pruned Convolution Kernels
Ø Early layers are pruned less, indicating the importance of low-level features.
Ø Similar numbers of pruned kernels are observed in the layers between the pooling layers.
Sensitivity Analysis – Number of Fine-tuning Epochs
Ø The accuracy is not sensitive to the number of fine-tuning epochs used in each pruning iteration.
(Figures: Crack and Corrosion)
Pruning Time Required on the Server
Ø When using only 1 fine-tuning epoch, the total pruning time is reduced to 1.5 hr, which is approximately 4.6 times faster than using 10 fine-tuning epochs.
(Figures: Crack and Corrosion)
Result – ResNet18 (He et al., 2015) with Pruning
Ø Pruning is conducted on the server device.
Ø Accuracy remains decent after pruning followed by fine-tuning.
Ø Pruning is sensitive to the network configuration.
(Figures: Crack and Corrosion)
Inference Time Required for Pruned VGG16
*Inference time: the total time required to classify 3,720 image patches of size 224x224.
Ø Server (TITAN X): 13.1 s is reduced to 4.0 s for crack data; 13.2 s is reduced to 3.7 s for corrosion data. Reduction factor: ≈3.5.
Ø Edge (TX2): 279.7 s is reduced to 31.6 s for crack data; 275.7 s is reduced to 30.6 s for corrosion data. Reduction factor: ≈9.
(Figures: Crack and Corrosion)
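The measurement protocol — total wall-clock time over the whole patch set, not per-image latency — can be sketched with a small timing harness. The stand-in "models" below are just matrix products of different widths, a loose proxy for the reduced filter count after pruning; the function names and sizes are assumptions for illustration.

```python
import time
import numpy as np

def total_inference_time(model_fn, patches, repeats=3):
    """Wall-clock time to classify the whole patch set (the slides
    report the total over 3,720 patches, not a per-image time).
    Best of several repeats reduces timer noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for p in patches:
            model_fn(p)
        best = min(best, time.perf_counter() - start)
    return best

# Stand-in models: the "pruned" network does a much smaller matrix
# product than the "original", mimicking the reduced filter count.
rng = np.random.default_rng(2)
W_orig = rng.normal(size=(2048, 2048))
W_pruned = rng.normal(size=(2048, 256))   # most "filters" removed

patches = [rng.normal(size=2048) for _ in range(50)]
t_orig = total_inference_time(lambda p: p @ W_orig, patches)
t_pruned = total_inference_time(lambda p: p @ W_pruned, patches)
reduction_factor = t_orig / t_pruned
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic, high-resolution clock suited to interval timing.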
Inference Time on Edge Device: VGG16 vs. ResNet18
*Inference time: the total time required to classify 3,720 image patches of size 224x224.
Ø Inference time:
§ VGG16: 279.7 s to 31.6 s; reduction factor: 8.9
§ ResNet18: 36.8 s to 8.9 s; reduction factor: 4.1
Ø Memory:
§ VGG16: 525 MB to 125 MB (80% reduction)
§ ResNet18: 44 MB to 2 MB (95% reduction)
(Figures: Crack and Corrosion)
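A back-of-the-envelope calculation shows where memory savings of this kind come from: model size is roughly parameter count times 4 bytes (float32), and removing a fraction of the filters in every conv layer shrinks both that layer's output channels and the next layer's input depth, so the savings compound. The channel widths below are the published VGG16 conv configuration; the uniform keep ratio and the omission of the fully connected layers are simplifying assumptions, so the numbers are illustrative, not the slides' measurements.

```python
# VGG16 conv-stack channel widths (Simonyan & Zisserman, 2014), 3x3 kernels.
VGG16_WIDTHS = [64, 64, 128, 128, 256, 256, 256,
                512, 512, 512, 512, 512, 512]

def conv_params(widths, in_channels=3, kernel=3):
    """Total conv parameters (weights + biases) for a plain conv stack."""
    total = 0
    c_in = in_channels
    for c_out in widths:
        total += c_in * c_out * kernel * kernel + c_out
        c_in = c_out
    return total

def pruned_conv_params(widths, keep=0.5, in_channels=3, kernel=3):
    """Same stack with a uniform fraction `keep` of filters retained
    in every layer (a simplification; real pruning is non-uniform)."""
    kept = [max(1, int(round(w * keep))) for w in widths]
    return conv_params(kept, in_channels, kernel)

full = conv_params(VGG16_WIDTHS)
half = pruned_conv_params(VGG16_WIDTHS, keep=0.5)
print(f"full conv stack:   {full * 4 / 1e6:.1f} MB")
print(f"50% filters kept:  {half * 4 / 1e6:.1f} MB")
```

Because each conv layer's parameter count scales with (input channels × output channels), keeping half the filters everywhere cuts conv parameters by roughly a factor of four, not two.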
Five-fold Cross Validation Test on VGG16
Ø The mean accuracy over the 5-fold cross validation test, conducted on the server, is reported.
Ø Network fine-tuning is necessary to enhance the accuracy.
(Figures: Crack and Corrosion)
Five-fold Cross Validation Test on VGG16 (Cont.)
Ø The variance in the accuracy after fine-tuning is very small. However, when 97% of the filters are pruned, the variance increases and the accuracy after fine-tuning drops.
Ø Pruning is stopped when the accuracy after fine-tuning drops by more than 3%.
(Figures: Crack and Corrosion)
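The evaluation protocol and the 3% stopping rule can be sketched together: split the data into 5 folds, train on 4 and test on 1, rotate, and report the mean accuracy; pruning halts once fine-tuned accuracy falls more than 3% below the baseline. The `evaluate` callback here is a hypothetical stand-in for "prune, fine-tune, and test"; only the bookkeeping is shown.

```python
import numpy as np

def five_fold_indices(n, seed=0):
    """Shuffle n sample indices and split them into 5 disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, 5)

def cross_val_accuracy(evaluate, n):
    """Mean and std of accuracy over the 5 train/test rotations.
    evaluate(train_idx, test_idx) is a stand-in for the real
    prune-fine-tune-test procedure."""
    folds = five_fold_indices(n)
    accs = []
    for k in range(5):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(5) if j != k])
        accs.append(evaluate(train_idx, test_idx))
    return float(np.mean(accs)), float(np.std(accs))

def should_stop(baseline_acc, pruned_acc, tolerance=0.03):
    """Stop pruning once fine-tuned accuracy drops by more than 3%."""
    return baseline_acc - pruned_acc > tolerance

# Toy stand-in evaluator that always reports 97% accuracy.
mean_acc, std_acc = cross_val_accuracy(lambda tr, te: 0.97, n=1000)
```

For example, `should_stop(0.97, 0.93)` triggers the stop (a 4% drop), while `should_stop(0.97, 0.95)` does not (a 2% drop is within tolerance).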
Summary
Ø Network pruning combined with transfer learning can achieve efficient inference when training data and computing power are limited.
Ø With network pruning, inference on the edge device is nine and four times faster than with the original VGG16 and ResNet18, respectively. The network size is reduced by 80% and 95% for the VGG16 and ResNet18 networks, respectively.
Ø Different network configurations exhibit different behaviors with respect to pruning.
Ø Sensitivity analysis shows that pruning can be performed with a smaller number of fine-tuning epochs without losing detection performance.
Ø The computation gain on the edge device is more prominent than the gain on the server device.
Thank you