Deep Learning Working Group Statement of Work

August, 2016

Industry Requirement

Deep learning is a rapidly evolving field in modern signal processing, with diverse applications including object detection, gesture tracking and advanced driver assistance, to name a few. Several institutes and companies have released their deep learning toolkits to the public for open use, such as Caffe, TensorFlow, Theano, Torch, Chainer and CNTK, mostly on the server side. Many modern client devices would also be capable of carrying out such computations, where applications often require deep learning algorithms to work in real time, processing data obtained from sensors on the fly.

Several hardware vendors have developed accelerated deep learning libraries for their client side products. While most of these companies develop their own APIs, they have to build proprietary data import interfaces to be able to leverage results achieved in the server side frameworks. To complicate the problem, the transfer of data structures from one server side framework to another is also not straightforward.

An open standard that defines a common data exchange mechanism would aid the development process from the viewpoint of both hardware vendors and framework/application developers in creating cross-platform deep learning applications.

Proposed Solution and Impact

Our proposal is to create an open, royalty-free, API independent standard file format for exchanging deep learning data. The standard would specify neural network structure and data exchange formats, as well as commonly used operations and their formal semantics, for cross-compatibility between frameworks and to facilitate deployment from frameworks to embedded systems. For example, models trained on a server system could be run on a mobile device, ensuring equivalent computation. The standard would allow hardware vendors to implement optimized and competitive deep learning inference engines.

Hardware vendors that comply with the standard and provide import functionality for the file format would not need to work on compatibility with individual frameworks, but could achieve compatibility with all frameworks through the standard. At the same time, framework developers that comply with the standard and provide export to the file format would ensure that their result is compatible with various embedded devices. Executing trained neural networks on different hardware platforms would become easier.

Design Philosophy

The Deep Learning working group may consider using and refining the following design goals and principles for the proposed standard:

• Keep the standard independent of frameworks and implementing APIs, ensure that the data format is portable across multiple implementing inference engines

• Keep the primary focus on enabling neural network inference on mobile and embedded devices

• Ensure that efficient execution of the neural networks described by the file format is possible on a wide variety of computing devices, including CPUs, GPUs, DSPs, FPGAs and dedicated accelerators

• Ensure that it is possible for one or more Khronos groups (such as OpenVX and OpenCL) to provide efficient conformant runtime implementations

• Ensure that efficient importing of the data format is possible on Khronos APIs

• Aim to design the standard to be future proof in the sense that it would possibly accommodate newly emerging neural network architectures with minimal extension

• Keep the scope of the first version focused on operations and algorithms required for state-of-the-art deep learning with convolutional neural networks, at the same time being open ended to support a wide range of neural architectures that are built from the same components (such as MLPs, RNNs, LSTMs, RBMs, Auto-Encoders, Encoder-Decoder architectures)

• Focus on the main use case of exporting neural networks trained in public deep learning frameworks and importing them to hardware vendor APIs, at the same time be open ended to let the standard be useful for data interchange among frameworks

• Ensure that the file format is prepared for varying precision requirements from the hardware vendors' side, and is prepared for vendor specific extensions

The Deep Learning working group may consider refining and extending the following aspects of an API independent Deep Learning standard:

• The use of a computational graph approach to characterize neural network structure

• Enabling the expression of operations at multiple granularities to facilitate the possibility of optimized execution on various hardware

• The formal specification of a set of standardized basic operations required for the expression of various deep learning algorithms

• A textual format that facilitates the description and interchange of computational graphs (neural network structure), using the above mentioned operations

• A set of standardized compound operations that are built from the above mentioned basic operations using the above mentioned textual description format, to facilitate the expression of computations at multiple granularities

• A multidimensional array data structure that takes varying precision requirements into account for interoperable data exchange

• A binary data format (file) using the above mentioned multidimensional array format for interchanging neural network data (weights)
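
To make the aspects listed above more concrete, the following is a minimal Python sketch of two of them: lowering a compound operation into basic operations within a computational graph, and packing a multidimensional array with a declared precision into bytes. Everything in it is an assumption made for illustration. The operation names (`matmul`, `add`, `relu`, `linear_relu`), the `Node` type and the header layout (bit width, rank, dimensions) are hypothetical and are not part of any format defined by the working group.

```python
import math
import struct
from dataclasses import dataclass

# Hypothetical basic operations; a real set would be defined by the standard.
BASIC_OPS = {"matmul", "add", "relu"}

# Hypothetical compound operation expressed as steps over basic operations.
# Each step is (operation, input names, output name).
COMPOUND_OPS = {
    "linear_relu": [
        ("matmul", ("input", "weights"), "t0"),
        ("add", ("t0", "bias"), "t1"),
        ("relu", ("t1",), "output"),
    ],
}


@dataclass
class Node:
    op: str
    inputs: tuple
    output: str


def expand(node):
    """Lower a compound node to basic nodes; basic nodes pass through."""
    if node.op in BASIC_OPS:
        return [node]
    # Bind the template's formal names to this node's actual tensor names.
    names = dict(zip(("input", "weights", "bias"), node.inputs))
    names["output"] = node.output

    def resolve(name):
        # Intermediate tensors get a name scoped by the node's output.
        return names.get(name, f"{node.output}/{name}")

    return [
        Node(op, tuple(resolve(i) for i in ins), resolve(out))
        for op, ins, out in COMPOUND_OPS[node.op]
    ]


# struct format codes for IEEE 754 half, single and double precision.
FLOAT_CODES = {16: "e", 32: "f", 64: "d"}


def pack_tensor(shape, values, bits=32):
    """Pack an array as: bit width, rank, dimensions, then little-endian data."""
    count = math.prod(shape)
    if count != len(values):
        raise ValueError("shape does not match number of values")
    header = struct.pack(f"<II{len(shape)}I", bits, len(shape), *shape)
    return header + struct.pack(f"<{count}{FLOAT_CODES[bits]}", *values)


def unpack_tensor(blob):
    """Inverse of pack_tensor; returns (shape, values, bits)."""
    bits, rank = struct.unpack_from("<II", blob, 0)
    shape = struct.unpack_from(f"<{rank}I", blob, 8)
    count = math.prod(shape)
    values = struct.unpack_from(f"<{count}{FLOAT_CODES[bits]}", blob, 8 + 4 * rank)
    return tuple(shape), list(values), bits
```

An inference engine that has a fused kernel could execute the `linear_relu` node directly, while one that only implements the basic operations could call `expand` first. This is one way to read the multiple granularities described above. Storing the bit width in the header lets an importer decide whether it needs to convert the weights to its own precision.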

Deliverables

The Deep Learning working group will produce the following deliverables:

• Specification of the API independent data file format and the formal specification of the operations contained within

• Conformance test for the standard
  o File format validation
  o Functional conformance that ensures implementability and output that matches the formal specification up to (possibly multiple) precision level(s) by multiple APIs
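
As an illustration of what functional conformance "up to a precision level" could mean in practice, the sketch below compares an implementation's output against reference values using a tolerance chosen per precision level. The level names and tolerance values are assumptions made for this example. Actual thresholds would be set by the conformance specification.

```python
import math

# Hypothetical tolerances per precision level (assumed values, not from the standard).
TOLERANCES = {"fp32": 1e-5, "fp16": 1e-2}


def conforms(reference, output, level):
    """Return True if every output value matches the reference within the
    relative and absolute tolerance assigned to the given precision level."""
    if len(reference) != len(output):
        return False
    tol = TOLERANCES[level]
    return all(
        math.isclose(r, o, rel_tol=tol, abs_tol=tol)
        for r, o in zip(reference, output)
    )
```

With these assumed tolerances, an output of `[1.001, 2.0]` against a reference of `[1.0, 2.0]` would pass at the `fp16` level but fail at the `fp32` level, so the same implementation can be conformant at one precision level and not at another.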

Industry Support

It is expected that hardware vendors adopt the standard to deliver efficient deep learning acceleration, and that companies use it for creating high-level algorithms, libraries and applications. Several companies have already expressed their interest, including AdasWorks, AMD, MediaTek, Cadence, Axis, Qualcomm, Intel, NXP and Movidius. We would expect strong interest among companies active in deep learning specific hardware design to join Khronos to help define this specification.

At the same time, support from framework vendors is also required. It is known to us that the utility of a standard to exchange neural network data among frameworks has already been considered by the deep learning community. It must be ensured that no work is duplicated in this area, so it is important to reach out to framework vendors to ask about their interest and for their input.

A Preliminary Call for Participation summarizing the approach and design principles in this document was circulated among deep learning framework developers to ask them for feedback. All major frameworks responded supporting the idea of creating a standardized exchange format for neural networks. The most important advice we received is that the format has to be general enough to support future network models and computations to avoid becoming obsolete, and that it has to support the description of operations on multiple levels of granularity in order to facilitate optimizations on various hardware.

It is expected that framework developers implement export functionalities to the standard exchange format, although their cooperation is not strictly required, since third party conversion tools may also be developed.

Khronos Infrastructure

This is expected to be a typically-organized, medium-sized working group, with between 10 and 20 members. Efficient communication and collaboration with the OpenVX Khronos group must be established in order to avoid duplication of work between the two groups. The OpenVX group would work on an API of an inference engine that can utilize the file formats defined by the Deep Learning group, and compatibility should be ensured.

Milestones

It is suggested that the Deep Learning working group refine the following milestone plan as discussions develop (starting after the SoW is accepted, expected to be finished by CVPR June 2017):

1. 6 months – Agree on the description methods used by the exchange format
   a. Format to describe network structure
   b. Format to describe network weights

2. 9 months – Definition of the API independent standard
   a. Specification of data formats, operations, file formats
   b. Definition of its relation to and compatibility with OpenVX
   c. Specification of conformance tests

3. 9 months – Implementation of the conformance tests
   a. Validation tests
   b. Functional conformance tests