Original Paper

Asthma Exacerbation Prediction and Interpretation based on Time-sensitive Attentive Neural Network: A Retrospective Cohort Study

1 Yang Xiang, Ph.D., [email protected]
1,2 Hangyu Ji, M.D., [email protected]
1 Yujia Zhou, M.B.B.S., [email protected]
1 Fang Li, Ph.D., [email protected]
1 Jingcheng Du, Ph.D., [email protected]
1 Laila Rasmy, M.S., [email protected]
1 Stephen Wu, Ph.D., [email protected]
1 Wenjin.Jim Zheng, Ph.D., [email protected]
1 Hua Xu, Ph.D., [email protected]
1 Degui Zhi, Ph.D., [email protected]
1 Yaoyun Zhang, Ph.D., [email protected]
1 Cui Tao*, Ph.D., [email protected]

1 School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, U.S.
2 Division of Gastroenterology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
*corresponding author

medRxiv preprint doi: https://doi.org/10.1101/19012161; this version posted November 29, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.


Abstract

Background: Asthma exacerbation is an acute or sub-acute episode of progressive worsening of asthma symptoms and can have significant impacts on patients' daily life. In 2016, 12.4 million current asthmatics (46.9%) in the U.S. had at least one asthma exacerbation in the previous year.

Objective: The objectives of this study were to predict the risk of asthma exacerbations and to explore potential risk factors involved in progressive asthma.

Methods: We proposed a time-sensitive attentive neural network to predict asthma exacerbation using clinical variables from electronic health records (EHRs). The clinical variables were collected from the Cerner Health Facts® database between 1992 and 2015 and cover 31,433 adult patients with asthma. Interpretations on both the patient level and the cohort level were investigated based on the model parameters.

Results: The proposed model obtains an AUC value of 0.7003 through 5-fold cross-validation, which outperforms the baseline methods. The results also demonstrate that the addition of elapsed time embeddings considerably improves the performance on this dataset. Through further analysis, we observed that risk factors behaved distinctly along the timeline and across patients. We also found supporting evidence from peer-reviewed literature for some possible cohort-level risk factors such as respiratory syndromes and esophageal reflux.

Conclusions: The proposed time-sensitive attentive neural network is superior to traditional machine learning methods and performs better than state-of-the-art deep learning methods in predicting asthma exacerbation. We believe that the interpretation and visualization of risk factors can help the clinical community to better understand the underlying mechanisms of the disease progression.

Keywords: asthma exacerbation; predictive model; time-sensitive; elapsed time embedding; deep learning; attention mechanism

Introduction

Asthma is a common and serious health problem that affects 235 million people worldwide [1] and an estimated 26.5 million people (8.3% of the U.S. population) in the U.S. [2]. Asthma takes a significant toll on the population and imposes an unacceptable burden on health care systems, with a total annual cost of $81.9 billion in 2013 in the U.S. [3,4]. Asthma may develop into exacerbation if it is not well controlled or if it is stimulated by specific risk factors [3]. In 2016, 12.4 million current asthmatics (46.9%) in the U.S. had at least one asthma exacerbation in the previous year [2]. Exacerbations of asthma can be severe and require immediate medical interventions, either as an emergency department visit or an admission to hospital [5]. Serious asthma exacerbations may even result in death [6]. Therefore, it is of practical significance to make early predictions so that interventions can be carried out in advance to reduce the probability of exacerbation.


A substantial body of work exists on prediction and risk factor recognition for asthma exacerbation, most of which adopts traditional statistical methods, such as logistic regression [7,8], proportional-hazards regression [9], and generalized linear mixed models [10]. However, most of these studies explored only a small group of candidate risk factors, and their models are usually hard to extend to other datasets and hard to use for personalized predictions. With the explosion of health care data in recent years, machine learning methods have grown in prominence for this domain, due to their superiority over statistical methods in processing larger numbers of variables and their capacity to mine more possible correlations between them [11]. Typical models include Naïve Bayes [13], Bayesian networks [12–14], artificial neural networks [12], Gaussian Process [12], and Support Vector Machines [12,13]. Although different attempts have been made, there are still several deficiencies in applying these traditional machine learning methods. For example, ignoring temporal dependencies between variables might not provide a meaningful risk estimation of future exacerbations for individual patients [15]. Furthermore, most approaches concentrate only on performance and pay little attention to personalized risk factors.

Recent studies on predictive modeling have focused more on deep learning, which has the upper hand in health care predictions because of its flexibility in dealing with longitudinal data [16], powerful learning capabilities [17], and ability to tackle the problem of data irregularity [18]. One of the most popular architectures for deep learning-based predictive models is the recurrent neural network (RNN), which takes a patient's visit sequence as the input and makes predictions according to the encoded representations. Multiple successes have been achieved in applying deep learning to disease prediction [19], mostly using variants of RNNs with distinct network components, such as attention mechanisms for evaluating the weight of each variable [20–22], and special configurations for tackling the problem of time decays [20,21,23,24]. Typical prediction tasks include the prediction of diabetes mellitus [18], Parkinson [18,25], chronic heart failure [21], sepsis [26], mortality and readmission [20].

Inspired by previous studies, we applied Long Short-term Memory (LSTM), a popular RNN variant used by dozens of previous studies [19,20,23,27], for asthma exacerbation prediction. We proposed the Time-Sensitive Attentive Neural Network (TSANN), which employs a self-attention mechanism [28] to help model the context of both visit-level and code-level variables. Meanwhile, to incorporate the impact of elapsed time, we projected the visit time of each clinical variable into a low-dimensional space and assigned a numeric vector to each time. Making use of the attention weights of TSANN, data analysis was then conducted to investigate personalized and cohort-level risk factors.

As far as we know, this is the first data-driven study to predict asthma exacerbation using deep learning and EHR data, the first effort to introduce elapsed time embeddings into clinical predictive modeling, and the first attempt to visualize risk factors of asthma exacerbation on both the individual level and the cohort level, which have been insufficiently explored in most previous studies. We believe that the outcomes of this study can help the clinical community to better understand the underlying mechanisms of the disease progression and to assist in decision-making. Although this project focuses on asthma exacerbation, the proposed approach can also be adopted in risk prediction for other chronic diseases.

Methods

Database

This study used Cerner Health Facts®, a HIPAA-compliant database collected from multiple enrolled clinical facilities, containing mostly in-patient data. Data in Health Facts were extracted directly from the EHRs from hospitals with which Cerner has a data use agreement. Encounters may include the pharmacy, clinical and microbiology laboratory, admission, and billing information from affiliated patient care locations. All personal identifying information of the patients was anonymized. In our study, we primarily focused on the impact of clinical factors on asthma exacerbation, so we extracted diagnosis, medications, and demographic characteristics such as gender, race, and age from the database. The University of Texas Health Science Center (UTHealth) had agreements with Cerner to use this data for research purposes. The institutional review board at UTHealth approved the study protocol.

Study Design

We conducted a retrospective study to predict the risk of asthma exacerbation. We extracted patients' records between 1992 and 2015 from the Cerner database. For clarity, we define several terms in advance (Table 1).

Table 1. Defined terms for asthma exacerbation prediction.

| Term | Definition |
| --- | --- |
| index date | the date of the first diagnosis of asthma in a patient's EHR |
| exacerbation date | the date of the first diagnosis of asthma exacerbation after the index date |
| case group | patients with asthma and later asthma exacerbations within 365 days who satisfy the inclusion and exclusion criteria (see Multimedia Appendix 1 for more details) |
| control group | patients with asthma but without exacerbations within 365 days who satisfy the inclusion and exclusion criteria (see Multimedia Appendix 1 for more details) |
| prediction date | training set: for the case group, the visit date before the exacerbation date; for the control group, the penultimate visit date within 365 days. testing set A: the 5th visit starting from the index date. testing set B: defined analogously to the training set (following [21]) |
| observed time window | the time window between the index date and the prediction date |


Intuitively, a testing sample should be defined in the same way as the training samples (as in testing set B). However, since we cannot foresee when the exacerbation would happen in real-world deployment, we can only make future predictions at each visit. In our study, we selected the 5th visit from the asthma index date as the prediction date (testing set A), balancing the use of more visits against keeping more patients for experiments (see Multimedia Appendix 1 for more details). We also defined testing set B, with the penultimate visit as the prediction date, which serves as the upper bound of the classifier performance since it provides more complete visit information.

The TSANN model was trained to predict the onset of asthma exacerbation given the observed time window. The main outcomes of the method are: (1) a score that measures the risk of asthma exacerbation for each patient; (2) visualization of the results including a personalized heatmap identifying the importance of each clinical variable in the observed time window, cohort-level risk factors and their temporal distributions among patients. Based on the outcomes, further data mining or clinical trials can be carried out for validation. The workflow of this research is shown in Figure 1.

Figure 1. The workflow of risk prediction of asthma exacerbation.

Selection of Study Subjects

The subjects in the study were patients with a diagnosis of asthma. Inclusion and exclusion criteria were decided based on previous work [29,30] including the diagnosis of asthma and asthma exacerbation. The current study only focused on adult patients with age between 18 and 80. In the end, 31,433 individuals remained, including 2,262 cases and 29,171 controls (≈1:13). The cohort selection process is shown in Figure 2. More details for the cohort selection are shown in the Multimedia Appendix 1.

Time-sensitive Attention Neural Network

Model Overview

TSANN accepts the whole sequence of clinical variables in the observed time window as inputs, and outputs the probability of asthma exacerbation (see Figure 3). The architecture of TSANN is based on LSTM with the addition of hierarchical attention and elapsed time embeddings.

Figure 2. The cohort selection process for the study of asthma exacerbation.

For each visit, multiple clinical variables are encoded in the input layer and combined into a weighted average through the code-level attention mechanism. The elapsed time embedding is concatenated as complementary information to indicate the relative time interval between each visit date and the prediction date. LSTM accepts the sequence of encoded visits as inputs and outputs further encodings for each visit. The visit-level attention layer is then applied on the outputs of LSTM to summarize all the visits for each patient. Finally, by feeding the output of visit-level attention into the softmax function, a probability indicating the risk of disease onset is generated.

Figure 3. The overview of the TSANN model for asthma exacerbation prediction.


Input

The inputs of the model consist of two types of features. One type is clinical variables including ICD codes, medications, and demographic features. The ICD-10 codes are all converted into ICD-9 based on predefined mappings [31]. All the medications are normalized to their generic names. The demographic features include age, gender, and race, which are only taken as inputs on the visit of the prediction date. Using a projection matrix $W_c^{emb} \in \mathbb{R}^{m_c \times n_c}$, we mapped each clinical variable into a concept embedding vector:

$$C_{ij} = W_c^{emb} \cdot x_{ij} \tag{1}$$

where $C_{ij}$ is the generated concept embedding vector and $x_{ij} \in \mathbb{R}^{n_c}$ is the one-hot vector denoting the existence of each variable.

The other feature type is time features, which indicate the occurrence time for each clinical variable. Intuitively, variables with different timestamps would behave differently in prediction. For instance, in many cases, a clinical event that happened several days ago would play a more important role than one that happened several months ago. Meanwhile, because EHR data are irregular and incomplete, successive visits have diverse time intervals [23], which makes it indispensable to consider the elapsed time when doing predictive modeling. Inspired by the idea of position embeddings in natural language processing, which were introduced to model the positional information for each word in, e.g., relation classification [32] and the transformer structure in neural language modeling [28,33,34], we introduced elapsed time embeddings to represent the relative time gap for each clinical variable. Specifically, taking the time of the prediction date $T_0$ as a pivot, the time feature of each variable is the absolute difference between its occurrence time $T_i$ and $T_0$, i.e. $T_0 - T_i$. Since the observed time window has an upper bound of 365 days, the vocabulary size of the time embeddings was set to be 365. We applied a matrix $W_t^{emb} \in \mathbb{R}^{m_t \times n_t}$ to project each time value to an $m_t$-dimensional vector. Unlike the code embeddings, elapsed time embeddings are fed into the model after the code-level attention and assigned to each visit. The equation to get the elapsed time embedding for each visit is analogous to that for concept embeddings, where $t_{ij} \in \mathbb{R}^{n_t}$:

$$T_{ij} = W_t^{emb} \cdot t_{ij} \tag{2}$$

For easier description, we denote each visit as $v_i, i \in \{1,2,\dots,T\}$ and each clinical variable in each visit as $v_{ij}, j \in \{1,2,\dots,M\}$, where $T$ is the maximum number of visits and $M$ is the maximum number of events in each visit.
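Because the inputs to Equations (1) and (2) are one-hot vectors, each matrix product reduces to a table lookup. The following pure-Python sketch illustrates that idea with toy sizes; the table names, the random initialization, and the clamping of a 365-day gap into the last slot are assumptions made for illustration and are not taken from the paper.

```python
import random
from datetime import date

random.seed(0)

N_CODES, CODE_DIM = 6, 4    # toy sizes for illustration
N_DAYS, TIME_DIM = 365, 3   # 365 elapsed-time slots, as stated in the text

# Tables are stored one row per entry, i.e. the transpose of the projection
# matrices in Eqs. (1)-(2). Values are random placeholders, not trained weights.
code_table = [[random.gauss(0, 0.1) for _ in range(CODE_DIM)] for _ in range(N_CODES)]
time_table = [[random.gauss(0, 0.1) for _ in range(TIME_DIM)] for _ in range(N_DAYS)]


def embed_code(code_index):
    """Multiplying by a one-hot vector selects one row of the table."""
    return code_table[code_index]


def elapsed_index(visit_date, prediction_date):
    """Elapsed days between a visit and the prediction date.

    Clamping to the last slot is an assumption so that a gap of exactly
    365 days still fits a 365-entry vocabulary.
    """
    gap = abs((prediction_date - visit_date).days)
    return min(gap, N_DAYS - 1)


def embed_time(visit_date, prediction_date):
    """Elapsed time embedding for one visit."""
    return time_table[elapsed_index(visit_date, prediction_date)]


def visit_input(pooled_codes, visit_date, prediction_date):
    """Concatenate a pooled code vector with the visit's time embedding."""
    return list(pooled_codes) + embed_time(visit_date, prediction_date)
```

In this sketch the time embedding is appended once per visit, after the codes of that visit have been pooled, which mirrors the order described in the text.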

Code-level Attention

Attention is a mechanism specifically designed for deep neural networks that acts as an information filter and can alleviate information loss when dealing with long sequences. It selects important sequence spans by assigning weights to different elements in a sequence [35,36]. Through attention, each variable is assigned a weight so that important variables have larger weights than the others. We adopted the attention mechanism from [37], in which the weight of each variable is generated according to the sequence and a context vector. Concretely, given the set of codes $\{v_{ij}\}, j \in \{1,2,\dots,M\}$ in the $i$th visit, the encoded representation of $v_i$ can be generated by:

$$u_{ij} = \tanh(W_v v_{ij} + b_v) \tag{3}$$

$$\beta_{ij} = \frac{\exp(u_{ij}^{T} u_v)}{\sum_{k} \exp(u_{ik}^{T} u_v)} \tag{4}$$

$$v_i = \sum_{j} \beta_{ij} \cdot v_{ij} \tag{5}$$

where $W_v$ and $b_v$ are the weight and bias for matrix transformation, $u_{ij}$ is the attention vector for each code $j$ in $v_i$, $u_v$ is the context vector for $v_i$ and is updated during training, and $\beta_{ij}$ is the attention weight for the event $v_{ij}$, based on which we can generate the final weight for this variable.

Visit-level Attentive LSTM Layer

Taking the encoded representation of each visit as input, LSTM models the sequential information in the observed time window and gets the summarization at the final step (the prediction date). The advantage of LSTMs over basic RNNs is that they can alleviate the vanishing gradient problem, and are thus able to retain "memories" of prior timestamps [38,39]. LSTMs are implemented by several matrix multiplications and nonlinear transformations that aim to mimic the memory mechanism of human brains. These operations are called gates, signifying that the network can select effective information and abandon useless information. The equations of LSTMs are listed as follows:

$$f_t = \sigma(W_f \cdot [h_{t-1}, v_t] + b_f) \tag{6}$$

$$i_t = \sigma(W_i \cdot [h_{t-1}, v_t] + b_i) \tag{7}$$

$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, v_t] + b_c) \tag{8}$$

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{9}$$

$$o_t = \sigma(W_o \cdot [h_{t-1}, v_t] + b_o) \tag{10}$$

$$h_t = o_t * \tanh(C_t) \tag{11}$$

where the $W$s and $b$s are weights and biases for different gates or cells ($f_t$: forget gate, $i_t$: input gate, $C_t$: memory cell, $o_t$: output gate, $h_t$: hidden cell), and $\sigma$ is the activation function such as tanh or sigmoid.

We applied self-attention again on the visit level to measure the risk scores for each clinical variable in each visit. By assigning attention weights to the outputs of LSTM from each step, we can weight each visit in the observed time window.

$$u_i = \tanh(W_p v_i + b_p) \tag{12}$$

$$\alpha_i = \frac{\exp(u_i^{T} u_p)}{\sum_{k} \exp(u_k^{T} u_p)} \tag{13}$$

$$r_p = \sum_{j} \alpha_j \cdot v_j \tag{14}$$

where $W_p$ and $b_p$ are the weight and bias for matrix transformation, $u_i$ is the attention vector for each visit $i$ given $v_i$, $u_p$ is the context vector, and $\alpha_j$ is the attention weight for $v_j$. This process can be seen as a simulation of the diagnosis procedure of a clinic visit, during which a physician would look back into a patient's EHR, measure the impacts of each historical clinical event and make a final decision.
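The code-level and visit-level attention layers share one pattern: a tanh projection, a dot product with a context vector, a softmax over the sequence, and a weighted sum of the original vectors. The sketch below shows that pattern in plain Python. It is a minimal illustration, not the authors' TensorFlow implementation; the function names are hypothetical, and the max-subtraction inside the softmax is a standard numerical-stability step added here rather than something stated in the paper.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def attention_pool(vectors, W, b, context):
    """Pool a list of equally sized vectors with context-vector attention.

    Returns (pooled_vector, weights); the weights are non-negative and sum to 1.
    """
    # Attention vectors: u_k = tanh(W v_k + b)
    hidden = [[math.tanh(z + bias) for z, bias in zip(matvec(W, v), b)]
              for v in vectors]
    # Relevance scores: u_k^T context
    scores = [sum(u * c for u, c in zip(h, context)) for h in hidden]
    # Softmax over the sequence (subtracting the max avoids overflow)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the original (unprojected) vectors
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights
```

Calling `attention_pool` on the code embeddings of one visit would play the role of the code-level layer, and calling it on the per-visit vectors would play the role of the visit-level layer, each with its own `W`, `b`, and `context`.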

Output

The visit-level attention layer compresses all the information in the observed time window into a fixed-size vector. The output of attention further goes through a fully connected layer with a nonlinear activation. A softmax function is finally applied to generate the predicted probability $p$:

$$p = \mathrm{softmax}(W_c \cdot r_p + b_c) \tag{15}$$

where $r_p$ stands for the output of the visit-level attention-LSTM. The value $p$ is used as the score for the risk of developing an asthma exacerbation.

Evaluation

AUROC/AUC (Area Under the Receiver Operating Characteristic Curve) is widely used as an evaluation metric for predictive models and reflects a balance between sensitivity and specificity [40]. According to the predicted probability p (between 0 and 1) for each instance, the AUC value is generated by setting different cut-offs. The methods listed in Table 2 were compared in our experiments.

Table 2. The methods used for comparisons.

| Method | Note |
| --- | --- |
| LR-sparse | A popular conventional machine learning algorithm [41] that usually behaves as a strong baseline in predictive modeling [42]. The input of LR-sparse for each sample is a fixed-length feature vector, the length of which is the number of distinct variables (the vocabulary size); the value of each dimension is the number of occurrences of the corresponding variable. LR suffered from the data imbalance problem on this dataset, so we employed SMOTE [43] to do over-sampling and help reduce its impact. |
| LR-dense | A simplified version of Multi-Layer Perceptron [44] with only one input layer and one softmax layer. In LR-dense, the representations of all the codes were averaged after being projected to the embedding space. The differences between LR-dense and LR-sparse are two-fold: a) the input of LR-dense is the average of code embeddings while the input of LR-sparse is a fixed-length vector denoting the occurrence of each clinical variable; b) the input embeddings of LR-dense can be fine-tuned during training but the input of LR-sparse cannot. |
| LSTM | The basic LSTM algorithm, taking the sequence of the clinical variables as input ordered by time; the variables in each visit are averaged. |
| ALSTM | Attention LSTM with one layer of LSTM and one layer of attention. |
| TLSTM [23] | The time-aware LSTM model, which is one of the state-of-the-art predictive models. In TLSTM, the time gap is used to compute the information decay in the LSTM unit. |
| RETAIN [21] | A two-layer attention model, which is another state-of-the-art model for disease onset prediction. In RETAIN, the time features are not embedded as vectors, but are real values denoting the gaps from the first visit. |
| TSANN-I | The proposed TSANN model but with the second attention layer removed; the prediction is based on the final state of LSTM. |
| TSANN-I-step | TSANN-I with the time encoding method from [34]. Although time is also encoded as a vector in this method, the vector only reflects the order of each visit, not the actual elapsed time, which is insufficient for modeling the irregularity in EHRs. |
| TSANN-II | A complete version of the proposed TSANN model. |

For evaluation, we first split the data into a training set and a held-out testing set with a ratio of 8:2. Further, 5-fold cross-validation was performed on the training dataset for parameter tuning. During cross-validation, grid search was applied to tune the hyperparameters. Finally, the hyperparameters for our best model TSANN-I were batch size = 32, code embeddings dimension = 100, time embeddings dimension = 20, learning rate = 0.001, l2 penalty = 0.0001 for all layers, Leaky_ReLU [45] as the activation function for LSTM, adding batch normalization before softmax, and Adam [46] as the optimizer. A more detailed parameter tuning process is shown in the Multimedia Appendix 1. Codes for RETAIN and TLSTM were provided by the respective authors, and all other deep learning models were implemented with TensorFlow [47] and tested on Nvidia Tesla V100, Quadro P6000 and Titan XP GPUs.
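Sweeping every cut-off over the predicted probabilities and integrating the resulting ROC curve is equivalent to counting how often a positive instance is scored above a negative one. The sketch below computes the AUC that way. It is an illustrative helper (the name `auc` and the quadratic pairwise loop are choices made here for readability), not the evaluation code used in the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1) receives
    a higher score than a randomly chosen negative (label 0); ties count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Because the statistic depends only on the ranking of scores, it is unaffected by the roughly 1:13 case-to-control ratio in the way that raw accuracy would be.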

Results

AUC Values

Since the time information is of critical importance in modeling EHR data, we conducted experiments on situations both with and without it. We did not implement LR-sparse with time embeddings associated with day as the time unit since it would have introduced a greater number of variables (i.e. 12,390*365), which would have been too sparse and difficult for computation. Instead, we reduced the vocabulary of the time variable by setting month as the time unit and finally 148,680 distinct clinical variables were generated. For TLSTM, we only considered a with-time version since it is defined as a time-aware variant of LSTM. For LR-dense, LSTM, ALSTM, TSANN-I and TSANN-II, we used the elapsed time embeddings introduced in this study. The AUC values on the testing set for all the methods are shown in Table 3.
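The vocabulary sizes quoted for LR-sparse follow from crossing each clinical variable with a time bucket. The sketch below shows one way such a count vector could be indexed; the 30-day month bucket, the index layout, and the function names are assumptions for illustration, since the text only states that month was used as the time unit.

```python
N_CODES = 12390   # distinct clinical variables, from the text
MONTHS = 12       # month-level buckets inside the 365-day window


def feature_index(code_index, days_before_prediction):
    """Map (variable, elapsed time) to one slot of the sparse feature vector."""
    # 30-day buckets are an assumption; days 360 and beyond fall into the last bucket.
    month = min(days_before_prediction // 30, MONTHS - 1)
    return code_index * MONTHS + month


def count_vector(events):
    """Build a sparse {slot: count} vector from (code_index, days_before) pairs."""
    counts = {}
    for code_index, days in events:
        slot = feature_index(code_index, days)
        counts[slot] = counts.get(slot, 0) + 1
    return counts


vocabulary_by_month = N_CODES * MONTHS   # 148,680 slots, matching the text
vocabulary_by_day = N_CODES * 365        # the day-level alternative that was avoided
```

With day-level buckets the same layout would need about 4.5 million slots, which illustrates why the coarser month unit was preferred for the sparse model.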


Table 3. AUC values by the proposed models compared with baselines. (+/- stands for the improvement of adding time info).

| Method | Without time | With time | +/- (%) |
| --- | --- | --- | --- |
| LR-sparse | 0.5685 | 0.5825 | +1.4 |
| LR-dense | 0.6545 | 0.6753 | +2.08 |
| LSTM | 0.6045 | 0.6567 | **+5.22** |
| ALSTM | 0.6346 | 0.6714 | +3.68 |
| TLSTM | - | 0.6548 | - |
| RETAIN | 0.6455 | 0.6882 | +4.27 |
| TSANN-I | 0.6692 | **0.7003** | +3.11 |
| TSANN-I-step | 0.6463 | - | - |
| TSANN-II | **0.6827** | 0.6855 | +0.28 |

*The optimal value for each column is marked in bold.

Comparing across rows, we notice that TSANN-I with time information achieves the optimal AUC value, improving on the strongest baseline (RETAIN) by 1.21%. TSANN-II gets comparable performance with RETAIN. All the results show that the conventional machine learning method LR-sparse behaves worse than the deep learning methods. We also notice that LR-dense performs better than LSTM and ALSTM both with and without time. TSANN-I-step, which only used time embeddings to denote the relative position of each visit, does not get a good result.

When comparing the results with and without time information, considerable improvements were observed after adding time information on most methods, signifying that the problem of data irregularity is obvious in the studied problem and that the time information plays an important role in modeling visits for predicting asthma exacerbation. For example, TSANN-I obtained a 3.11% improvement after the time information was integrated. Similar cases can also be observed in other models including RETAIN. Surprisingly, TSANN-II, when integrating time embeddings, did not get much improvement. In addition, even without time information, our proposed methods perform much better than the others, showing that they have strengths even in cases such as the loss of temporal information. Considerable improvements can also be observed on testing set B (RETAIN: 0.7761, TSANN-I: 0.8202), as well as the contribution of adding the time information. As expected, the general results were much better on testing set B (see Multimedia Appendix 1).
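The percentages quoted for Table 3 are absolute differences in AUC expressed in percentage points, not relative changes. A short arithmetic check, using only the values printed in the table (the variable and function names are illustrative), makes this explicit:

```python
# (AUC without time, AUC with time), copied from Table 3
AUC = {
    "LR-sparse": (0.5685, 0.5825),
    "LR-dense": (0.6545, 0.6753),
    "LSTM": (0.6045, 0.6567),
    "ALSTM": (0.6346, 0.6714),
    "RETAIN": (0.6455, 0.6882),
    "TSANN-I": (0.6692, 0.7003),
    "TSANN-II": (0.6827, 0.6855),
}


def gain_points(before, after):
    """Absolute AUC difference, expressed in percentage points."""
    return round((after - before) * 100, 2)


# Gain from adding time information, per method
gains = {name: gain_points(*pair) for name, pair in AUC.items()}

# Margin of TSANN-I over the strongest baseline, both with time information
best_vs_retain = gain_points(AUC["RETAIN"][1], AUC["TSANN-I"][1])
```

Under this reading, `gains` reproduces the +/- column and `best_vs_retain` reproduces the 1.21 margin over RETAIN.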


Personalized Heatmap

In our study, a heatmap conveys the interpretations and serves as a visualization tool for identifying personalized risk factors. A heatmap illustrates how each candidate risk factor behaves in each visit during the progression of asthma. Each grid cell in the heatmap is colored based on the attention weights derived from the model. The darker an area is, the more important the clinical variable is, and the more likely it is to be a risk factor. For example, Figure 4 shows a case where the symptoms of hypoxemia, shortness of breath and wheezing (799.02, 786.05 and 786.07 in ICD-9), etc. are recognized as possible risk factors. A possible explanation might be that the patient's hypoxemia worsened the asthma, breathing symptoms followed, and asthma exacerbation was then diagnosed.

Figure 4. An example of a heatmap with the most likely risk factors denoted by the clinical variables: hypoxemia (D_799.02), shortness of breath (D_786.05) and wheezing (D_786.07), etc.
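A heatmap of this kind needs one number per clinical variable per visit. The paper defers the exact weighting to Multimedia Appendix 1, so the sketch below rests on a labeled assumption: each cell is taken to be the product of the visit-level weight and the code-level weight of that variable within the visit. The function name and data layout are hypothetical.

```python
def heatmap_matrix(visit_weights, code_weights, visit_codes):
    """Build a variable-by-visit grid of importance values.

    visit_weights: one visit-level attention weight per visit
    code_weights:  per visit, one code-level attention weight per code
    visit_codes:   per visit, the code labels aligned with code_weights

    ASSUMPTION: cell value = visit weight * code weight. The paper does not
    state this formula in the main text; it is used here only for illustration.
    Returns (row_labels, matrix) with matrix[row][visit]; absent codes get 0.0.
    """
    rows = sorted({code for codes in visit_codes for code in codes})
    row_of = {code: r for r, code in enumerate(rows)}
    matrix = [[0.0] * len(visit_weights) for _ in rows]
    for i, (alpha, betas, codes) in enumerate(
            zip(visit_weights, code_weights, visit_codes)):
        for beta, code in zip(betas, codes):
            matrix[row_of[code]][i] = alpha * beta
    return rows, matrix
```

With weights that each sum to one, the cells of the grid also sum to one, so darker cells can be read as a larger share of the model's total attention for that patient.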

It is hard to get a clear overview of the disease progression since we can only depend on structured data without any clinical notes. As a result, we can hardly confirm that the discovered factors are real risk factors; we only know that they might be either possible factors triggering exacerbations or highly associated with the event. However, our method may serve as an important complement in disease prediction and clinical decision-making.

Cohort-level Risk Factors

We also proposed a method to discover common risk factors on the population level so that clinicians can have a better understanding of the disease progression while patients can pay more attention to risk factors in daily life. The recognized top-ranked risk factors are shown in Table 4. The details of the method and the clinical discussion of the factors are described in Multimedia Appendix 1.


Table 4. Clinical variables with the top-ranked weights (/N means that the variable occurred N months prior to the prediction date). We regard both * and ▲ as containing valuable information.

| Rank | ICD-9 (Diagnosis) | Medication |
| --- | --- | --- |
| 1 | 493.9x/0-5 (asthma)* (meaning diagnosed with asthma multiple times before exacerbation) | methylprednisolone/0,1** |
| 2 | 786.07/0-2 (wheezing)△ | prednisone/0,1,2** |
| 3 | 496.0/0,1 (chronic airway obstruction not elsewhere classified)▲ | ipratropium/0,1,2** |
| 4 | 530.81/0 (esophageal reflux)* | midazolam/0,1,2△ |
| 5 | V46.2/0 (dependence on supplemental oxygen)△ | hydromorphone/0-2▲ |
| 6 | 787.02/0 (nausea alone)△ | heparin/0,1△ |
| 7 | 786.50/0 (unspecified chest pain)△ | acetaminophen-oxycodone/0* |
| 8 | V08/042/0 (HIV related)▲ | fentanyl/0▲ |
| 9 | 786.59/0 (other chest pain)△ | methylprednisolone/2-4** |
| 10 | 786.05/0 (shortness of breath)△ | glycopyrrolate/0* |
| 11 | V58.69/0 (long-term (current) use of other medications)▲ | lidocaine/0△ |
| 12 | 784.0/0 (headache)▲ | dexamethasone/0△ |
| 13 | 346.90/0 (migraine, unspecified, without mention of intractable migraine without mention of status migrainosus)▲ | promethazine/0△ |
| 14 | V58.66/0 (long-term (current) use of aspirin)* | atorvastatin/0△ |
| 15 | 491.21/0 (obstructive chronic bronchitis with (acute) exacerbation)▲ | furosemide/0** |

*Possible risk factors of asthma exacerbations.
△These factors were comorbidities or combined medications. We believe they were not risk factors of asthma exacerbations.
▲It could hardly be determined whether these factors caused asthma exacerbations, but they were highly associated with them.
**These medications can be used to treat asthma or control asthma symptoms. It could hardly be determined whether these medications are risk factors since we were unable to investigate the dosage of these medications in the current study. However, inappropriate use of medications such as Short-Acting Beta Agonists (SABA)/Inhaled Corticosteroids (ICS) could also lead to asthma exacerbations.

Discussion

Principal Results

Our proposed method obtains the best AUC among the compared models on the prediction task, with hierarchical attention and elapsed time embeddings driving the improvement. The visualization component also provides useful clues for better understanding the disease progression.
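For reference, the AUC can be read as the probability that a randomly chosen patient with an exacerbation receives a higher predicted score than a randomly chosen patient without one. The sketch below is a generic pairwise computation of that quantity, not the evaluation code used in the study; the function name and the toy labels and scores are illustrative.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. The double loop is O(n_pos * n_neg),
    which is fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 2 positives, 3 negatives (hypothetical scores).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
value = auc(labels, scores)  # 4 of the 6 pairs are ordered correctly
```

Because the metric depends only on the ordering of scores, it is unaffected by class imbalance in the way raw accuracy is.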

Regarding the AUC values, because we selected only 5 visits as the observation window, LSTM and ALSTM may not be as powerful as they are in modeling longer sequences, which is reflected in their unsatisfactory results compared with LR-dense. For TSANN, however, where more complex attentive structures were added, the results are comparable with or better than those of LR-dense. TLSTM, although it also integrates time decay information, does not achieve satisfactory results, perhaps because its heuristic decay function is not well suited to the current dataset. TSANN-I and TSANN-II obtain better results than RETAIN, indicating the effectiveness of the model structure. In addition, the hierarchical attention architecture makes further interpretation easier, e.g., through the personalized heatmap. LR-sparse, although tuned thoroughly in our experiments, still performs worse than the others, which may be partly due to its inability to model complex sequential patterns.

A typical characteristic of EHR data is irregularity: clinic visits may be randomly and sparsely distributed along the timeline, and some are even missing. The predictive model is therefore responsible for serializing each patient's visits while taking into account the time elapsed between consecutive visits. The comparison of results with and without time information in Table 3 demonstrates the effectiveness of considering elapsed time in this study cohort. We conclude that the prediction of asthma exacerbation is quite time-sensitive and that most of the critical risk factors should be timestamped. For instance, even for the visit immediately preceding the prediction date, if that visit occurred long before, its impact would still be reduced. A similar case can be found in [23], with an improvement of 6% from LSTM to TLSTM. For TSANN-I-step, although time embeddings were also used, they denote only the relative position of each visit in the sequence and cannot represent time decay, so the model can hardly achieve good results here. Adding time to TSANN-II did not yield as much improvement as in the other methods, which we think might be because the addition of visit-level attention weakens the contribution of the time embeddings.
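The contrast between a fixed decay heuristic and a learned elapsed time embedding can be sketched as below. The decay 1/log(e + Δt) is one of the heuristics used in time-aware LSTMs. The bucket edges, embedding dimension, and randomly initialized table are hypothetical choices made only for this illustration; they are not the configuration used in the study.

```python
import math
import random
from typing import List


def heuristic_decay(delta_days: float) -> float:
    """Fixed decay of the form 1 / log(e + dt): 1.0 at dt = 0, then shrinking."""
    return 1.0 / math.log(math.e + delta_days)


def time_bucket(delta_days: float, edges: List[float]) -> int:
    """Index of the first bucket whose upper edge is >= delta_days."""
    for i, edge in enumerate(edges):
        if delta_days <= edge:
            return i
    return len(edges)  # overflow bucket for gaps beyond the last edge


# Hypothetical bucket edges (days) and embedding size. In a real model the
# table rows would be trained jointly with the rest of the network.
EDGES = [7, 30, 90, 180, 365]
DIM = 4
_rng = random.Random(0)
TABLE = [
    [_rng.uniform(-0.1, 0.1) for _ in range(DIM)]
    for _ in range(len(EDGES) + 1)
]


def elapsed_time_embedding(delta_days: float) -> List[float]:
    """Look up the vector for the bucket that contains delta_days."""
    return TABLE[time_bucket(delta_days, EDGES)]


gaps = [3, 45, 400]  # days between each visit and the prediction date
decays = [heuristic_decay(g) for g in gaps]
buckets = [time_bucket(g, EDGES) for g in gaps]
```

A fixed decay maps each gap to a single scalar with a predetermined shape, whereas a lookup table gives each time range its own trainable vector, so the effect of elapsed time can be fitted to the dataset.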

Apart from the personalized heatmaps and cohort-level risk factors, by making use of the weights generated by Eq. 2 and Eq. 3 in Multimedia Appendix 1, we can also visualize how each clinical variable contributes across time; e.g., a variable may behave distinctly among individuals, with different action times or different incidences. Figures 5 and 6 show two examples in which the time distributions of the clinical variables are displayed as scatter plots. In these plots, each circle represents a patient, and its size and color depth denote the importance of the corresponding variable for that patient. The x-axis represents the time gap between the occurrence date of the variable and the prediction date, while the y-axis is used merely for visual layout. We randomly selected a maximum of 2,000 patients to plot each figure. Figures 5 and 6 are derived from an ICD code (esophageal reflux: 530.81 in ICD-9) and a medication (fentanyl), respectively. We observe different effective time ranges for these two factors: the first tends to be distributed mostly between 250 and 50 days before the prediction date, while the second falls mostly within the previous 150 days. We hope these visualizations can help reveal the distributions of more possible risk factors to aid asthma control.

Figure 5. The time distribution of the clinical variable ICD-9: 530.81 (gastro-esophageal reflux disease) as a risk factor.

Figure 6. The time distribution of the clinical variable medication: fentanyl as a risk factor.
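The data behind such a scatter plot can be prepared roughly as follows. This sketch assumes one occurrence record per patient and uses a random vertical jitter for the y-axis, since that axis carries no meaning; the `Occurrence` type, the function name, and the demo dates are hypothetical.

```python
import random
from datetime import date
from typing import List, NamedTuple, Tuple


class Occurrence(NamedTuple):
    patient_id: str
    occurred_on: date   # date the clinical variable was recorded
    predicted_on: date  # prediction date for this patient
    weight: float       # importance of the variable for this patient


def scatter_points(
    occurrences: List[Occurrence], max_patients: int = 2000, seed: int = 0
) -> List[Tuple[int, float, float]]:
    """Return (gap_in_days, jitter_y, weight) for at most max_patients rows."""
    rng = random.Random(seed)
    chosen = occurrences
    if len(occurrences) > max_patients:
        # Random subsample without replacement, as in "a maximum of 2,000".
        chosen = rng.sample(occurrences, max_patients)
    return [
        ((o.predicted_on - o.occurred_on).days, rng.random(), o.weight)
        for o in chosen
    ]


demo = [
    Occurrence("p1", date(2015, 1, 1), date(2015, 4, 11), 0.8),
    Occurrence("p2", date(2015, 3, 1), date(2015, 3, 31), 0.2),
    Occurrence("p3", date(2014, 12, 1), date(2015, 6, 1), 0.5),
]
all_points = scatter_points(demo)
sampled = scatter_points(demo, max_patients=2)
```

In a plot, the first element of each tuple would be the x-coordinate, and the weight would set both the marker size and the color depth.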

Comparison with Prior Work

As far as we know, this is the first study of deep learning-based prediction of asthma exacerbation, and from Table 3 we observe that our proposed method outperforms both conventional machine learning and deep learning methods. The performance gains mainly come from the hierarchical attention architecture, which can efficiently capture information from distinct medical elements, and from encoding time with elapsed time embeddings, which enables learning temporal patterns from different perspectives. Generally, deep learning-based clinical predictive modeling has applied RNN-style structures with minor structural modifications. For example, [20] used a one-layer RNN, and [21] used a two-layer RNN with its own attention weights, although one of its attentions evaluates each embedding dimension. In comparison, we applied attention weights at both the code level and the visit level, which also makes the results easy to interpret.

Strategies for encoding time in previous studies can generally be categorized into learning a subspace decomposition of the cell memory in RNNs to enable time decay [20,23] or taking the time values as features [21,48]. Because these methods use time only as a single value, they limit the representational ability of time when multiple possible patterns exist along the timeline; e.g., a clinical event $c$ that happened at time $t$ might be modeled jointly in causal-like patterns such as $a_{t_1} \rightarrow c_t$ and $c_t \rightarrow b_{t_2}$, in which $t$ may behave differently.

Limitations and Future Work

By using deep learning, we offer a novel way of identifying possible risk factors and predicting the risk of asthma exacerbation. However, the current work still has some limitations. First, for model interpretation, how multiple clinical variables interact with each other needs further exploration; simply considering each variable independently may lose the dependency patterns between them, e.g., the prescription of a drug might be closely associated with a disease or symptom. Second, EHRs have their own drawbacks, such as data irregularity, sparsity, and noise. Some potential risk factors of asthma exacerbations might not be recorded in EHRs, so the completeness of the information cannot be guaranteed. We may need to find ways to make the data more complete, such as including information from textual reports or patient surveys. Finally, the performance of the model still has room for improvement. It might be boosted further by designing more powerful structures or incorporating background knowledge.

Conclusion

In this paper, we proposed an attentive deep learning network for asthma exacerbation prediction and employed elapsed time embeddings to model time decay. By leveraging the weights of the model, we not only generated personalized heatmaps and specific risk scores at the individual level, but also identified possible risk factors of asthma exacerbation at the cohort level. Compared with previous studies, our model is effective in modeling time information and obtains better overall AUCs. Because the model is completely data-driven and relies little on feature engineering, it can easily be generalized to other prediction tasks. To the best of our knowledge, this is the first study to predict asthma exacerbation risk using a deep learning model that includes elapsed time embeddings. Some of the top-ranked risk factors identified are supported by evidence from previous medical research, which supports the reliability and accuracy of our method.

Acknowledgements

CT conceived the research project. YX, HJ, and CT designed the pipeline and method. YX implemented the deep learning model of the study and prepared the manuscript. HJ completed the clinical part of the manuscript. WJZ and HX provided valuable suggestions on the cohort selection and experiment design. YZhou and YZhang extracted and cleaned the data and performed the statistical analysis. LR helped reorganize the data and performed normalization for the revised version. FL, JD, SW, DZ, and CT proofread the paper and provided valuable suggestions. All authors have read and approved the final manuscript.

We thank Dr. Irmgard Willcockson for proofreading. We also thank Cerner for providing the valuable Health Facts EMR data. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro P6000 and TITAN XP GPUs used for this research. This research was partially supported by the National Library of Medicine of the National Institutes of Health under award number R01LM011829, the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under award number 1R01AI130460, the National Center for Advancing Translational Sciences of the National Institutes of Health under award number U01TR02062, and the Cancer Prevention Research Institute of Texas (CPRIT) Training Grant #RP160015.

Conflicts of Interest

None declared.

Abbreviations

ALSTM: Attention Long Short-Term Memory
AUROC/AUC: Area Under the Receiver Operating Characteristic Curve
EHRs: Electronic Health Records
ICS: Inhaled Corticosteroids
LR: Logistic Regression
LSTM: Long Short-Term Memory
RETAIN: REverse Time AttentIoN model
RNNs: Recurrent Neural Networks
SABA: Short-Acting Beta Agonists
SMOTE: Synthetic Minority Over-sampling Technique
TLSTM: Time-aware Long Short-Term Memory
TSANN: Time-Sensitive Attentive Neural Network

References

1 World Health Organization. Asthma. http://www.who.int/mediacentre/factsheets/fs307/en/.

2 CDC. Most Recent Asthma Data. https://www.cdc.gov/asthma/most_recent_data.htm.

3 GINA. Pocket Guide for Asthma Management. Pocket Guid asthma Manag Prev 2018.

4 Nurmagambetov T, Kuwahara R, Garbe P. The economic burden of asthma in the United States, 2008-2013. Ann Am Thorac Soc 2018;15:348–56. doi:10.1513/AnnalsATS.201703-259OC

5 Wark PAB, Gibson PG. Asthma exacerbations · 3: Pathogenesis. Thorax 2006;61:909–15. doi:10.1136/thx.2005.045187

6 Levy ML. The national review of asthma deaths: What did we learn and what needs to change? Breathe 2015;11:15–24. doi:10.1183/20734735.008914

7 Fleming L. Asthma exacerbation prediction. Curr Opin Allergy Clin Immunol 2018;:1. doi:10.1097/ACI.0000000000000428

8 Azizpour Y, Delpisheh A, Montazeri Z, et al. Effect of childhood BMI on asthma: A systematic review and meta-analysis of case-control studies. BMC Pediatr 2018;18:1–13. doi:10.1186/s12887-018-1093-z

9 Lieu TA, Quesenberry CP, Sorel ME, et al. Computer-based models to identify high-risk children with asthma. Am J Respir Crit Care Med 1998;157:1173–80. doi:10.1164/ajrccm.157.4.9708124

10 Stanford RH, Nagar S, Lin X, et al. Use of ICS/LABA on Asthma Exacerbation Risk in Patients Within a Medical Group. J Manag care Spec Pharm 2015;21:1014–9. doi:10.18553/jmcp.2015.21.11.1014

11 Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Publ Gr 2018;15:233–4. doi:10.1038/nmeth.4642

12 Dexheimer JW, Brown LE, Leegon J, et al. Comparing Decision Support Methodologies for Identifying Asthma Exacerbations. 2007;:880–4.

13 Jeong JF, Cheol I. Machine learning approaches to personalize early prediction of asthma exacerbations. Ann NY Acad Sci 2017;1387:153–65. doi:10.1016/j.coviro.2015.09.001.Human

14 Sanders DL, Aronsky D. Detecting Asthma Exacerbations in a Pediatric Emergency Department Using a Bayesian Network. 2006;:684–8.

15 Loymans RJB, Debray TPA, Honkoop PJ, et al. Exacerbations in Adults with Asthma: A Systematic Review and External Validation of Prediction Models. J Allergy Clin Immunol Pract 2018;6:1942-1952.e15. doi:10.1016/j.jaip.2018.02.004

16 Bae SH, Choi I, Kim NS. Acoustic Scene Classification Using Parallel Combination of LSTM and CNN. Proc Detect Classif Acoust Scenes Events 2016 Work 2016;:11–5.

17 Lecun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44. doi:10.1038/nature14539

18 Baytas IM, Xiao C, Zhang X, et al. Patient Subtyping via Time-Aware LSTM Networks. Proc 23rd ACM SIGKDD Int Conf Knowl Discov Data Min - KDD '17 2017;:65–74. doi:10.1145/3097983.3097997

19 Xiao C, Choi E, Sun J. Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. J Am Med Informatics Assoc 2018;00:1–10. doi:10.1093/jamia/ocy068

20 Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. Published Online First: 2018. doi:10.1038/s41746-018-0029-1

21 Choi E, Bahadori MT, Kulas JA, et al. RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism. Published Online First: 2016. http://arxiv.org/abs/1608.05745

22 Ma F. Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks. 2017.

23 Baytas IM, Xiao C, Zhang X, et al. Patient Subtyping via Time-Aware LSTM Networks. Proc 23rd ACM SIGKDD Int Conf Knowl Discov Data Min - KDD '17 2017;:65–74. doi:10.1145/3097983.3097997

24 Wu S, Liu S, Sohn S, et al. Modeling Asynchronous Event Sequences with RNNs. J Biomed Inform Published Online First: 2018. doi:10.1016/j.jbi.2018.05.016

25 Che C. An RNN Architecture with Dynamic Temporal Matching for Personalized Predictions of Parkinson's Disease.

26 Jin H, Young H. Learning representations for the early detection of sepsis with deep neural networks. Comput Biol Med 2017;89:248–55. doi:10.1016/j.compbiomed.2017.08.015

27 Jin H, Young H. Learning representations for the early detection of sepsis with deep neural networks. Comput Biol Med 2017;89:248–55. doi:10.1016/j.compbiomed.2017.08.015

28 Vaswani A, Shazeer N, Parmar N, et al. Attention Is All You Need. Published Online First: 2017. doi:10.1017/S0952523813000308

29 GINA. Global Strategy For Asthma Management and Prevention. Glob Initiat Asthma 2017;:http://ginasthma.org/2017-gina-report-global-strat. doi:10.1183/09031936.00138707

30 Bai TR, Vonk JM, Postma DS, et al. Severe exacerbations predict excess lung function decline in asthma. Eur Respir J 2007;30:452–6. doi:10.1183/09031936.00165106

31 Mapping between ICD-10 and ICD-9. https://www.health.govt.nz/nz-health-statistics/data-references/mapping-tools/mapping-between-icd-10-and-icd-9 (accessed 1 Jan 2019).

32 Daojian Zeng, Kang Liu, Siwei Lai GZ and JZ. Relation Classification via Convolutional Deep Neural Network. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014. 2335–44. doi:10.1021/bi990527s

33 Alec R, Karthik N, Tim S, et al. Improving Language Understanding by Generative Pre-Training. OpenAI 2018;:1–10. doi:10.1093/aob/mcp031

34 Song H, Rajan D, Thiagarajan JJ, et al. Attend and diagnose: Clinical time series analysis using attention models. 32nd AAAI Conf Artif Intell AAAI 2018 2018;:4091–8.

35 Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. 2014;:1–15. doi:10.1146/annurev.neuro.26.041002.131047

36 Xiang Y, Chen Q, Wang X, et al. Answer Selection in Community Question Answering via Attentive Neural Networks. IEEE Signal Process Lett 2017;24:505–9. doi:10.1109/LSP.2017.2673123

37 Yang Z, Yang D, Dyer C, et al. Hierarchical Attention Networks for Document Classification. Proc 2016 Conf North Am Chapter Assoc Comput Linguist Hum Lang Technol 2016;:1480–9. doi:10.18653/v1/N16-1174

38 Hochreiter S, Urgen Schmidhuber J. Long Short-Term Memory. Neural Comput 1997;9:1735–80. doi:10.1162/neco.1997.9.8.1735

39 Sak H, Senior A, Beaufays F. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. Interspeech 2014 2014;:338–42. doi:arXiv:1402.1128

40 Mandic S, Go C, Aggarwal I, et al. Relationship of predictive modeling to receiver operating characteristics. J Cardiopulm Rehabil Prev 2008;28:415–9. doi:10.1097/HCR.0b013e31818c3c78

41 Hosmer Jr, David W., Stanley Lemeshow and RXS. Applied logistic regression. 2013.

42 Choi E, Bahadori MT, Kulas JA, et al. RETAIN: an interpretable predictive model for healthcare using reverse time attention mechanism. Published Online First: 2016. http://arxiv.org/abs/1608.05745

43 Chawla, Nitesh V and Bowyer, Kevin W and Hall, Lawrence O and Kegelmeyer WP. SMOTE: Synthetic Minority Over-sampling Technique. J Artif Intell Res 2002;16:321–57. doi:10.1613/jair.953

44 Pal, Sankar K and Mitra S. Multilayer Perceptron, Fuzzy Sets, Classification. IEEE Trans Neural Networks 1992;3:683696.

45 Xu B, Wang N, Chen T. Empirical Evaluation of Rectified Activations in Convolution Network. 2015.

46 Kingma DP, Ba J. Adam: a method for stochastic optimization. In: ICLR. 2015. doi:http://doi.acm.org.ezproxy.lib.ucf.edu/10.1145/1830483.1830503

47 Martín Abadi, Ashish Agarwal PB et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. 2015. doi:10.1093/library/s4-X.3.339

48 Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. npj Digit Med 2018;:1–10. doi:10.1038/s41746-018-0029-1