Transcript
Page 1: Ali Mohebi1*, Jeffrey Pettibone *, Arif Hamid , Jenny ... · Forebrain dopamine value signals arise independently from midbrain dopamine cell firing. Ali Mohebi1*, Jeffrey Pettibone1*,

Forebrain dopamine value signals arise independently from midbrain dopamine cell firing.

Ali Mohebi1*, Jeffrey Pettibone1*, Arif Hamid2, Jenny-Marie Wong3, Robert Kennedy3, Josh Berke1.

1 Department of Neurology, University of California, San Francisco
2 Department of Neuroscience, Brown University
3 Department of Chemistry, University of Michigan, Ann Arbor
* equal contributions.

The mesolimbic dopamine projection from the ventral tegmental area (VTA) to nucleus accumbens (NAc) is a key pathway for reward-driven learning, and for the motivation to work for more rewards. VTA dopamine cell firing can encode reward prediction errors (RPEs [1,2]), vital learning signals in computational theories of adaptive behavior. However, NAc dopamine release more closely resembles reward expectation (value), a motivational signal that invigorates approach behaviors [3-7]. This discrepancy might be due to distinct behavioral contexts: VTA dopamine cells have been recorded under head-fixed conditions, while NAc dopamine release has been measured in actively-moving subjects. Alternatively, the mismatch may reflect changes in the tonic firing of dopamine cells [8], or a fundamental dissociation between firing and release. Here we directly compare dopamine cell firing and release in the same adaptive decision-making task. We show that dopamine release covaries with reward expectation in two specific forebrain hotspots, NAc core and ventral prelimbic cortex. Yet the firing rates of optogenetically-identified VTA dopamine cells did not correlate with reward expectation, but instead showed transient, error-like responses to unexpected cues. We conclude that critical motivation-related dopamine dynamics do not arise from VTA dopamine cell firing, and may instead reflect local influences over forebrain dopamine varicosities.

____________________________________________________________________________________________________

We trained rats in an operant, trial-and-error, “bandit” task [7] (Fig. 1a,b). On each trial, illumination of a nose-poke port (Light-On) prompted approach and entry into that port (Center-In). After a variable hold period (0.5-1.5 s), a white noise burst (Go-Cue) led the rat to withdraw (Center-Out) and poke one of the two immediately adjacent ports (Side-In). On rewarded trials this Side-In event was accompanied by an audible food hopper click, prompting the rat to collect a sugar pellet from a separate food port (Food-Port-In). Leftward and rightward choices were each rewarded with independent probabilities, which occasionally changed without warning. When rats were more likely to receive rewards, they were more motivated to engage in task performance. This was apparent in their “latency” (the time between Light-On and Center-In), which was sensitive to the outcome of the preceding few trials (Fig. 1c) and thereby scaled inversely with reward rate (Fig. 1b).

We compared how dopamine firing and release vary with reward rate and motivation. First, we used microdialysis combined with liquid chromatography-mass spectrometry [9] to simultaneously assay 21 different neurotransmitters and metabolites during bandit task performance, each with 1 min time resolution. Probes targeted seven distinct forebrain subregions within medial frontal cortex and striatum (Fig. 1d; Supplementary Fig. 1). Regression analyses compared chemical time series to a range of behavioral factors (Supplementary Fig. 2). We replicated our prior finding (in a different set of rats) that, unlike other neurotransmitters, mesolimbic dopamine specifically correlates with reward rate [7] (Fig. 1d,e,f). However, we found that this relationship was localized to NAc core, and was not seen in NAc shell or dorsal-medial striatum. Similarly, dopamine release correlated with reward rate in ventral prelimbic cortex, but not in more dorsal or ventral portions of medial frontal cortex (Fig. 1d,f). This observation of twin “hotspots” of value-related dopamine release was unexpected, given that cortical and striatal dopamine are generally considered to have very different kinetics and functions [10,11]. Yet this spatiotemporal pattern has an intriguing parallel in human fMRI studies, which consistently find that BOLD signal correlates with subjective value specifically in NAc and ventral-medial prefrontal cortex [12,13].

bioRxiv preprint doi: https://doi.org/10.1101/334060; this version posted May 30, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

The NAc core receives dopamine input from lateral portions of VTA (VTA-l [14,15]). In head-fixed mice, VTA-l dopamine neurons reportedly have uniform, RPE-like responses to conditioned stimuli [16]. However, to our knowledge, identified dopamine cells have not been recorded in unrestrained animals performing behavioral tasks. To achieve this we used optogenetic tagging [2,17,18] in TH:Cre rats [19]. After infecting the VTA with a virus for Cre-dependent expression of channelrhodopsin (AAV-DIO-ChR2), optrodes (Fig. 2a) were used to record single-unit responses to brief blue laser pulses (Fig. 2b; Supplementary Figs. 3,4,5). Of 122 well-isolated VTA-l units, 27 showed reliable short-latency increases in firing to light onset and were considered identified dopamine neurons (see Methods). All dopamine neurons were tonically-active, with relatively low firing rates (mean 7.7 Hz; range 3.7-12.9 Hz; compared to the average of all VTA neurons, p < 0.001, one-tailed Mann-Whitney). They also typically had longer-duration spike waveforms (compared to all VTA neurons, p < 5 × 10^-6, one-tailed Mann-Whitney), although there were clear exceptions (Fig. 2b), confirming prior reports that waveform duration alone is an insufficient marker of dopamine cells [2,20]. A distinct cluster of VTA neurons (n=38) had brief waveforms, higher firing rates (>20 Hz; mean 41.3 Hz, range 20.1-97.1 Hz), and included no tagged dopamine cells. We presume that these are GABAergic and/or glutamatergic [2,21], and refer to them as “non-dopamine” cells below.

Recordings were typically stable for many hours, allowing us to examine activity patterns of the same individual dopamine cells across multiple behavioral tasks. For better comparison with previous work we first show responses to unpredicted food delivery and Pavlovian conditioned cues. A series of tone pips was followed by reward delivery with different probabilities (zero, medium, high) depending on the tone pitch. During prior training rats had learned about these different probabilities, as indicated by their correspondingly scaled likelihood of entering the food port during cue presentation (Fig. 2d). The three different cues, together with occasional unheralded food deliveries, were given in random, interleaved order (with inter-trial interval of 15-30 s). Identified dopamine neurons responded most strongly to unanticipated food hopper clicks, and progressively less strongly when these clicks were preceded by the medium-probability and high-probability cues (Fig. 2d,e). Conversely, at cue onset dopamine cells responded most strongly to the high-probability cue, and progressively less strongly to the medium- and zero-probability cues. This pattern is broadly consistent with prior reports [22-24], confirming that VTA-l dopamine cells can display canonical RPE-like coding even in unrestrained animals.

We then turned to the bandit task. Based on evidence from anesthetized animals, it has been argued that altered dopamine levels measured with microdialysis arise from changes in the tonic firing rate of dopamine cells, and/or the proportion of active versus inactive dopamine neurons [8,25]. We therefore assessed whether these factors vary with reward rate, in a manner that could account for our microdialysis observations.

Unlike forebrain dopamine release, tonic dopamine cell firing in each block of trials was strikingly indifferent to reward rate (Fig. 3a). There was no significant change in the firing rates of individual dopamine cells, or any other VTA-l neurons, between higher- and lower-reward blocks (Fig. 3b,c; see also ref. 26 for concordant results in head-fixed mice). Furthermore, we never observed any dopamine cells switching between active and inactive states. The proportion of time that dopamine cells spent in long inter-spike-intervals was very low, and did not change between higher- and lower-reward blocks (Fig. 3d). Nor was there any overall change in the rate at which dopamine cells fire bursts of spikes (Supplementary Fig. 6). We conclude that changes in tonic VTA dopamine cell firing are not responsible for the motivation-linked changes in forebrain dopamine release observed in this task.
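As an illustration of the inactivity measure referred to above, the following minimal Python sketch computes the proportion of time a cell spends in long inter-spike intervals. The 2 s threshold follows the Figure 3d legend (inactive defined as ISI > 2 s); the function name, the choice of first-to-last spike as the denominator, and the example spike times are assumptions made for illustration, not the authors' code.

```python
from typing import Sequence


def inactive_fraction(spike_times: Sequence[float], isi_threshold: float = 2.0) -> float:
    """Fraction of recorded time spent in inter-spike intervals above a threshold.

    Assumption: "recorded time" is taken as the span from first to last spike.
    """
    times = sorted(spike_times)
    if len(times) < 2 or times[-1] == times[0]:
        raise ValueError("need at least two spikes at distinct times")
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    # Sum only the intervals that count as "inactive" periods.
    long_time = sum(isi for isi in isis if isi > isi_threshold)
    return long_time / (times[-1] - times[0])


# Hypothetical spike times in seconds: one 3 s gap within 6 s of recording.
print(inactive_fraction([0.0, 1.0, 2.0, 5.0, 6.0]))  # prints 0.5
```

Applied separately to higher- and lower-reward blocks, a measure of this kind would allow the block-wise comparison described in the text.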

We next considered fluctuations in dopamine cell firing around specific bandit task events. Several groups have found that motivated approach behaviors are accompanied by rapid increases in NAc core dopamine, on a sub-second to seconds timescale [3-6]. In this specific task we find [7] that NAc core dopamine rapidly increases as rats initially approach Center-In (Fig. 4a), and increases further towards the end of rewarded trials as they approach Food-Port-In (Fig. 4b). The initial increase is better aligned on Center-In than Light-On (in 6/6 voltammetry animals; for individual animal data see ref. 7), and occurs for all latencies (Fig. 4a).

The pattern of VTA-l dopamine cell firing was very different (Fig. 4c,d). The Light-On, Go-Cue, and (on rewarded trials) Side-In events all produced fast increases in the activity of most dopamine neurons (Fig. 4e). Critically, these firing changes were best aligned to the sensory cues, rather than the behaviors they evoked. Twenty-two VTA dopamine neurons significantly increased firing after Light-On; in all cases this response was better aligned to Light-On than Center-In, and was largest for short-latency trials (Fig. 4c). No separate increase in dopamine cell firing was apparent around Center-In, either at the population level (Fig. 4c) or individually (Supplementary Fig. 5). During the subsequent approach to Food-Port-In dopamine cell firing was more variable (Supplementary Fig. 5) but again showed no overall increase (Fig. 4d).

The response of dopamine cells to the reward cue at Side-In depended on recent reward history, in a manner consistent with RPE coding. When reward rate was low (i.e. rats had lower expectation of reward), dopamine cells responded strongly, but this response was greatly blunted when reward rate was high (Fig. 4f,g). These RPE-like responses appeared very similar if reward expectation was estimated in other ways, including trial-based reinforcement learning models (actor-critic or Q-learning) or simply counting the number of rewards in the last 10 trials (Supplementary Fig. 8). RPE coding on unrewarded trials was also visible, but much less robust (Fig. 4f). None of the 27 dopamine neurons showed a significant individual correlation between reward rate and the minimum firing rate following reward omission (Fig. 4g; all p > 0.01 after multiple comparisons correction). It has been proposed that negative RPEs may be encoded in the duration of dopamine cell pauses [27], but this was observed in just 2/27 individual neurons (Fig. 4g, right). Such asymmetric RPE coding has been observed before [22,28,29] and provides another dissimilarity to NAc dopamine release, which shows a larger and more sustained decrease after more disappointing outcomes [7,30].
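To make the alternative expectation estimates mentioned above concrete, the sketch below shows two simple trial-based measures: the fraction of rewards over the previous 10 trials, and a single-state delta-rule value update whose error term is the RPE. This is a hedged illustration only. The function names, the learning rate `alpha`, the initial value `v0`, and the reduction to a single value (rather than separate left/right action values, as full Q-learning would use) are assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def recent_reward_fraction(outcomes: Sequence[int], window: int = 10) -> List[float]:
    """Expectation before each trial: fraction of the previous `window` trials rewarded."""
    expectations = []
    for t in range(len(outcomes)):
        history = outcomes[max(0, t - window):t]
        # No history on the first trial: start from zero (an assumption).
        expectations.append(sum(history) / len(history) if history else 0.0)
    return expectations


def delta_rule(outcomes: Sequence[int], alpha: float = 0.2,
               v0: float = 0.0) -> Tuple[List[float], List[float]]:
    """Return (value before each trial, RPE on each trial) for a single-state learner."""
    values, rpes = [], []
    v = v0
    for r in outcomes:
        rpe = r - v          # prediction error: outcome minus expectation
        values.append(v)
        rpes.append(rpe)
        v += alpha * rpe     # expectation moves toward recent outcomes
    return values, rpes


# Hypothetical outcome sequence (1 = rewarded, 0 = unrewarded).
values, rpes = delta_rule([1, 1, 1, 0], alpha=0.5)
print(values)  # [0.0, 0.5, 0.75, 0.875]
print(rpes)    # [1.0, 0.5, 0.25, -0.875]
```

In this toy sequence the positive RPE to each successive reward shrinks as expectation grows, which is the qualitative pattern the text describes for the dopamine cell response to the reward cue.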

Despite the influence of reward expectation over the rats’ motivation to perform the task (Fig. 3c), dopamine cell firing was not dependent on reward expectation until rats heard the reward cue at Side-In (Fig. 4f; Supplementary Fig. 8). To further compare the impact of reward history on dopamine firing versus release we employed a consecutive-trial analysis [7]. Receiving a reward had no impact on “baseline” dopamine cell firing rates early in the subsequent trial (Fig. 4h). Instead, it reduced the magnitude of the peak response to a subsequent reward cue, consistent with (positive) RPE coding. This pattern is unlike NAc core dopamine release. The peak NAc core dopamine response to the reward cue is unchanged by reward on the preceding trial (Fig. 4h), consistent with encoding value rather than RPE [7]. Overall, we conclude that VTA-l dopamine cell firing does not account for the motivation-related patterns of dopamine release we observe in NAc core.

Spiking of VTA dopamine cells is undoubtedly important for NAc dopamine release [31]. However, local receptors on NAc dopamine terminals also powerfully modulate release [32-35], even when VTA spiking is suppressed [36,37]. It has been noted for decades that these two dopamine control mechanisms might serve different functional roles in behavior [32,38]. However, to our knowledge midbrain dopamine cell firing and forebrain dopamine release have not previously been compared in the same behavioral situation. Our results here demonstrate that recording dopamine cells is not sufficient for understanding the functional information conveyed by dopamine transmission.

VTA-l provides the predominant source of dopamine to NAc core [14,15]. As shown here and elsewhere [16], VTA-l dopamine cells have relatively uniform, RPE-like activity patterns, but there is increasing evidence that other dopamine subpopulations may carry distinct signals [18,39,40]. We cannot rule out the possibility that firing of dopamine cell subpopulations not recorded from here is responsible for value-related dopamine release in NAc core. However, value-related firing has never been reported for any dopamine cells, across a wide range of studies in rodents [26] and non-human primates. It also seems unlikely that an unidentified dopamine subpopulation would dominate NAc core release, overwhelming the extensive, well-characterized input from VTA-l.


Our findings argue against the idea [25] that increased “tonic” dopamine cell firing is responsible for increased mesolimbic dopamine with motivational arousal, as measured with microdialysis [41]. Although tonic firing can be altered by lesions or drug manipulations [8], we are not aware of evidence that dopamine cell firing shows prolonged changes under any natural behavioral condition. Firing has been seen to ramp downwards on a ~1 s timescale as monkeys anticipate motivationally-relevant events [42,43]. However, this decline is the opposite of what would be required to boost dopamine release with reward expectation, and instead seems more akin to a sequence of transient negative prediction errors [44].

It has been suggested that dopamine release arising from RPE-coding phasic dopamine bursts could temporally summate [45], resulting in a tonic dopamine signal that encodes reward rate (just like the leaky integrator metric we used here). There are several reasons to think this is not the case. First, increases in NAc core dopamine after unpredicted rewards are highly transient (subsecond duration [46]), consistent with efficient clearing of dopamine from the extracellular space [7,47]. Second, we observed no overall difference in the rate of dopamine cell bursting between higher- and lower-reward blocks, suggesting that bursts are not an effective way of tracking rewards over time. Finally, although dopamine release may be particularly driven by bursts in anesthetized animals [47], to our surprise during the bandit task we observed the opposite. Burst firing of VTA dopamine cells does produce transient NAc core dopamine increases, but these are de-emphasized in favor of the ramps that accompany approach behaviors.

How closely dopamine release within a particular forebrain area corresponds to midbrain dopamine cell firing likely depends on the specific behavioral context. Distinct striatal subregions contribute to different types of decisions, and may influence their own dopamine release according to need [48]. The NAc core is not needed for highly-trained behavioral responses to conditioned stimuli [49-51] but is particularly important when deciding to perform time-consuming work to obtain rewards [52]. NAc core dopamine appears to provide an essential dynamic signal of how worthwhile it is to allocate time and effort to work [7,48], even though this signal is not present in dopamine cell firing.

________________________________________________________

Acknowledgements. We thank Peter Dayan, Loren Frank, Chris Donaghue, and Thomas Faust for their comments on an early version of the manuscript, and Rahim Hashim for technical assistance. This work was supported by the National Institute on Drug Abuse, the National Institute of Mental Health, the National Institute of Neurological Disorders and Stroke, the University of Michigan Ann Arbor, and the University of California San Francisco.

Contributions. A.M. performed and analyzed the electrophysiology, J.P. performed and analyzed the microdialysis, and A.H. performed and analyzed the voltammetry. Microdialysis procedures were assisted by J.W. and supervised by R.K. J.D.B. designed and supervised the study, and wrote the manuscript.

Competing Interests. The Authors declare no competing interests.


Methods.

Animals. All animal procedures were approved by the University of Michigan or University of California San Francisco Institutional Committees on Use and Care of Animals. Male rats (300-500 g, either wild-type Long-Evans or TH-Cre+ with a Long-Evans background [19]) were maintained on a reverse 12:12 light:dark cycle and tested during the dark phase. Rats were mildly food deprived, receiving 15 g of standard laboratory rat chow daily in addition to food rewards earned during task performance.

Behavior. Pretraining and testing were performed in computer-controlled Med Associates operant chambers (25 cm × 30 cm at widest point) each with a five-hole nose-poke wall, as previously described [7]. Bandit task sessions used the following parameters: block lengths were 35-45 trials, randomly selected for each block; hold period before Go cue was 500-1500 ms (uniform distribution); left/right reward probabilities were 10, 50, 90% (electrophysiology rats, and previously reported [7] voltammetry and microdialysis rats), or 20, 50, 80% (newly reported microdialysis rats). Electrophysiology rats also performed a Pavlovian approach task immediately after the bandit task, in the same operant chamber with the house light on throughout the session. Three auditory cues (2 kHz, 5 kHz, 9 kHz) were associated with different probabilities of food delivery (counterbalanced across rats). Cues were played as a train of tone pips (100 ms on / 50 ms off) for a total duration of 2.6 s followed by a delay period of 500 ms. Cues, and unpredicted reward deliveries, were delivered in pseudorandom order with a variable inter-trial interval (15-30 s, uniform distribution).

Current reward rate was estimated using a simple, time-based leaky-integrator [53]. Reward rate was incremented each time a reward was received, and decayed exponentially at a rate set by parameter τ (the time in s for the reward rate to decrease by ~63%, i.e. 1 - 1/e). For all analyses, τ was selected based on the rat’s behavior, maximizing the (negative) correlation between reward rate and log(latency) in each session. The correlations between forebrain dopamine and reward rate were not highly sensitive to this choice of τ (Supplementary Fig. 1). To classify block transitions as “increasing” or “decreasing” in reward rate, we compared the average leaky-integrator reward rate in the last 5 min of a block to the average reward rate in the first 8 min of the subsequent block.

Microdialysis.

Surgery. Rats were implanted bilaterally with guide cannulae (CMA, #8309024) in cortex and striatum. One group (n=8) received one guide cannula targeting prelimbic and infralimbic cortex (AP +3.2 mm, ML 0.6 mm relative to bregma; DV 1.4 mm below brain surface) and another targeting dorsomedial striatum and nucleus accumbens in the opposite hemisphere (AP +1.3, ML 1.9, DV 3.4). Both implants were angled 5 degrees away from each other along the rostral-caudal plane. A second group (n=4) received one guide cannula targeting anterior cingulate cortex (AP +1.6, ML 0.8, DV 0.8) and another targeting accumbens (core/shell) in the opposite hemisphere at AP +1.6, ML 1.4, DV 5.5 (n=2) or AP +1.6, ML 1.9, DV 5.7 (n=2). Implant sides were counterbalanced across rats. Animals were allowed to recover for 1 week prior to retraining.
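The leaky-integrator reward-rate estimate described in the behavioral methods can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' MATLAB implementation: each reward adds a fixed increment (here 1.0, an assumption), the estimate decays as exp(-Δt/τ), and τ is chosen from a hypothetical candidate grid by minimizing the Pearson correlation between reward rate and log(latency), i.e. maximizing the negative correlation. All function names are illustrative.

```python
import math
from typing import List, Sequence


def leaky_integrator(reward_times: Sequence[float], sample_times: Sequence[float],
                     tau: float, increment: float = 1.0) -> List[float]:
    """Reward rate at each sample time: every earlier reward contributes
    `increment`, decayed exponentially with time constant `tau` (seconds)."""
    return [
        sum(increment * math.exp(-(t - r) / tau) for r in reward_times if r <= t)
        for t in sample_times
    ]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def choose_tau(reward_times: Sequence[float], trial_times: Sequence[float],
               latencies: Sequence[float], candidate_taus: Sequence[float]) -> float:
    """Pick the tau giving the most negative correlation between reward rate
    (evaluated at trial onsets) and log(latency)."""
    log_latency = [math.log(v) for v in latencies]
    return min(
        candidate_taus,
        key=lambda tau: pearson(leaky_integrator(reward_times, trial_times, tau), log_latency),
    )


# One reward at t = 0 s with tau = 10 s: after 10 s the estimate has decayed to 1/e.
print(leaky_integrator([0.0], [10.0], tau=10.0))
```

After one time constant the estimate has fallen to 1/e of its starting value, which is the ~63% decrease used in the text to define τ.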


Chemicals. Water, methanol, and acetonitrile for mobile phases were Burdick & Jackson HPLC grade, purchased from VWR (Radnor, PA). All other chemicals were purchased from Sigma Aldrich (St. Louis, MO) unless otherwise noted. Artificial cerebrospinal fluid (aCSF) was composed of 145 mM NaCl, 2.68 mM KCl, 1.40 mM CaCl2, 1.01 mM MgSO4, 1.55 mM Na2HPO4, and 0.45 mM NaH2PO4, adjusted to pH 7.4 with NaOH. Ascorbic acid (250 nM final concentration) was added to reduce oxidation of analytes.

Sample Collection and HPLC-MS. On testing day, animals were placed in the operant chamber with the house light on. Custom-made concentric polyacrylonitrile membrane microdialysis probes (1 mm dialyzing AN69 membrane; Hospal, Bologna, Italy) were inserted bilaterally into guide cannulae and perfused continuously (Chemyx Inc., Fusion 400) with aCSF at 2 µL/min for 90 min to allow equilibration. After 5 min baseline collection the house light was extinguished, cueing the animal to bandit task availability. Sample collection continued at 1 min intervals and samples were immediately derivatized with 1.5 µL sodium carbonate, 100 mM; 1.5 µL BzCl, 2% (v/v) BzCl in acetonitrile; and 1.5 µL isotopically labeled internal standard mixture diluted in 50% (v/v) acetonitrile containing 1% (v/v) sulfuric acid, and spiked with deuterated ACh and choline (C/D/N isotopes, Pointe-Claire, Canada) to a final concentration of 20 nM. Sample series collection alternated between the two probes at 30-second intervals in each of 26 sessions, except for one session in which a broken membrane resulted in just one series (51 sample series total). Samples were analyzed using a Thermo Fisher Accela UHPLC system or Thermo Fisher Vanquish UHPLC interfaced to a Thermo Fisher TSQ Quantum Ultra triple quadrupole mass spectrometer fitted with a HESI II ESI probe, operating in multiple reaction monitoring. Five µL samples were injected onto a Phenomenex core-shell biphenyl Kinetex HPLC column (2.1 mm x 100 mm). Mobile phase A was 10 mM ammonium formate with 0.15% formic acid, and mobile phase B was acetonitrile. The mobile phase was delivered as an elution gradient at 450 µL/min as follows: initial, 0% B; 0.01 min, 19% B; 1 min, 26% B; 1.5 min, 75% B; 2.5 min, 100% B; 3 min, 100% B; 3.1 min, 5% B; and 3.5 min, 5% B. Thermo Xcalibur QuanBrowser (Thermo Fisher Scientific) was used to automatically process and integrate peaks. Each of the >100,000 peaks was visually inspected to ensure proper integration.

Analysis. All neurochemical concentration data were smoothed with a 3-point moving average (y'[t] = 0.25*y[t-1] + 0.5*y[t] + 0.25*y[t+1]) and z-score normalized within each session to facilitate between-session comparisons. For each target region, a cross-correlogram was generated for each session and the average of the sessions was plotted. 1% confidence boundaries were generated for each subplot by shuffling one time series 100,000 times and generating a distribution of correlation coefficients for each session. Multiple regression models were generated using the regress function in MATLAB, with the neurochemical as the outcome variable and behavioral metrics as predictors. Regression coefficients were deemed significant at three alpha levels (0.05, 0.0005, 0.000005), after Bonferroni-correction for multiple comparisons (alpha/(21 chemicals * 7 regions * 9 behavioral regressors)).

Electrophysiology. Rats (n=23) were implanted with custom designed drivable optrodes, each consisting of 16 tetrodes (constructed from 12.5 µm nichrome wire, Sandvik, Palm Coast, FL) glued onto the side of a 200 µm optic fiber and extending up to 500 µm below the fiber tip. During the same surgery, we injected 1 µl of AAV2/5-EF1a-DIO-ChR2(H134R)-EYFP into the lateral VTA (AP -5.6, ML 0.8, DV 7.5). Wideband (1-9000 Hz) brain signals were sampled (30,000 samples/s) using Intan digital headstages. Optrodes were lowered at least 80 µm at the end of each recording session. Individual units were isolated offline using a MATLAB implementation of MountainSort [54] followed by careful manual inspection.
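The preprocessing and significance threshold described in the microdialysis Analysis paragraph can be illustrated with a short Python sketch. It is offered as a plausible reading of the text, not the authors' code: endpoints are left unsmoothed and the z-score uses the sample standard deviation, both of which are assumptions, and the function names are hypothetical.

```python
import statistics
from typing import List, Sequence


def smooth3(y: Sequence[float]) -> List[float]:
    """3-point weighted moving average with weights 0.25, 0.5, 0.25.

    Assumption: the first and last samples are passed through unchanged.
    """
    if len(y) < 3:
        return list(y)
    middle = [0.25 * y[i - 1] + 0.5 * y[i] + 0.25 * y[i + 1] for i in range(1, len(y) - 1)]
    return [y[0]] + middle + [y[-1]]


def zscore(y: Sequence[float]) -> List[float]:
    """Z-score within one session (sample standard deviation, an assumption)."""
    mu = statistics.mean(y)
    sd = statistics.stdev(y)
    return [(v - mu) / sd for v in y]


def bonferroni_alpha(alpha: float, n_chemicals: int = 21, n_regions: int = 7,
                     n_regressors: int = 9) -> float:
    """Per-test threshold: alpha divided by the number of comparisons (21 * 7 * 9 = 1323)."""
    return alpha / (n_chemicals * n_regions * n_regressors)


# Hypothetical 1 min concentration samples from one session.
print(smooth3([0.0, 4.0, 0.0, 4.0, 0.0]))  # [0.0, 2.0, 2.0, 2.0, 0.0]
print(bonferroni_alpha(0.05))
```

With 1323 comparisons, the nominal 0.05 level corresponds to a per-test threshold of roughly 3.8 × 10^-5.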

Classification. To identify whether an isolated VTA-l unit was dopaminergic (TH+), we used the stimulus-associated latency test [17]. Briefly, at the end of each experimental session, we connected the optrode through a patch cable to a laser diode and delivered light pulse trains of different widths and frequencies. For a unit to be identified as light responsive it needed to reach the significance level of p < 0.001 for 5 ms and 10 ms pulse trains. We also compared the light evoked waveforms (within 10 ms of laser pulse onset) to session-wide averages; all light-evoked units had a Pearson correlation coefficient of >0.9. Dopamine neurons were successfully recorded from four rats (IM657, 1 unit; IM1002, 3 units; IM1003, 15 units; IM1037, 9 units). Peak width was defined as the full-width-at-half-maximum of the most prominent negative component of the aligned, averaged spike waveform. Non-tagged VTA neurons with session-wide firing rate >20 Hz and peak width < 200 µs were classified as non-dopamine cells. To ensure that we were comparing dopamine and non-dopamine cells within the same subregions, we only analyzed non-dopamine cells recorded during sessions with at least one optically-tagged dopamine cell.
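The peak-width measure and the non-dopamine criterion described above might be computed along the following lines. This is a sketch under stated assumptions: the half-maximum is taken relative to a zero baseline, threshold crossings are linearly interpolated between samples, and the function names are illustrative. The 30,000 samples/s rate and the >20 Hz and < 200 µs cut-offs come from the text.

```python
from typing import Sequence


def peak_width_us(waveform: Sequence[float], fs_hz: float = 30000.0) -> float:
    """Full width at half maximum (in microseconds) of the deepest negative deflection."""
    trough = min(range(len(waveform)), key=lambda i: waveform[i])
    if waveform[trough] >= 0:
        raise ValueError("waveform has no negative component")
    half = waveform[trough] / 2.0  # half-depth, assuming a zero baseline

    # Walk outward from the trough while samples stay below the half-depth level.
    left = trough
    while left > 0 and waveform[left - 1] < half:
        left -= 1
    right = trough
    while right < len(waveform) - 1 and waveform[right + 1] < half:
        right += 1

    def crossing(inside: int, outside: int) -> float:
        # Linear interpolation between a sample below `half` and its neighbour at or above it.
        frac = (half - waveform[inside]) / (waveform[outside] - waveform[inside])
        return inside + frac * (outside - inside)

    left_edge = crossing(left, left - 1) if left > 0 else 0.0
    right_edge = crossing(right, right + 1) if right < len(waveform) - 1 else float(len(waveform) - 1)
    return (right_edge - left_edge) / fs_hz * 1e6


def classify_unit(tagged: bool, rate_hz: float, width_us: float) -> str:
    """Apply the criteria in the text: optically tagged units are dopamine cells;
    untagged units with rate > 20 Hz and width < 200 us are non-dopamine cells."""
    if tagged:
        return "dopamine"
    if rate_hz > 20.0 and width_us < 200.0:
        return "non-dopamine"
    return "unclassified"


# Hypothetical triangular waveform sampled at 30 kHz.
width = peak_width_us([0.0, -1.0, -2.0, -1.0, 0.0])
print(width, classify_unit(False, 41.3, width))
```

Units meeting neither criterion are left unclassified, matching the grey points described in the Figure 3b legend.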

Analysis. For comparison of “tonic” firing to reward rate, dopamine spikes were counted in 1 min bins. To examine faster changes, spike density functions were constructed by convolving spike trains with a Gaussian kernel with variance 20 ms. To determine how quickly a neuron responded to a given cue, we used 40 ms bins (sliding in steps of 20 ms) and used a shuffle test (10,000 shuffles) for each time bin comparing the firing rate after cue onset to firing rate in the 250 ms immediately preceding the cue. The first bin at which the post-cue firing rate was significantly (p < 0.01, correcting for multiple comparisons) greater than baseline firing was considered the time to cue response. Peak firing rate was calculated as the maximum (Gaussian-smoothed) firing rate of each trial in a 250 ms window after Side-In for rewarded trials, and the valley was calculated as the minimum firing rate in a 2 s window, starting one second after Side-In for unrewarded trials. To compare firing rates in “high” and “low” reward blocks, for each session we performed a median split of average leaky-integrator reward rate in each block.

Voltammetry. Fast-scan cyclic voltammetry results shown here reanalyze data previously presented and described in detail [7].

Data and Code Availability. All data is available [at time of publication] through the Collaborative Research in Computational Neuroscience (CRCNS.org) data sharing website. Custom MATLAB code is available upon request to J.D.B.
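The spike density functions and peak firing rate measure described in the electrophysiology Analysis paragraph can be illustrated as below. This is a hedged sketch, not the authors' MATLAB code: the kernel width is passed as a standard deviation `sigma_s` (the text gives "variance 20 ms", and treating 20 ms as the kernel standard deviation is an assumption), the density is evaluated directly on a time grid rather than by discrete convolution, and all names are hypothetical.

```python
import math
from typing import List, Sequence


def spike_density(spike_times: Sequence[float], eval_times: Sequence[float],
                  sigma_s: float) -> List[float]:
    """Firing rate (spikes/s) at each evaluation time: a sum of unit-area
    Gaussian kernels, one centred on each spike."""
    norm = 1.0 / (sigma_s * math.sqrt(2.0 * math.pi))
    return [
        sum(norm * math.exp(-0.5 * ((t - s) / sigma_s) ** 2) for s in spike_times)
        for t in eval_times
    ]


def peak_rate(eval_times: Sequence[float], rates: Sequence[float],
              event_time: float, window_s: float = 0.25) -> float:
    """Maximum smoothed rate within `window_s` seconds after an event."""
    in_window = [r for t, r in zip(eval_times, rates)
                 if event_time <= t <= event_time + window_s]
    if not in_window:
        raise ValueError("no evaluation times fall inside the window")
    return max(in_window)


# Hypothetical single trial: three spikes shortly after an event at t = 0.4 s.
grid = [i * 0.001 for i in range(1001)]          # 1 ms grid over 1 s
sdf = spike_density([0.45, 0.47, 0.50], grid, sigma_s=0.02)
print(peak_rate(grid, sdf, event_time=0.4))
```

Because each kernel has unit area, the integral of the density over time recovers the spike count, which gives a quick sanity check on any implementation.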


Figure Legends:

Figure 1: Dopamine release covaries with reward rate in NAc core and ventral prelimbic cortex. a. Sequence of bandit task events. b. Example session. Top row, reward probabilities in each block (left:right choices); Next row, tick marks indicate outcome of each trial (tall ticks, rewarded; short ticks, unrewarded). Next row, leaky-integrator estimate of reward rate (black) and running-average of latency (cyan; log scale). Bottom, NAc core dopamine time series in the same session (1 min samples). c. Regression analysis showing dependency of (log-) latency on the outcome of recent trials, during microdialysis sessions (n=26 sessions, 7113 trials, from 12 rats; error bars show SEM). d. Top, locations of microdialysis probes in medial frontal cortex and striatum. n=51 probe locations, from 12 rats, each with two microdialysis probes that were lowered further between sessions. Color of bar indicates strength of correlation between dopamine and reward rate (same data as c; see also Supplementary Fig. 1). ACC, anterior cingulate cortex; dPL, dorsal prelimbic cortex; vPL, ventral prelimbic cortex; IL, infralimbic cortex; DMS, dorsal-medial striatum. Middle, average cross-correlograms between dopamine and reward rate in each region. Red bars indicate the mean 99% confidence interval generated from shuffled time series. Bottom, relationships between a range of neurochemicals and reward rate as determined through multiple regression analysis. Note that the relationship between dopamine and reward rate was highly significant in vPL and NAc core, not elsewhere (for relationships to other behavioral variables, see Supplementary Fig. 2). e. Effect of block transitions on reward rate (top), latency (middle) and NAc core dopamine (bottom). All data is from the 14 sessions in which NAc core dopamine was measured (one per rat, combining new and previously reported [7] data). Transitions were classified by whether the experienced reward rate increased (n=25) or decreased (n=33). Data were binned into 3 min epochs, discarding the one minute sample that included the transition time, and plotted as mean ± SEM. f. Composite maps of correlations between dopamine and reward rate from all microdialysis experiments (n=19 rats, 33 sessions, 58 probe placements).

Figure 2: Optogenetic identification of dopamine neurons. a. Left, each optrode consisted of 16 tetrodes arranged around a 200 µm optic fiber. Right, example of histological verification of optrode placement within lateral VTA. Scale bar = 1 mm. Red = immunostaining for the dopamine cell marker tyrosine hydroxylase; green = ChR2-EYFP; yellow = overlap. For the locations of all dopamine neurons, see Supplementary Fig. 3. b. Left, example of optogenetic stimulation of a VTA dopamine neuron. As blue laser pulse duration increased, the neuron fired earlier and more reliably (for quantification see Supplementary Fig. 4). Right, Scatter plot of session-wide firing rate (x-axis) versus width (at half-maximum) of averaged spike waveforms for each unit. Tagged dopamine cells are in blue; purple indicates a distinct cluster of consistently untagged, presumed non-dopaminergic neurons with narrow waveforms and higher firing rate (>20 Hz). Insets show examples of average waveforms (for all dopamine and non-dopamine waveforms see Supplementary Fig. 4). c. Pavlovian approach task was run in the same apparatus as the bandit task, but with the house light on. d. Top, example of conditioned approach behavior during one Pavlovian session. “Head entry %” indicates proportion of trials for which the rat was at the food port at each moment in time. Red, blue indicate rewarded, unrewarded trials. This rat was more likely to go to the food port during the cue that was highly (75%) predictive of rewards compared to the other cues (25% and 0%); for the session shown, one-way ANOVA, F=11.1, p < 1.2 × 10^-6. Unpredictable reward delivery (right) prompts rapid approach. Bottom, raster plots and peri-event time histograms from an identified dopamine neuron during that same session. e. Averaged firing for all tagged dopamine cells (n=27) in this task. “High”/“Medium” tones were either 75%/25% predictive of reward (n=9 cells), or 100%/50% (n=18) respectively. Data on each dopamine neuron is presented in Supplementary Fig. 5.

Figure 3: Tonic firing of dopamine cells is unrelated to motivation. a. Firing rate (dark blue) of one identified VTA dopamine neuron during bandit task performance. Latency (cyan) covaries with reward rate, but firing rate does not. b. Scatter plot showing firing rate for all VTA neurons (blue = tagged dopamine cells; purple = non-dopamine cells; grey = unclassified) in low vs high reward rate blocks. None showed significant differences in firing (Wilcoxon signed rank test using 1-min bins of firing rate, all p > 0.05 after correcting for multiple comparisons). c. Analysis of reward rate, latency and dopamine firing rate changes at block transitions (same format as Fig. 1e). n=95 reward rate increases and 76 decreases. d. Analysis of inter-spike intervals (ISIs). Left, overall ISI distributions are unchanged between higher- and lower reward rate blocks. Right, proportion of time spent inactive (defined as ISI > 2 s) is unchanged between lower- and higher reward rate blocks. Circles indicate individual dopamine cells (n=27, same neurons as Fig. 2), bars indicate mean values. Note log scale.

Figure 4: Phasic firing of VTA-l dopamine cells does not account for NAc core dopamine release. a. Left, event-aligned NAc core dopamine release (reanalysis of voltammetry data from ref. 7; n = 6 rats, mean ± SEM). Green colors indicate different latencies, in equal terciles. Data are normalized by the average peak dopamine concentration for rewarded trials in each session, and shown relative to a 2 s “baseline” epoch ending 1 s before Center-In. A robust dopamine increase occurs shortly before Center-In, for all latencies. Right, scatter plot compares peak dopamine aligned on either Light-On (y-axis) or Center-In (x-axis). Connected lines indicate latency terciles for the same animal. Peaks were consistently larger (i.e. alignment was better) for Center-In (2-way ANOVA with factors of Latency and Alignment, Alignment F = 3.87, p = 0.05, Latency n.s. F = 0.82 p = 0.79). b. Same voltammetry data as a, aligned on later events and divided into rewarded (red) and unrewarded (blue) trials. c, d, As a, b but for dopamine cell spiking. Top panels show spike raster plots for one representative dopamine neuron, bottom panels show firing rate averaged over all dopamine cells. Connected lines in the scatter plot indicate latency terciles for the same neuron. A Light-On response is preferentially seen for short-latency trials; this same cue-evoked response appears smaller and spread-out in the Center-In alignment (2-way ANOVA with factors of Latency and Alignment, Alignment x Latency interaction F = 7.47, p = 0.0008). No other dopamine cell firing increase is visible at Center-In. e. Top, Cumulative distributions of time taken for dopamine cells to significantly increase firing following each of three cue onsets (Light-On, Go-Cue, rewarded Side-In). For Light-On, only the short-latency tercile was included. The slower response to Light-On is consistent with prior reports (ref. 55) that visual cues take longer to evoke dopamine cell firing compared to auditory cues. Bottom, scatter plot comparing cell firing responses to the food hopper click in the Pavlovian task (x-axis, unpredicted trials only) and the bandit task (y-axis). As in Fig. 2, dopamine cells are in blue, non-dopamine in purple, unclassified in grey. Non-dopamine cells were generally indifferent to cue onsets, instead increasing firing in conjunction with movements (Supplementary Fig. 7). f. Reward expectation affects dopamine cell firing during, but not before, reward feedback. Lower panel shows average dopamine cell firing rate relative to Side-In event, broken down by reward rate (terciles, calculated separately for each neuron then averaged). Upper plots show the fraction of individual dopamine cells whose firing rate significantly varies with reward rate at each moment in time, with bold marks indicating a proportion significantly higher than chance (binomial test, p < 0.01). Data are separated into rewarded (red) and unrewarded (blue) trials. Note that before Side-In dopamine cell firing did not depend on reward rate. At Side-In, the dopamine cell response to the reward click was much stronger when reward rate was low, consistent with RPE coding (lower reward expectation = larger prediction error if the reward cue arrives). When the reward click was omitted dopamine cells transiently reduced firing. g. Correlations between reward rate and individual dopamine cell peak firing rate (within 250 ms after rewarded Side-In), minimum firing rate (middle; within 2 s after unrewarded Side-In), and pause duration (bottom; maximum inter-spike-interval within 2 s after unrewarded Side-In). For all histograms, grey indicates cells with significant correlations (p < 0.01) before multiple comparisons correction, black indicates cells that remained significant after correction. Positive RPE coding is strong and consistent, but negative RPE coding is weak. h. Comparison between consecutive trials shows that an unexpected rewarded trial (one that occurs when reward rate has been low) causes peak dopamine cell firing to be substantially diminished on the next trial, but has no effect on firing rate earlier in the trial (“baseline”, -3 s to -1 s relative to Center-In). This is consistent with positive RPE coding. By contrast, unexpected rewards do not reduce peak NAc core dopamine release on the next trial. Instead they increase “baseline”, consistent with value coding.

Supplementary Figure 1. a. Anatomical definitions of the subregions examined with microdialysis are shown at top left. Atlas sections are from ref. 56. The remaining sections map the correlation between dopamine release and reward rate at individual probe placements in coronal (mm from bregma, B) and sagittal (mm from midline) planes. Color bar shows strength of correlation. b. Dependence of the correlation between dopamine and reward rate on the time constant (tau) of the leaky integrator used to define reward rate. As illustrated in the top panel, a larger tau indicates integration over longer time periods. Below, the dopamine:reward rate correlation evolves as a function of tau. In main figures tau was chosen (from a range of 1-1200 s) to maximize the (negative) correlation between reward rate and (log) latency in each session. Thin lines represent individual sessions, with the best fit tau used in regression analyses indicated by a dot. Thick lines indicate the average of all dopamine:reward rate correlations for a given tau within each subregion.
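
The leaky-integrator reward rate and the choice of tau described in panel b can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each reward is assumed to add a unit increment that decays exponentially with time constant tau, reward rate is read out at Light-On times, and tau is picked by a grid search over hypothetical candidate values. All function and variable names are invented for the example.

```python
import numpy as np

def leaky_reward_rate(reward_times_s, query_times_s, tau_s):
    """Leaky-integrator estimate: each earlier reward contributes exp(-(t - t_reward) / tau).

    Assumption: unit increment per reward; the scaling used in the paper is not stated here.
    """
    reward_times_s = np.asarray(reward_times_s, dtype=float)
    query_times_s = np.asarray(query_times_s, dtype=float)
    dt = query_times_s[:, None] - reward_times_s[None, :]
    # Only rewards strictly before the query time contribute; clip avoids overflow.
    contrib = np.where(dt > 0, np.exp(-np.clip(dt, 0, None) / tau_s), 0.0)
    return contrib.sum(axis=1)

def best_tau(reward_times_s, light_on_times_s, latencies_s, candidate_taus_s):
    """Return (tau, r) with the most negative correlation between reward rate and log latency."""
    log_latency = np.log(np.asarray(latencies_s, dtype=float))
    best = None
    for tau in candidate_taus_s:
        rate = leaky_reward_rate(reward_times_s, light_on_times_s, tau)
        if np.std(rate) == 0:
            continue  # correlation undefined for a constant reward rate
        r = np.corrcoef(rate, log_latency)[0, 1]
        if best is None or r < best[1]:
            best = (tau, r)
    return best

# Toy session (hypothetical numbers): latencies generated from a tau = 30 s integrator.
rewards = [0.0, 10.0, 20.0, 200.0]
light_on = [5.0, 15.0, 25.0, 100.0, 205.0]
latencies = np.exp(-leaky_reward_rate(rewards, light_on, 30.0))
tau_hat, r_hat = best_tau(rewards, light_on, latencies, [3.0, 30.0, 300.0])
```

A larger tau makes each reward's contribution decay more slowly, which is the sense in which it integrates over longer time periods.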

Supplementary Figure 2. Correlations between all neurochemicals and a range of behavioral factors. Bars represent R² values for linear tests between each analyte (rows) and behavioral covariates (columns). In models with more than one covariate, bar length indicates the R² for the full model. Negative relationships are reported in blue and positive relationships are in red. P-values are reported at three alpha levels (0.05, 0.0005, 0.000005) and were Bonferroni corrected for multiple comparisons (7 subregions x 21 analytes x 12 measures). To calculate reward rate, we averaged the leaky-integrator-estimated reward rate in 1 min bins defined by the start and end of each dialysis sample. ‘Attempts’ is the number of initiated trials (including trials that resulted in an error) in each dialysis minute. Attempts and reward rate and an interaction term were combined in a single model (column 2) to examine whether adding attempts could explain additional variance in the analyte signal that could not be explained by reward rate alone. ‘Latency’ is the average of the (log)-latency in each minute. ‘Exploit’ is the proportion of choices of the higher reward probability option, in the last half of blocks for which the two ports had different probabilities. ‘Rewards’ and ‘Omissions’ were defined as the number of rewarded and unrewarded trials in each min, respectively. ‘Cumulative Rewards’ and ‘Time’ were included in the same regression model to estimate progressive factors such as satiety, and possible slow timescale increases or decreases in analyte concentration across the session. Cumulative Rewards represents the total number of rewards received by the end of the current dialysis minute, and Time was simply the number of min elapsed since the session began. Bars in this column show color when only the coefficient for the cumulative reward variable was significant. % Ipsi and % Contra represent the fraction of choices to ipsi- or contra-versive ports (relative to probe location in the brain) in each minute, independent of block probability. P(win-stay) is the probability of repeating the previous choice, given the previous choice was rewarded.
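
The size of the multiple-comparisons correction can be made concrete with a short sketch. It assumes the Bonferroni correction multiplies each raw p-value by the number of tests (7 x 21 x 12 = 1764) and then compares the adjusted value with the three alpha levels; whether the authors adjusted p-values or thresholds is not stated, and the helper names are hypothetical.

```python
# Number of tests named in the legend: subregions x analytes x measures.
N_TESTS = 7 * 21 * 12  # 1764

def bonferroni_adjust(p_raw, n_tests=N_TESTS):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_raw * n_tests)

def significance_level(p_raw, alphas=(0.05, 0.0005, 0.000005)):
    """Strictest alpha level that the adjusted p-value passes, or None if it passes none."""
    p_adj = bonferroni_adjust(p_raw)
    passed = [a for a in alphas if p_adj < a]
    return min(passed) if passed else None

# Equivalent per-test threshold for the loosest alpha level (0.05 / 1764).
per_test_threshold = 0.05 / N_TESTS
```

Under this assumption a raw p-value must fall below roughly 2.8 x 10^-5 to be marked at the 0.05 level.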

Supplementary Figure 3. Histological reconstruction of recording locations. Left, histology photomicrographs for each rat (IM-657, IM-1002, IM-1003, IM-1037) from which opto-tagged dopamine cells were obtained. Red: TH-staining; green: ChR2::eYFP; blue: DAPI. Scale bars: 1 mm. Numbers below each photograph indicate estimated atlas coordinates of the lowest position of the optic fiber. IM-1037 brain was sliced horizontally, so fiber track appears as a circle. Right, coronal atlas sections with estimated dopamine cell locations in VTA-l marked as small horizontal bars.

Supplementary Figure 4. Identification of light-responsive units. a. Average waveforms of optogenetically-identified dopamine neurons. Average light-evoked waveforms are shown in blue and session-wide average waveforms are in black. All spikes within 10 ms of laser onset were used to construct light-evoked waveform average. b. Session-wide average waveform for non-dopamine cells. c. Opto-tagging p-value for all units plotted in log-scale, showing a strong bimodal distribution. To classify units as light-responsive we used a threshold of p < 0.001. d. Times to first spike after laser onset, showing mean for each identified dopamine neuron, and standard deviation (jitter).

Supplementary Figure 5. Properties of each individual identified dopamine cell (one per page). a. Average light-evoked spike waveform (blue) and session-wide average waveform (black). b. Interspike interval histogram (during bandit task). c. Raster plot showing response to 5 ms laser pulses (delivered at 2 Hz). d. Raster plot with 10 ms laser pulses (for cells that were tested under this condition). e. Scatter plot (as Fig. 2b), with this neuron highlighted in yellow. f. Behavior, and g. activity during the Pavlovian approach task. h. Firing rate, latency and reward rate during the bandit task. i. Average response of this cell to the bandit task Side-In event, broken down by reward rate terciles. j. Spike rasters and firing rate histograms aligned to various bandit task events.

Supplementary Figure 6. Overall rate of VTA-l dopamine cell burst firing is not affected by reward rate. a, Example of burst detection algorithm in action. We used an “80/160 template” approach that has long been the standard method for detecting dopamine cell bursts (ref. 57). Each time an inter-spike-interval of 80 ms or less occurs, these and subsequent spikes are considered part of a burst until there is an interval of 160 ms or more. Numbers indicate the number of spikes in each detected burst. b, No change in overall rate of bursts between higher- and lower-reward rate blocks in each session. Wilcoxon paired test z = 0.82, p > 0.4. c, No change in burst rates across a wide distribution of spikes/burst (Kolmogorov-Smirnov statistic = 0.165, p > 0.63). d, Latencies indicate substantial shifts in motivation within the same sessions. Wilcoxon paired test z = -4.28, p < 1.8 x 10^-5.
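
The 80/160 rule stated in panel a can be written as a short sketch. It follows the rule as worded in the legend (a burst starts at an inter-spike interval of 80 ms or less and continues until an interval of 160 ms or more); the function name, the input format (sorted spike times in ms) and the example train are assumptions made for illustration.

```python
def detect_bursts(spike_times_ms, start_isi_ms=80.0, end_isi_ms=160.0):
    """Return a list of bursts, each a list of spike times (ms).

    A burst opens when an ISI <= start_isi_ms occurs and closes at the
    first ISI >= end_isi_ms. Input must be sorted in ascending order.
    """
    bursts = []
    current = None  # spikes of the burst being built, or None outside a burst
    for prev, nxt in zip(spike_times_ms, spike_times_ms[1:]):
        isi = nxt - prev
        if current is None:
            if isi <= start_isi_ms:
                current = [prev, nxt]  # both spikes flanking the short ISI join the burst
        elif isi >= end_isi_ms:
            bursts.append(current)  # long ISI terminates the burst
            current = None
        else:
            current.append(nxt)  # intermediate ISIs (< 160 ms) keep the burst going
    if current is not None:
        bursts.append(current)  # burst still open at the end of the recording
    return bursts

# Hypothetical spike train (ms); ISIs are 50, 70, 130, 250, 60, 40 and 300.
spikes = [0, 50, 120, 250, 500, 560, 600, 900]
spikes_per_burst = [len(b) for b in detect_bursts(spikes)]
```

In this example the 130 ms interval does not end the first burst, because only intervals of 160 ms or more do, so the two detected bursts contain 4 and 3 spikes.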

Supplementary Figure 7. Distinct activity patterns of VTA-l dopamine and non-dopamine neurons. Format is as Fig. 4, except showing both non-dopamine neurons (top) and dopamine neurons (bottom). Rasters again show one representative neuron of each class, and peri-event histograms show average for all neurons of that class. Note that the non-dopamine cells show activity during movements, starting just before Center-In (irrespective of latency), just before Side-In, just before Food-Port-In. For the Light-On versus Center-In comparison (scatter plot), 2-way ANOVA with factors of Latency and Alignment, Alignment F = 48.9, p < 0.0001, Latency n.s. F = 0.82 p = 0.44.

Supplementary Figure 8. Different methods for calculating reward expectation produce similar results. a, As Fig. 4f,g except that reward expectation was estimated using either the number of rewards in the last 10 trials (top), an actor-critic model (middle), or a Q-learning model (bottom). The two models were both trial-based, rather than evolving continuously in time. The actor-critic model estimated the overall probability of receiving a reward on each trial, V, using the update rule V’ = V + alpha(RPE), where RPE = actual reward [1 or 0] - V. The Q-learning model kept separate estimates of the probabilities of receiving rewards for left and right choices (Q_L, Q_R) and updated Q for the chosen action (only) using Q’ = Q + alpha(RPE), where RPE = actual reward [1 or 0] - Q. The learning parameter alpha was determined for each session by best fit to latencies, for V or (Q_L + Q_R) respectively. b, Correlations between RPE and firing were similar regardless of which method was used to estimate reward expectation.
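
The two trial-based update rules in panel a can be sketched directly from the equations given. The initial value of 0.5, the fixed alpha, and the function and variable names are assumptions made for illustration; the per-session fit of alpha to latencies is not shown.

```python
def value_trace(rewards, alpha, v0=0.5):
    """Single-value model: V' = V + alpha * RPE, with RPE = reward - V.

    Returns V before each trial and the RPE on each trial.
    Assumption: V starts at v0 (not specified in the legend).
    """
    v, values, rpes = v0, [], []
    for r in rewards:
        rpe = r - v
        values.append(v)
        rpes.append(rpe)
        v = v + alpha * rpe
    return values, rpes

def q_trace(choices, rewards, alpha, q0=0.5):
    """Q-learning model: only the chosen action's Q is updated, Q' = Q + alpha * RPE.

    Returns Q_L + Q_R before each trial and the RPE on each trial.
    """
    q = {"L": q0, "R": q0}
    summed, rpes = [], []
    for choice, r in zip(choices, rewards):
        summed.append(q["L"] + q["R"])
        rpe = r - q[choice]
        rpes.append(rpe)
        q[choice] += alpha * rpe
    return summed, rpes

# Two hypothetical trials: a rewarded left choice, then an unrewarded right choice.
values, v_rpes = value_trace([1, 0], alpha=0.5)
summed_q, q_rpes = q_trace(["L", "R"], [1, 0], alpha=0.5)
```

The second trial shows the difference between the models: the single-value model has already raised V to 0.75, so the omission gives an RPE of -0.75, whereas the untouched right-choice Q of 0.5 gives an RPE of -0.5.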

References:
1. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593-1599 (1997).
2. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85-88 (2012).
3. Phillips, P. E., Stuber, G. D., Heien, M. L., Wightman, R. M. & Carelli, R. M. Subsecond dopamine release promotes cocaine seeking. Nature 422, 614-618 (2003).
4. Roitman, M. F., Stuber, G. D., Phillips, P. E., Wightman, R. M. & Carelli, R. M. Dopamine operates as a subsecond modulator of food seeking. J Neurosci 24, 1265-1271 (2004).
5. Wassum, K. M., Ostlund, S. B. & Maidment, N. T. Phasic mesolimbic dopamine signaling precedes and predicts performance of a self-initiated action sequence task. Biol Psychiatry 71, 846-854 (2012).
6. Howe, M. W., Tierney, P. L., Sandberg, S. G., Phillips, P. E. & Graybiel, A. M. Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500, 575-579 (2013).
7. Hamid, A. A., Pettibone, J. R., Mabrouk, O. S., Hetrick, V. L., et al. Mesolimbic dopamine signals the value of work. Nat Neurosci 19, 117-126 (2016).
8. Floresco, S. B., West, A. R., Ash, B., Moore, H. & Grace, A. A. Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat Neurosci 6, 968-973 (2003).
9. Song, P., Mabrouk, O. S., Hershey, N. D. & Kennedy, R. T. In vivo neurochemical monitoring using benzoyl chloride derivatization and liquid chromatography-mass spectrometry. Analytical Chemistry 84, 412-419 (2011).
10. Garris, P. A. & Wightman, R. M. Different kinetics govern dopaminergic transmission in the amygdala, prefrontal cortex, and striatum: an in vivo voltammetric study. J Neurosci 14, 442-50 (1994).
11. Frank, M. J., Doll, B. B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat Neurosci 12, 1062-1068 (2009).
12. Knutson, B., Taylor, J., Kaufman, M., Peterson, R. & Glover, G. Distributed neural representation of expected value. J Neurosci 25, 4806-4812 (2005).
13. Bartra, O., McGuire, J. T. & Kable, J. W. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412-427 (2013).
14. Ikemoto, S. Dopamine reward circuitry: two projection systems from the ventral midbrain to the nucleus accumbens-olfactory tubercle complex. Brain Res Rev 56, 27-78 (2007).
15. Lammel, S., Hetzel, A., Häckel, O., Jones, I., et al. Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. Neuron 57, 760-773 (2008).
16. Eshel, N., Tian, J., Bukwich, M. & Uchida, N. Dopamine neurons share common response function for reward prediction error. Nat Neurosci 19, 479-486 (2016).
17. Kvitsiani, D., Ranade, S., Hangya, B., Taniguchi, H., et al. Distinct behavioural and network correlates of two interneuron types in prefrontal cortex. Nature (2013).
18. Silva, J. A. D., Tecuapetla, F., Paixão, V. & Costa, R. M. Dopamine neuron activity before action initiation gates and invigorates future movements. Nature 554, 244 (2018).
19. Witten, I. B., Steinberg, E. E., Lee, S. Y., Davidson, T. J., et al. Recombinase-driver rat lines: tools, techniques, and optogenetic application to dopamine-mediated reinforcement. Neuron 72, 721-733 (2011).
20. Ungless, M. A., Magill, P. J. & Bolam, J. P. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 303, 2040-2042 (2004).
21. Morales, M. & Margolis, E. B. Ventral tegmental area: cellular heterogeneity, connectivity and behaviour. Nat Rev Neurosci 18, 73-85 (2017).
22. Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898-1902 (2003).
23. Morris, G., Arkadir, D., Nevet, A., Vaadia, E. & Bergman, H. Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43, 133-143 (2004).
24. Tian, J. & Uchida, N. Habenula Lesions Reveal that Multiple Mechanisms Underlie Dopamine Prediction Errors. Neuron 87, 1304-1316 (2015).
25. Grace, A. A. Dysregulation of the dopamine system in the pathophysiology of schizophrenia and depression. Nat Rev Neurosci 17, 524 (2016).
26. Cohen, J. Y., Amoroso, M. W. & Uchida, N. Serotonergic neurons signal reward and punishment on multiple timescales. Elife 4, (2015).
27. Bayer, H. M., Lau, B. & Glimcher, P. W. Statistics of midbrain dopamine neuron spike trains in the awake primate. J Neurophysiol 98, 1428-1439 (2007).
28. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129-141 (2005).
29. Gadagkar, V., Puzerey, P. A., Chen, R., Baird-Daniel, E., et al. Dopamine neurons encode performance error in singing birds. Science 354, 1278-1282 (2016).
30. Hart, A. S., Rutledge, R. B., Glimcher, P. W. & Phillips, P. E. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J Neurosci 34, 698-704 (2014).
31. Sombers, L. A., Beyene, M., Carelli, R. M. & Wightman, R. M. Synaptic overflow of dopamine in the nucleus accumbens arises from neuronal activity in the ventral tegmental area. J Neurosci 29, 1735-1742 (2009).
32. Glowinski, J., Chéramy, A., Romo, R. & Barbeito, L. Presynaptic regulation of dopaminergic transmission in the striatum. Cellular and Molecular Neurobiology 8, 7-17 (1988).
33. Zhou, F. M., Liang, Y. & Dani, J. A. Endogenous nicotinic cholinergic activity regulates dopamine release in the striatum. Nat Neurosci 4, 1224-1229 (2001).
34. Threlfell, S., Lalic, T., Platt, N. J., Jennings, K. A., et al. Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron 75, 58-64 (2012).
35. Cachope, R., Mateo, Y., Mathur, B. N., Irving, J., et al. Selective activation of cholinergic interneurons enhances accumbal phasic dopamine release: setting the tone for reward processing. Cell Rep 2, 33-41 (2012).
36. Floresco, S. B., Yang, C. R., Phillips, A. G. & Blaha, C. D. Basolateral amygdala stimulation evokes glutamate receptor-dependent dopamine efflux in the nucleus accumbens of the anaesthetized rat. Eur J Neurosci 10, 1241-1251 (1998).
37. Jones, J. L., Day, J. J., Aragona, B. J., Wheeler, R. A., et al. Basolateral amygdala modulates terminal dopamine release in the nucleus accumbens and conditioned responding. Biol Psychiatry 67, 737-744 (2010).
38. Schultz, W. Responses of midbrain dopamine neurons to behavioral trigger stimuli in the monkey. J Neurophysiol 56, 1439-1461 (1986).
39. Parker, N. F., Cameron, C. M., Taliaferro, J. P., Lee, J., et al. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat Neurosci (2016).
40. Menegas, W., Babayan, B. M., Uchida, N. & Watabe-Uchida, M. Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum in mice. Elife 6, (2017).
41. Mark, G. P., Smith, S. E., Rada, P. V. & Hoebel, B. G. An appetitively conditioned taste elicits a preferential increase in mesolimbic dopamine release. Pharmacol Biochem Behav 48, 651-660 (1994).
42. Bromberg-Martin, E. S., Matsumoto, M. & Hikosaka, O. Distinct tonic and phasic anticipatory activity in lateral habenula and dopamine neurons. Neuron 67, 144-155 (2010).
43. Pasquereau, B. & Turner, R. S. Dopamine neurons encode errors in predicting movement trigger occurrence. J Neurophysiol 113, 1110-1123 (2014).
44. Fiorillo, C. D., Newsome, W. T. & Schultz, W. The temporal precision of reward prediction in dopamine neurons. Nat Neurosci (2008).
45. Cools, R., Nakamura, K. & Daw, N. D. Serotonin and dopamine: unifying affective, activational, and decision functions. Neuropsychopharmacology 36, 98-113 (2011).
46. Day, J. J., Roitman, M. F., Wightman, R. M. & Carelli, R. M. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nat Neurosci 10, 1020-1028 (2007).
47. Chergui, K., Suaud-Chagny, M. F. & Gonon, F. Nonlinear relationship between impulse flow, dopamine release and dopamine elimination in the rat brain in vivo. Neuroscience 62, 641-65 (1994).
48. Berke, J. D. What does dopamine mean? Nat Neurosci (2018).
49. Brown, V. J. & Bowman, E. M. Discriminative cues indicating reward magnitude continue to determine reaction time of rats following lesions of the nucleus accumbens. Eur J Neurosci 7, 2479-2485 (1995).
50. Ikemoto, S. & Panksepp, J. The role of nucleus accumbens dopamine in motivated behavior: a unifying interpretation with special reference to reward-seeking. Brain Res Brain Res Rev 31, 6-41 (1999).
51. Nicola, S. M. The flexible approach hypothesis: unification of effort and cue-responding hypotheses for the role of nucleus accumbens dopamine in the activation of reward-seeking behavior. J Neurosci 30, 16585-16600 (2010).
52. Salamone, J. & Correa, M. The Mysterious Motivational Functions of Mesolimbic Dopamine. Neuron 76, 470-485 (2012).
53. Sugrue, L. P., Corrado, G. S. & Newsome, W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782-1787 (2004).
54. Chung, J. E., Magland, J. F., Barnett, A. H., Tolosa, V. M., et al. A Fully Automated Approach to Spike Sorting. Neuron 95, 1381-1394.e6 (2017).
55. Pan, W. X. & Hyland, B. I. Pedunculopontine tegmental nucleus controls conditioned responses of midbrain dopamine neurons in behaving rats. J Neurosci 25, 4725-4732 (2005).
56. Paxinos, G. & Watson, C. The rat brain in stereotaxic coordinates (5th edition) (Elsevier Academic Press, 2005).
57. Grace, A. A. & Bunney, B. S. The control of firing pattern in nigral dopamine neurons: burst firing. J Neurosci 4, 2877-2890 (1984).

Figure 1 [figure image not reproduced; panels a-f]. Labels recoverable from the extracted page: task events Light-On, Center-In, Go Cue, Center-Out, Side-In, Side-Out, Food-Port-In, with latency marked; block reward probabilities (L:R) including 20:80, 80:20, 80:80, 80:50, 50:50, 20:50 and 20:20; axes Reward Rate, Latency (s), [DA] (nM) and Time (min); Regression weight vs Trial (-7 to -1); R2 by analyte for subregions vPL, IL, shell, ACC, DMS, core and dPL (sample sizes as extracted: n=7, n=8, n=7, n=4, n=10, n=7, n=8; significance legend p < 0.05, p < 0.0005, p < 0.000005); Reward Rate vs [DA] cross-correlation (Correlation Coefficient against lag in min); Reward Rate, Latency and dopamine (Z-score) vs Time from Transition (min), for increases and decreases. Analytes listed: dopamine, serotonin, norepinephrine, acetylcholine, GABA, glutamate, adenosine, NM, DOPAC, 3-MT, HVA, HIAA, glucose, glycine, aspartate, glutamine, serine, tyrosine, histamine, taurine, choline.
Figure 2 [figure image not reproduced; panels a-e]. Labels recoverable from the extracted page: Rate (Hz) vs Time (ms); Peak width (µs); dopamine and non-dopamine; Head entry (%); Tone and Click cue markers with percentages 75%, 25% and 0% (High, Medium); scale bars 0.1 ms, 0.5 ms, 1 ms, 1 s and 4 s.

Figure 3 [figure image not reproduced; panels a-d]. Labels recoverable from the extracted page: Reward Rate, Firing Rate and Latency (s) vs Time (min), with block reward probabilities 90:10, 50:10, 50:90, 10:90, 50:50, 10:50, 90:50, 10:10; Rate in high-reward blocks (Hz) vs Rate in low-reward blocks (Hz), dopamine and non-dopamine; Reward Rate, Latency and Firing Rate (Z-score) vs Time from Transition (min), for increases and decreases; Prop(ISI > X) vs X (s); Prop(ISI > 2 s) for Low and High, p > 0.96.

[Figure image not reproduced; panels a-h, matching the Figure 4 legend.] Labels recoverable from the extracted page: event alignments Light-On, Center-In, Go Cue, Side-In and Food-Port-In (-1 to 1 s); ∆Dopamine and Rate (Hz) traces grouped by latency (Short, Medium, Long), by outcome (Rewarded, Unrewarded) and by Reward Rate (Low, Med, High); Light-On aligned (Z-score) vs Center-In aligned (Z-score); Cumulative proportion vs Time to cue response (ms); Side-In (Z-score) vs Unpredicted Click (Z-score), dopamine and non-dopamine; Proportion vs Time from Side-In (s); histograms of Correlation for Peak Rate, Min Rate and Pause Duration; ∆ Baseline, ∆ Peak and ∆ Valley for ∆Rate (Hz) and ∆Dopamine.
