Brooks, Legault, Nitschke, O'Brien, Sosebee, Rago, and Seaver

WP 2.4 Evaluation of NMFS Toolbox Assessment Models on Simulated Groundfish Data Sets. Comparative Simulation Tests Overview. "Nothing gives rest but the sincere search for truth." Blaise Pascal (French philosopher).

  • What did we do?
      • Evaluated 5 NFT stock assessment models for three stocks under 4 scenarios meant to examine potential difficulties in real assessments
          • Models: AIM, ASPIC, SCALE, VPA, ASAP
          • Stocks: GB yt (retro), GB cod (domes), white hake (ageing)
      • PopSim used to generate true conditions and create 100 datasets with the same random errors for all 5 models
      • Evaluated accuracy and precision of the 100 point estimates from the models
      • Did not examine precision of each of the 100 runs
      • 60 scenarios
      • 6000 assessments
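
The counts on this slide follow from the crossed design: 5 models by 3 stocks by 4 scenarios gives 60 combinations, and 100 datasets for each gives 6000 assessments. A minimal sketch of that bookkeeping follows. The slide does not name the four scenarios, so the scenario labels here are placeholders.

```python
from itertools import product

models = ["AIM", "ASPIC", "SCALE", "VPA", "ASAP"]
stocks = ["GB yt", "GB cod", "white hake"]
# Placeholder labels: the four scenarios are not named on the slide.
scenarios = ["scenario_1", "scenario_2", "scenario_3", "scenario_4"]
n_datasets = 100  # simulated datasets per model/stock/scenario combination

# Every model is run on every stock under every scenario.
design = list(product(models, stocks, scenarios))
n_assessments = len(design) * n_datasets

print(len(design), n_assessments)  # 60 6000
```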

  • Why?
      • Test hypothesis that all models are impacted similarly when presented with the same underlying problem
      • A priori, we know that some models will not perform well under the test conditions because the data are limited to VPA years
      • NOT trying to declare one model the winner
      • NOT trying to declare any model bad

  • PopSim Primer
      • Age and Length Based Population Simulator
      • User defines
          • Dimensions (Years, Ages, Plus Group Age, Lengths)
          • Initial NAA
          • Recruitment time series (or SRR)
          • Annual Fmult and selectivity
          • Biological Characteristics: M, von B, L-W
          • Fishery Sampling
          • Surveys
      • Sets Template for Stock Assessment Model
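
To make the user inputs concrete, here is a minimal sketch of how initial numbers at age, a recruitment series, annual Fmult, selectivity, and M could drive an age-structured projection with a plus group. It uses standard exponential-survival bookkeeping and is an assumption, not PopSim's confirmed internal equations. The function name and the simplifications (constant M across ages, constant selectivity across years) are hypothetical.

```python
import math

def project_population(initial_naa, recruits, fmult, selectivity, m):
    """Project numbers at age forward with annual time steps and a plus group.

    initial_naa: numbers at age in year 0 (last entry is the plus group)
    recruits:    youngest-age recruits entering in each projected year
    fmult:       annual fishing mortality multiplier, one per projected step
    selectivity: selectivity at age (assumed constant across years here)
    m:           natural mortality rate (assumed constant across ages here)
    """
    n_ages = len(initial_naa)
    naa = [list(initial_naa)]
    for year, f_year in enumerate(fmult):
        prev = naa[-1]
        # Survivors after total mortality Z = M + Fmult * selectivity.
        survivors = [prev[a] * math.exp(-(m + f_year * selectivity[a]))
                     for a in range(n_ages)]
        nxt = [0.0] * n_ages
        nxt[0] = recruits[year]
        for a in range(1, n_ages):
            nxt[a] = survivors[a - 1]   # each cohort ages by one year
        nxt[-1] += survivors[-1]        # plus group keeps its own survivors
        naa.append(nxt)
    return naa

naa = project_population(
    initial_naa=[1000.0, 600.0, 300.0],
    recruits=[900.0, 1100.0],
    fmult=[0.3, 0.5],
    selectivity=[0.2, 0.8, 1.0],
    m=0.2,
)
```

With no mortality at all, the plus group simply accumulates the two oldest classes, which is a quick sanity check on the bookkeeping.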

  • Surveys vs Indices
      • Surveys
          • Are a property of the true population
          • Catchability defined for all ages and years
          • Uncertainty added to true values at age and length
      • Indices
          • Are a property of the model
          • Sum values from surveys
          • Can be either number or biomass based
          • Can be limited age range or entire age range
          • Can be changed between models without impacting underlying truth
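
The distinction can be sketched in code: the survey is generated once from the true population, and each model then builds its own index by summing survey values over an age range. This is an illustrative sketch only. The lognormal error form, the function names, and the parameters are assumptions, not taken from PopSim.

```python
import math
import random

def simulate_survey(true_naa, catchability, cv, rng):
    """Survey catch at age: catchability * true numbers, with lognormal error (assumed form)."""
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    return [q * n * math.exp(rng.gauss(0.0, sigma))
            for q, n in zip(catchability, true_naa)]

def build_index(survey_at_age, first_age, last_age, weight_at_age=None):
    """Sum survey values over an age range; biomass based when weights are supplied."""
    total = 0.0
    for a in range(first_age, last_age + 1):
        w = 1.0 if weight_at_age is None else weight_at_age[a]
        total += survey_at_age[a] * w
    return total

rng = random.Random(42)
survey = simulate_survey([1000.0, 600.0, 300.0], [0.1, 0.2, 0.2], cv=0.3, rng=rng)

# Two models can define different indices from the same survey values.
numbers_all_ages = build_index(survey, 0, 2)
biomass_ages_1_2 = build_index(survey, 1, 2, weight_at_age=[0.2, 0.6, 1.1])
```

Changing the index definition only changes what a model sees; the `survey` list, and the truth behind it, stay the same.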

  • Growth
      • Initial NAA distributed according to stdev1
      • Growth transfer matrices created for each age based on expected von B growth for age and stdev2
      • Fish not allowed to decrease in size
      • Allows fishing to change distribution of length at age
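
One way such a growth transfer matrix could be built is sketched below. The slide gives only the ingredients (expected von B growth for the age, stdev2, no shrinkage), so the details here are assumptions: a normal kernel centred on the starting length plus the expected von B increment, evaluated at bin midpoints, with bins below the starting bin set to zero and each row renormalized. All names are hypothetical.

```python
import math

def von_b_length(age, linf, k, t0):
    """Expected length at age from the von Bertalanffy growth curve."""
    return linf * (1.0 - math.exp(-k * (age - t0)))

def growth_transfer_matrix(age, length_bins, linf, k, t0, stdev2):
    """Row i holds the probabilities of moving from length bin i to each bin j in one year."""
    increment = von_b_length(age + 1, linf, k, t0) - von_b_length(age, linf, k, t0)
    matrix = []
    for i, start in enumerate(length_bins):
        target = start + increment
        row = []
        for j, end in enumerate(length_bins):
            if j < i:
                row.append(0.0)  # fish are not allowed to decrease in size
            else:
                z = (end - target) / stdev2
                row.append(math.exp(-0.5 * z * z))
        total = sum(row)
        if total == 0.0:
            row[i] = 1.0         # numerical underflow: keep fish in their current bin
            total = 1.0
        matrix.append([p / total for p in row])
    return matrix

bins = [10.0, 20.0, 30.0, 40.0, 50.0]  # bin midpoints in cm (illustrative values)
transfer = growth_transfer_matrix(age=2, length_bins=bins, linf=60.0, k=0.3, t0=0.0, stdev2=5.0)
```

Because lengths are tracked separately from ages, removing larger fish at a given age through size-dependent fishing shifts the remaining length distribution for that age, which is the effect the last bullet describes.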

  • Market Sampling
      • Markets declared by user
      • Sampling conducted per 100 mt of landings in each market each year

  • Input / Output

  • PopSim Limitations
      • PopSim is not reality
      • Annual time steps
      • Does not contain spatial components
      • Does not allow gender differences
      • Does not allow density dependent effects
      • No integrated management
          • Developing MSE wrapper to use PopSim, VPA, AgePro, and Control Rules

  • This Exercise
      • Used utility to convert VPA run to PopSim
          • Gets Nyears, plus group age from VPA
          • Sets initial NAA and R from VPA
          • Sets annual Fmult from VPA
          • Estimates one logistic selectivity from VPA
          • Length and biology stuff from user
          • Market stuff from user
          • Surveys and Indices defs from user
      • Tuned markets, sampling, and surveys to represent actual assessments by lead

  • Farmed Out Assessments
      • AIM: Rago
      • ASPIC: Brooks
      • SCALE: Nitschke
      • VPA: Legault, O'Brien, Sosebee
      • ASAP: Legault
      • Used base case to get template settings reasonable
      • Applied this base case to each of the test cases
      • Some models did additional runs with modified templates to fix the problem

  • Results
      • PopSim compares the distribution of 100 assessments with the known true values
      • Exactly what is compared depends on model
          • E.g. VPA: NAA & FAA; ASPIC: B & F
      • Many, many runs and scenarios
          • PopSim creates tables and graphs
          • R program to gather results and automatically create plots

  • Started by looking at direct results
      • Black line: true
      • Circles and grey line: median
      • Red dashed lines: 5th and 95th percentiles
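
The plotted summaries can be sketched as follows for a single quantity in a single year, given its 100 point estimates. This is a minimal illustration using Python's standard library. Percentile conventions differ between tools, so the values need not match what PopSim or the R program report, and the function name is hypothetical.

```python
import statistics

def summarize_estimates(estimates):
    """Return the 5th percentile, median, and 95th percentile of the point estimates."""
    cuts = statistics.quantiles(estimates, n=20)  # cut points at 5%, 10%, ..., 95%
    return cuts[0], statistics.median(estimates), cuts[-1]

# Illustrative stand-in for 100 point estimates of one quantity in one year.
example_estimates = [float(x) for x in range(1, 101)]
p05, median, p95 = summarize_estimates(example_estimates)
```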

  • Decided Bias and CV Better
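
The slides do not define how bias and CV were calculated. The sketch below uses one common pair of definitions, stated here as an assumption: relative bias as the mean of the point estimates minus the true value, divided by the true value, and CV as the sample standard deviation of the point estimates divided by their mean.

```python
import statistics

def relative_bias(estimates, true_value):
    """Mean estimate minus truth, as a fraction of truth (one common definition)."""
    return (statistics.mean(estimates) - true_value) / true_value

def coefficient_of_variation(estimates):
    """Sample standard deviation of the point estimates divided by their mean."""
    return statistics.stdev(estimates) / statistics.mean(estimates)

# Illustrative numbers only, not results from the working paper.
estimates = [90.0, 100.0, 110.0]
bias = relative_bias(estimates, true_value=80.0)
cv = coefficient_of_variation(estimates)
```

Both measures are unitless, which makes them easier to compare across stocks and quantities than overlaid time-series plots.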

  • General Conclusions
      • Given failure of all models tested (simple & complex), we suspect other models would also be vulnerable to retrospective agents
      • Use of age-specific indices is robust to uncertainty in survey selectivity
      • If ageing is uncertain, these simulations support using models without age or models which allow uncertainty in catch at age

  • General Conclusions (cont.)
      • VPA and ASAP failures were similar in pattern
          • Magnitude of bias was less for ASAP
          • Precision usually somewhat better for ASAP
      • Given these similarities, we suggest that ASAP may offer some advantages over VPA (esp. in terms of flexibility)