More General Need different response curves for each predictor Need more complex responses

Page 1

More General

• Need different response curves for each predictor

• Need more complex responses

Page 2

Generalized Additive Models

• Adds functions to linearize each predictor variable

• $g(E(y)) = \beta_0 + f_1(x_1) + f_2(x_2) + \dots + f_n(x_n)$

• Functions can be parametric or non-parametric, including splines

• Makes GAMs:
  – Very general
  – Prone to over-fitting

Page 3

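Spline Curves

$$
f(x) =
\begin{cases}
\frac{1}{4}(x+2)^3 & -2 \le x \le -1 \\
\frac{1}{4}\left(3|x|^3 - 6x^2 + 4\right) & -1 \le x \le 1 \\
\frac{1}{4}(2-x)^3 & 1 \le x \le 2
\end{cases}
$$

[Figure: bell-shaped Irwin-Hall spline, with the knots marked]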
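The piecewise cubic on this slide can be evaluated directly to check that the three pieces meet at the knots. The sketch below is only an illustration: the function name `irwin_hall_spline` is made up here, and returning zero outside [-2, 2] is an assumption, since the slide defines the curve only on that interval.

```r
# Piecewise cubic from the slide, written as a vectorized R function.
# Assumption: the curve is taken to be zero outside [-2, 2].
irwin_hall_spline <- function(x) {
  out    <- numeric(length(x))
  left   <- x >= -2 & x < -1
  middle <- x >= -1 & x <= 1
  right  <- x > 1 & x <= 2
  out[left]   <- (x[left] + 2)^3 / 4
  out[middle] <- (3 * abs(x[middle])^3 - 6 * x[middle]^2 + 4) / 4
  out[right]  <- (2 - x[right])^3 / 4
  out
}

knots <- c(-2, -1, 1, 2)
irwin_hall_spline(knots)   # 0.00 0.25 0.25 0.00
irwin_hall_spline(0)       # 1 (the peak of the bell)

# Draw the curve and mark the knots with dashed vertical lines.
curve(irwin_hall_spline, from = -3, to = 3, ylab = "f(x)")
abline(v = knots, lty = 2)
```

Adjacent pieces give the same value at each knot (for example, both the left and the middle piece give 0.25 at x = -1), which is what makes the curve continuous.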

Page 4

Spline Curves in R

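• Wrap predictors in a spline function:
  – s(predictor)

• Use the “gamma” parameter to control how smooth the fitted curves are
  – Values above 1 penalize each degree of freedom more heavily, giving smoother curves (the size of the spline basis is set separately, with the k argument of s())
  – Controls over-fitting
  – 1.4 is recommended

• In R:
  – TheModel=gam(Height~s(AnnualPrecip),data=TheData,gamma=1.4)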
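As a rough sketch of how the call above might be run end to end, the example below assumes that gam() and s() come from the mgcv package (the slides do not name the package) and replaces the FIA/BioClim data with simulated values. Only the column names follow the slide; the numbers are invented for illustration.

```r
library(mgcv)  # assumed source of gam() and s()

set.seed(1)
# Hypothetical stand-in data: simulated, not the FIA/BioClim data from the slides.
TheData <- data.frame(AnnualPrecip = runif(300, min = 200, max = 3000))
TheData$Height <- 10 + 25 * (1 - exp(-TheData$AnnualPrecip / 800)) +
  rnorm(300, sd = 3)

# Same call as on the slide: one smooth term, gamma = 1.4.
TheModel <- gam(Height ~ s(AnnualPrecip), data = TheData, gamma = 1.4)

summary(TheModel)   # reports deviance explained and the effective df of the smooth
AIC(TheModel)
```

The summary reports the effective degrees of freedom used by s(AnnualPrecip), which is the quantity that gamma penalizes.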

Page 5

Reading

• Read Hastie and Tibshirani when you have “time”
  – “All considered, it is conceivable that in a minor way, nonparametric regression might, like linear regression, become an object treasured for both its artistic merit as well as usefulness”
    • L. Breiman, 1977

• Read Martinez-Rincon and Jensen for next time

Page 6

Which Approach?

[Figure: two panels, GAM and Kernel Smoother, each plotted against Income and Age]

Hastie and Tibshirani 1986, Generalized Additive Models

Z-axis shows the proportion of families with a telephone at home

Page 7

GAM Plots in R

Modeled Response Curve

95% CI

Sample point “Grass”

FIA Doug-Fir height data vs. BioClim Annual Precipitation

“Partial” = 1 Covariate
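A plot of this kind can be sketched as follows, again assuming the mgcv package and simulated stand-in data (the second predictor, `MeanTemp`, is hypothetical and added only so there is more than one partial curve). Note that plot() on a fitted gam draws bands of two standard errors, which is only approximately a 95% interval.

```r
library(mgcv)  # assumed source of gam() and s()

set.seed(42)
n <- 400
# Hypothetical stand-in data: simulated, not the FIA/BioClim data.
TheData <- data.frame(AnnualPrecip = runif(n, 200, 3000),
                      MeanTemp     = runif(n, 0, 20))
TheData$Height <- 15 + 20 * (1 - exp(-TheData$AnnualPrecip / 800)) -
  0.05 * (TheData$MeanTemp - 10)^2 + rnorm(n, sd = 3)

TheModel <- gam(Height ~ s(AnnualPrecip) + s(MeanTemp),
                data = TheData, gamma = 1.4)

# One panel per covariate: fitted partial curve, shaded band,
# partial residuals as points, and a rug of sample locations on the x-axis.
par(mfrow = c(1, 2))
plot(TheModel, shade = TRUE, rug = TRUE, residuals = TRUE)

# The same pieces by hand for one covariate: partial effect +/- 1.96 standard errors.
terms_fit <- predict(TheModel, type = "terms", se.fit = TRUE)
partial   <- terms_fit$fit[, "s(AnnualPrecip)"]
se        <- terms_fit$se.fit[, "s(AnnualPrecip)"]
band      <- cbind(lower = partial - 1.96 * se, upper = partial + 1.96 * se)
head(band)
```

Each panel shows the contribution of a single covariate with the others held out, which is what "partial" refers to.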

Page 8

Brown Shrimp in GOM

Data from SeaMap and NOAA

Page 9

Gamma=1.4

Explained Deviance: 59%, AIC=57807

Data from FIA and BioClim

Page 10

Gamma=10

Explained Deviance: 59%, AIC=57961

Data from FIA and BioClim

Page 11

Gamma=20

Explained Deviance: 57%, AIC=58081

Data from FIA and BioClim

Page 12

Gamma=20 (best 3 predictors)

Explained Deviance: 51%, AIC=58796

Data from FIA and BioClim

Page 13

Gamma=0.1

Explained Deviance: 59%, AIC=57811

Data from FIA and BioClim

Page 14

GAM Model Runs

Layers   Gamma   Explained Deviance (%)   AIC
All 6    1.4     59                       57807
All 6    10      58                       57961
All 6    20      57                       58081
Best 3   20      51                       58796
All 6    0.1     59                       57811
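A table like this one can be produced by refitting the same model over several gamma values and collecting the fit statistics. The sketch below assumes the mgcv package and uses simulated data with two made-up predictors (`x1`, `x2`), so its numbers will not match the FIA/BioClim runs above.

```r
library(mgcv)  # assumed source of gam() and s()

set.seed(7)
n <- 500
# Hypothetical stand-in data with two predictors.
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + 2 * dat$x2^2 + rnorm(n, sd = 0.5)

gammas <- c(0.1, 1.4, 10, 20)

# Fit once per gamma and collect one row of statistics per fit.
runs <- do.call(rbind, lapply(gammas, function(g) {
  m <- gam(y ~ s(x1) + s(x2), data = dat, gamma = g)
  data.frame(Gamma             = g,
             ExplainedDeviance = round(100 * summary(m)$dev.expl, 1),
             AIC               = round(AIC(m), 1),
             EDF               = round(sum(m$edf), 2))
}))

print(runs)
```

The EDF column (total effective degrees of freedom) is added here because it shows most directly what gamma changes: larger values buy a smoother, lower-EDF fit at some cost in explained deviance.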

Page 15

Best Model?

Best 3 predictors, gamma=20

Data from FIA and BioClim

Page 16

Blue Crab Distribution Model

Page 17

Blue Crab vs. Salinity

Jensen et al. 2005, Winter distribution of blue crab Callinectes sapidus in Chesapeake Bay: application and cross-validation of a two-stage generalized additive model

Page 18

Response Curves (partial)

GAMs

BRTs

Page 19

GAMs vs. BRTs

Martinez-Rincon 2012, Comparative performance of generalized additive models and boosted regression trees for statistical modeling of incidental catch of wahoo (Acanthocybium solandri) in the Mexican tuna purse-seine fishery

“Results indicate little difference between the performance of GAM and BRT models”

Page 20

Gamma in GAMs

• $x$ = degrees of freedom

• Degrees of freedom = number of estimated parameters

• gam() chooses smoothing parameters to minimize:

$$
\frac{\sum_i (\hat{y}_i - y_i)^2}{(n - \gamma x)^2}
$$

• Note: The reason the effect of gamma reverses itself at large values is that $\gamma x$ becomes larger than $n$
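The reversal at large gamma can be seen by plugging numbers into the score on this slide. In the sketch below, `gam_score` is a made-up helper that evaluates that expression, and the two candidate models (their residual sums of squares and degrees of freedom) are invented purely for illustration.

```r
# Score from the slide: residual sum of squares divided by (n - gamma * x)^2,
# where x is the model's degrees of freedom. Smaller is better.
gam_score <- function(rss, n, x, gamma) rss / (n - gamma * x)^2

n <- 100
# Two hypothetical candidate fits: a smooth one and a wiggly one.
simple_fit  <- list(rss = 120, x = 5)    # fewer degrees of freedom, worse fit
complex_fit <- list(rss = 100, x = 20)   # more degrees of freedom, better fit

for (g in c(1.4, 10)) {
  s_simple  <- gam_score(simple_fit$rss,  n, simple_fit$x,  g)
  s_complex <- gam_score(complex_fit$rss, n, complex_fit$x, g)
  cat(sprintf("gamma = %4.1f  simple = %.5f  complex = %.5f  preferred = %s\n",
              g, s_simple, s_complex,
              if (s_simple < s_complex) "simple" else "complex"))
}
```

With gamma = 1.4 the simpler fit has the lower score. With gamma = 10, gamma times x for the complex fit is 200, which exceeds n = 100, so the squared denominator grows again and the complex fit ends up with the lower score.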

Page 21

Anderson

We are not trying to model the data; instead, we are trying to model the information in the data. The goal is to recover the information that applies more generally to the process, not just to the particular data set. If we were merely trying to model the data well, we could fit high order Fourier series terms or polynomial terms until the fit is perfect. Data contain both information and noise; fitting the data perfectly would include modeling the noise and this is counter to our science objective.

Page 22

Additional Resources

• Generalized Additive Models: an introduction with R
  – Copyrighted book
  – Includes:
    • Linear models
    • GLMs
    • GAMs
    • Examples in R
    • Some matrix algebra

Page 23

Additional Resources

• Geospatial Analysis with GAMs:
  – http://www.casact.org/education/annual/2011/handouts/C3-Guszcza.pdf

• Disease mapping using GAMs (workshop):
  – http://www.cireeh.org/pmwiki.php/Main/Gam-mapWorkshop

• Mapping population based studies:
  – http://www.ij-healthgeographics.com/content/5/1/26