Statistical Inference in Paleoecology, with a Focus on Bayesian Hierarchical Modeling

Chris Paciorek

Department of Statistics, University of California, Berkeley

www.stat.berkeley.edu/~paciorek

Joint work with Jason McLachlan, Notre Dame Biology, and the PalEON Project (PIs: J. McLachlan, M. Dietze, S. Jackson, C. Paciorek, J. Williams)

Outline: Goals, Data, and Challenges; Analysis Strategies; Hierarchical Modeling; The PalEON Perspective

(Some) Paleoecological Data Sources

Counts of pollen grains from sediment cores in lakes and other depositional environments

Inference: vegetation composition, vegetation types, ecosystem boundaries

Counts of charcoal particles from sediment cores

Inference: fire frequency and severity

Ring widths from tree cores

Inference: growth, biomass and carbon balance

Fire scar data from tree cores

Inference: fire frequency and severity

(Some) Goals of Paleoecology

Understand past distributions of vegetation and changes in those distributions

Use long-term data to understand the nature of vegetation dynamics:

    competition

    species dispersal/spread

    species declines and causes of those declines – impacts of disturbance, disease, herbivory, climate

    stability of species assemblages

Understand patterns and rates of large-scale disturbance

Challenges of Paleoecological Data Sources

Sparsity and irregularity in space and time

Certain proxies are only available in certain regions

Many records of limited duration

Lack of replication

Proxies are not direct measurements of the quantities we care about

Calibration data are scarce

Calibration against modern data may be less relevant for periods in the past (the no analog problem)

Many of the quantities of interest do not have paleodata proxies

Dating is uncertain and dating methods are expensive

[Figure: Temporal sampling density for 23 ponds in central New England. Axes: pond number; calendar years relative to 1950 (−3000 to 0).]

Analysis of Pollen Diagrams

[Figure: Pollen diagram for Berry Pond West (Berry Pond, W Massachusetts), Year AD 1000 to 2000. Panels: Pine, Hemlock, Birch, Oak, Hickory, Beech, Chestnut, Grass & weeds, Charcoal:Pollen, % organic matter. Annotated events: Onset of Little Ice Age; European settlement.]

Booth et al. (2012) Ecology 93:219

Pollen in a Spatial Context

[Figure: Fig. 1 of Kerry D. Woods and Margaret B. Davis, Ecology, Vol. 70, No. 3, showing pollen percentage curves for numbered sites. Legible part of the caption: Beech range limit, generalized to township, is shown by the dotted line. Site numbers are given at the bottom of each curve. Vertical scales are in thousands of years before present. Dotted curves (unshaded) are pollen percentages multiplied by 10.]

Woods & Davis (1989); Ecology 70: 681

Dimension Reduction

Oswald et al. (2007); J of Biogeography 34:900

Calibrating Pollen to Vegetation

[Figure: Fig. 4A of I. Colin Prentice et al., Boreas 16 (1987); scatterplot panels for Picea.]

Fig. 4A. Scatterplots showing relationships between surface pollen percentages and tree volume percentages within various distances of the pollen sampling sites. The tree percentages have been adjusted by multiplying by site factors (equation (3)). This adjustment is designed to eliminate the effects of the percentage constraint. The straight lines fitted to the data (equation (2)) have slopes and intercepts estimated by the method of Parsons & Prentice (1981) and Prentice & Webb (1986). The goodness-of-fit to these lines indicates goodness-of-fit to the extended R-value model of the pollen-vegetation relationship.

Prentice (1987); Boreas 16:43

Sugita (2007); The Holocene 17:243

Ecosystem Boundary Reconstruction

Williams et al. (2009); Global and Planetary Change 66:195

Fire History Reconstruction

Fire return intervals from peak detection

[Excerpt and figure caption from Philip E. Higuera et al., Ecological Monographs, Vol. 79, No. 2, p. 210:]

DISCUSSION

Interpreting sediment charcoal records and detecting changes in fire regimes

We introduce three general tools that facilitate the interpretation of fire history from sediment-charcoal records. First, the signal-to-noise index provides a semi-objective way to judge if a record is appropriate for peak analysis. For example, while >0.8 in most records, SNI values were consistently <0.5 for the 8000–0 yr BP in the Xindi Lake record (data not shown), indicating that this section was not suitable for peak identification. Second, our use of a Gaussian mixture model to determine threshold values for peak identification allowed us to treat all charcoal records with one set of semi-objective criteria. These criteria are consistent with a mechanistic [...]

FIG. 4. Charcoal records for (a) Ruppert, (b) Xindi, (c) Code, and (d) Wild Tussock lakes. (i) Interpolated charcoal accumulation rates (CHAR), C_int (black), and background CHAR, C_back (gray); (ii) Peak CHAR, C_peak, with the values identifying noise-related variability (positive and negative gray lines) and peaks identified with each threshold criterion. The 99th percentile criterion used for interpretation is represented with +, and the 95th and 99.9th percentile results are represented with gray dots. (iii) Pollen-inferred vegetation zone and peak magnitude for all charcoal accumulation rate (CHAR) values exceeding the positive threshold value in panels ii. Note peak magnitude values not corresponding to + symbols in panels ii are those that did not pass the minimum-count screening (see Methods for details), and + symbols in panels ii with no apparent peak magnitude value correspond to very small peak magnitudes.

Local area burned from background charcoal

Higuera et al. (2011) Ecological Applications 21:3211
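The Gaussian-mixture thresholding mentioned in the Ecological Monographs excerpt on this slide can be illustrated with a minimal sketch. This is not the procedure of Higuera et al.: it fits a two-component Gaussian mixture to simulated peak-CHAR values with a hand-written EM loop, treats the lower-mean component as noise, and sets the threshold at the 99th percentile of that noise component. The function names, the simulated data, and the starting values are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def fit_two_gaussians(x, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Assumed starting values: means at the data extremes, common spread.
    mu = np.array([x.min(), x.max()])
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        z = (x[:, None] - mu) / sd
        dens = w * np.exp(-0.5 * z ** 2) / (sd * np.sqrt(2.0 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    return w, mu, sd


def noise_threshold(x, percentile=0.99):
    """Threshold at a percentile of the lower-mean (assumed noise) component."""
    _, mu, sd = fit_two_gaussians(x)
    k = int(np.argmin(mu))
    return NormalDist(float(mu[k]), float(sd[k])).inv_cdf(percentile)


# Simulated data (hypothetical): mostly noise, plus a few large peaks.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=900)
peaks = rng.normal(6.0, 1.0, size=100)
c_peak = np.concatenate([noise, peaks])

threshold = noise_threshold(c_peak)
n_flagged = int((c_peak > threshold).sum())
print(f"threshold = {threshold:.2f}, samples above threshold = {n_flagged}")
```

Changing `percentile` to 0.95 or 0.999 gives the analogue of the alternative criteria mentioned in the figure caption.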

Overview

Hierarchical statistical models build a (possibly complicated) statistical model that relates data to unknown quantities of interest in (relatively simple) stages.

    Measurement model: Data are related to a latent process (often a space-time process representing a relevant field)

    Process model: The latent process is modeled stochastically (potentially with deterministic components) in a way that builds in appropriate dependencies

    Parameter model: Additional 'tuning' parameters govern the behavior of the latent process

The goal is to make inference (including uncertainty assessment) about the key quantities of interest, which may be the latent process or parameters or functionals of those.

Given a model, there are standard (but sometimes inadequate) computational approaches to computing the inferences.
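The three stages on this slide correspond to a standard factorization of the joint posterior. As a sketch, writing $y$ for the data, $z$ for the latent process, and $\theta$ for the parameters (symbols chosen here for illustration, not taken from the slides):

```latex
\begin{align*}
\text{Measurement model:} \quad & y \mid z, \theta \sim p(y \mid z, \theta) \\
\text{Process model:}     \quad & z \mid \theta \sim p(z \mid \theta) \\
\text{Parameter model:}   \quad & \theta \sim p(\theta) \\[4pt]
\text{Posterior:}         \quad & p(z, \theta \mid y) \propto
    p(y \mid z, \theta)\, p(z \mid \theta)\, p(\theta)
\end{align*}
```

Inference about a functional $g(z, \theta)$ then follows from this posterior, typically by evaluating $g$ on posterior samples.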

Example: STEPPS Model for Vegetation Reconstruction

[Diagram: a latent vegetation process evolving through time (2000, 1500, 1000, and 300 yrs b.p., and present day), with pollen data observed at each time point and vegetation data from witness trees and forest plots.]

Calibration Data

Township witness tree data: 183 towns, 26-3149 trees per town

Pollen sediment samples: 23 ponds, 500 grains per pond

Taxa: beech, birch, chestnut, hemlock, hickory, maple, oak, pine, spruce, other

A Cartoon of the Model

[Diagram: two-panel graphical model.

Estimation phase (veg'n and pollen): the vegetation composition process is linked to vegetation data through the vegetation likelihood and to pollen data through the pollen likelihood. Other nodes: covariates, smoothing par'ms, variance components, heterogeneity par'm, scaling and dispersal par'ms.

Prediction phase (pollen only): the same node labels appear, with fixed parameters drawn from the estimation run posterior.]

Calibration of Pollen to Vegetation

[Figure: scatterplots of raw pollen (vertical axis) against veg'n (predicted pollen) (horizontal axis), one panel each for beech, birch, hemlock, hickory, maple, oak, pine, and spruce.]

+ = raw pollen vs. spatially-smoothed vegetation

• = raw pollen vs. model-predicted pollen based on vegetation, accounting for species-specific pollen production and for long-distance pollen dispersal
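The legend refers to predicted pollen that accounts for species-specific pollen production and long-distance dispersal. The sketch below shows one simple way such a prediction could be formed; it is not the model used in the talk. Vegetation proportions are mixed between the pond's own grid cell and a distance-weighted average over all cells, scaled by a per-taxon production factor, and renormalized. The Gaussian kernel, the mixing weight `gamma`, the grid, and all numbers are assumptions for illustration; `phi` and `psi` are used only as convenient names for a scaling factor and a dispersal range.

```python
import numpy as np


def predicted_pollen(veg, coords, pond_idx, phi, psi, gamma=0.5):
    """Predicted pollen proportions at one pond (illustrative sketch).

    veg      : (n_cells, n_taxa) vegetation proportions, rows sum to 1
    coords   : (n_cells, 2) cell coordinates
    pond_idx : index of the cell containing the pond
    phi      : (n_taxa,) pollen scaling (production) factors
    psi      : (n_taxa,) dispersal ranges, same units as coords
    gamma    : assumed share of pollen arriving from long distance
    """
    dist = np.linalg.norm(coords - coords[pond_idx], axis=1)
    # Assumed Gaussian dispersal kernel, one column of weights per taxon.
    kern = np.exp(-0.5 * (dist[:, None] / psi[None, :]) ** 2)
    kern /= kern.sum(axis=0, keepdims=True)
    dispersed = (kern * veg).sum(axis=0)
    source = (1.0 - gamma) * veg[pond_idx] + gamma * dispersed
    scaled = phi * source
    return scaled / scaled.sum()


# Hypothetical 5 x 5 grid with three taxa.
rng = np.random.default_rng(1)
coords = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
veg = rng.dirichlet(np.ones(3), size=coords.shape[0])
phi = np.array([1.0, 3.0, 0.5])  # second taxon over-represented in pollen
psi = np.array([1.0, 2.0, 1.0])

p = predicted_pollen(veg, coords, pond_idx=12, phi=phi, psi=psi)
# Simulate a raw pollen count of 500 grains, as in the calibration data.
counts = rng.multinomial(500, p)
print(p, counts / 500)
```

Comparing `counts / 500` with `veg[12]` and with `p` mirrors the two comparisons in the legend: raw pollen against vegetation, and raw pollen against predicted pollen.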

Inference: Time

[Figure: time series panels for beech, birch, chestnut, hemlock, hickory, maple, oak, pine, spruce, and other. Horizontal axis: proportion; vertical axis: time (−2500 to 0).]

−−− = raw pollen proportions

− • − = model-estimated vegetation proportions

−−− = uncertainty estimates

Inference: Space

[Figure: maps for chestnut, hemlock, oak, and composition at times −400, −1100, and −1800.]

We can also present spatial predictions in the context of uncertainty, in particular assessing our confidence in changes over time and differences across space.

Inference: model parameters

[Figure: two panels of per-taxon parameter estimates for beech, birch, chestnut, hemlock, hickory, maple, oak, pine, spruce, and other. Pollen scaling, φ (axis 0 to 5); long-distance dispersal range, ψ (axis 0.0 to 3.0).]

red = colonial estimates

black = modern estimates

Key Aspects of Hierarchical Approach

Sparsity and irregularity in space and time: Borrow strength and smooth in space & time

Certain proxies are only available in certain regions

Many records of limited duration

Lack of replication: Smoothing in space accounts for a form of replication

Proxies are not direct measurements of the quantities we care about: Calibrate to direct measurements

Calibration data are scarce

Calibration against modern data may be less relevant for periods in the past (the no analog problem)

Many of the quantities of interest do not have paleodata proxies

Dating is uncertain and dating methods are expensive: Include dating uncertainty in statistical model, e.g., the BACON model (Blaauw & Christen 2011)
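As a toy illustration of borrowing strength (not the spatial smoothing used in the talk), the sketch below applies normal-normal partial pooling: each site's noisy mean is shrunk toward an overall mean, with more shrinkage for sites that have fewer observations. The variances `sigma2` and `tau2` are treated as known, an assumption made to keep the example short; in a full hierarchical model they would be estimated. The site values are made up.

```python
import numpy as np


def partial_pool(site_means, n_obs, sigma2, tau2):
    """Posterior means of site effects under a normal-normal model.

    site_means : observed mean at each site
    n_obs      : number of observations at each site
    sigma2     : within-site (measurement) variance, assumed known
    tau2       : between-site (process) variance, assumed known
    """
    site_means = np.asarray(site_means, dtype=float)
    n_obs = np.asarray(n_obs, dtype=float)
    precision = 1.0 / (tau2 + sigma2 / n_obs)
    # Precision-weighted estimate of the overall mean.
    grand_mean = np.sum(precision * site_means) / np.sum(precision)
    # Weight on each site's own data; approaches 1 as n_obs grows.
    weight = tau2 / (tau2 + sigma2 / n_obs)
    return weight * site_means + (1.0 - weight) * grand_mean


# Hypothetical data: three sites with very different sample sizes.
site_means = np.array([0.10, 0.40, 0.25])
n_obs = np.array([2, 50, 10])
pooled = partial_pool(site_means, n_obs, sigma2=0.04, tau2=0.01)
print(pooled)
```

The sparsely sampled first site moves furthest from its raw mean, which is the sense in which sparse records borrow information from better-sampled ones.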

Open Issues

Data sparsity

Data dropout tends to have large effects – e.g., losing an observation far from other observations can cause large changes in predictions

To what extent can we interpret parameter estimates as physically meaningful?

Computation can be difficult

PalEON: A PaleoEcological Observatory Network

Multi-institution collaboration of paleoecologists, statisticians, and ecosystem modelers

Overarching goal: Use paleodata to help understand global change

Current focus on northeastern/midwestern US over the past 3000 years

Motivation for PalEON

Paleoecological data have not been used extensively in considering global change, even though they are the only data on long-term changes

Proxies are often not directly related to quantities of interest for global change and are not in a form directly useful for quantitative analysis

Terrestrial ecosystem models and paleodata are at different spatial and temporal scales

PalEON Goals

Develop networks of paleodata, synthesized statistically, to inform ecosystem models:

Assess models against paleodata

Initialize models based on paleodata

Assimilate paleodata into models

Improve model formulations

Prioritize new data collection
