Pan-STARRS Photometric Classification of Supernovae using a Hierarchical Bayesian Model George Miller, Edo Berger, Nathan Sanders Harvard-Smithsonian Center

Pan-STARRS Photometric Classification of Supernovae using a Hierarchical Bayesian Model

George Miller, Edo Berger, Nathan Sanders

Harvard-Smithsonian Center for Astrophysics

PS1 Medium Deep Survey

• 1.8 m f/4.4 telescope with a 3.2 degree FOV and a 1.6 Gpix camera

• PS1 Sky Surveys: May 2010 – March 2014

• Medium Deep: 10 fields of 7 sq. deg. with a 3-day staggered cadence and grizy filters

• Full reprocessing MD.PV3 underway with full forced photometry

• Currently the best precursor and source of training data for LSST classification tools

– Similar cadence (~3 days) and sensitivity

– Similar grizy band-passes (+u for LSST)

– Wide variety of science drivers and a large distributed collaboration

Implementation

– MISSION: To develop a joint probabilistic model for the populations of different classes of transients and the subsequent observations of individual objects. The model is fit using both PS1 classified (~20%) and unclassified (~80%) data, and can then be applied to classify and infer parameters for new objects, e.g. from LSST.

– COMPUTATIONAL METHODS: Stan

• Stan is a probabilistic programming language, written in C++ with Python and R interfaces.

• Uses the No-U-Turn Sampler (NUTS), a variant of Hamiltonian Monte Carlo (HMC) that adaptively sets the trajectory length, ending each trajectory when it begins to double back on itself; the step size is tuned during warmup.

• Incorporates algorithms for automatic differentiation and adaptive refinement of tuning parameters.
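To make the HMC idea concrete, here is a minimal sketch of plain HMC (fixed trajectory length, not NUTS) in pure Python. The standard normal target, the step size, and the number of leapfrog steps are all assumptions chosen for illustration; Stan tunes these automatically and obtains gradients by automatic differentiation rather than by hand.

```python
import math
import random


def neg_log_density(q):
    """Potential energy U(q) for an assumed toy target: a standard normal."""
    return 0.5 * q * q


def grad_neg_log_density(q):
    """Gradient of U(q), written by hand here for the toy target."""
    return q


def leapfrog(q, p, step_size, n_steps):
    """Integrate Hamilton's equations with the leapfrog scheme."""
    p -= 0.5 * step_size * grad_neg_log_density(q)  # initial half step in momentum
    for i in range(n_steps):
        q += step_size * p  # full step in position
        if i < n_steps - 1:
            p -= step_size * grad_neg_log_density(q)  # full step in momentum
    p -= 0.5 * step_size * grad_neg_log_density(q)  # final half step in momentum
    return q, p


def hmc_sample(n_samples, step_size=0.2, n_steps=20, seed=0):
    """Draw samples with fixed-length HMC and a Metropolis accept/reject step."""
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)  # resample the momentum each iteration
        q_new, p_new = leapfrog(q, p0, step_size, n_steps)
        h_old = neg_log_density(q) + 0.5 * p0 * p0
        h_new = neg_log_density(q_new) + 0.5 * p_new * p_new
        # Accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples
```

NUTS replaces the fixed `n_steps` with a trajectory that is extended until it starts to turn back, which removes one of the two tuning parameters that plain HMC requires.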

PS1/MDS Spectroscopic Sample

• Sample of 514 events with spectroscopic follow-up (~150 nights on MMT, Magellan, and Gemini by the Harvard PS1 team)

• 69% Ia, 22% IIP/n, 4% Ib/c, 4% ULSN, TDE, etc., 1% unclassified

[Figure: graphical model of the full hierarchical light-curve model. Node labels and dimensions recoverable from the slide (NCL, NF, and NSN presumably count classes, filters, and supernovae): data Flux[N] and model M[N]; per-object parameters t0, t1, β, τrise, τfall, A, σ, HG [NCL,NF,NSN] and c [NCL,NSN]; per-object hyperparameters t_hSNF and τ_hSNF [NCL,NF,NSN,2], and β_hSNF, A_hSNF, c_hSNF, HG_hSNF, σ_hSNF [NCL,NF,NSN]; per-filter hyperparameters t_hF and τ_hF [NCL,NF,2], and β_hF, A_hF, σ_hF [NCL,NF]; top-level hyperparameters t_h and τ_h [NCL,2], β_h, A_h, HG_h, σ_h [NCL], and c_h; class nodes λ[NSN] and λ[NSN,2].]

• Adapted from the Bazin+ 2009 ‘non-Ia’ lightcurve model

• Required a functional form sufficiently generic to fit a wide array of possible SN lightcurves

• Added a second time parameter and a linear decay function to allow for a plateau effect near and after maximum light

• Additional normal error term to encapsulate intrinsic scatter in flux and allow for overdispersion

• Additional terms for host galaxy information

[Figure: individual fits using PyMC MH (Metropolis-Hastings); panels: Confirmed Type IIP, Confirmed Type Ia]
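As a reference point for the light-curve bullets above, the following sketch evaluates the unmodified Bazin+ 2009 functional form, f(t) = A exp(-(t - t0)/τfall) / (1 + exp(-(t - t0)/τrise)) + B. It does not include the second time parameter, linear decay, error term, or host galaxy terms described on the slide, whose exact forms are not given here. The function and argument names are illustrative, not taken from the authors' code.

```python
import math


def bazin_flux(t, amplitude, t0, tau_rise, tau_fall, baseline=0.0):
    """Flux at time t for the basic Bazin+ 2009 light-curve form.

    amplitude : overall normalization A
    t0        : reference time (not the time of peak flux)
    tau_rise  : rise timescale
    tau_fall  : decline timescale
    baseline  : constant offset B
    """
    x = t - t0
    rise = 1.0 / (1.0 + math.exp(-x / tau_rise))  # sigmoid turn-on
    fall = math.exp(-x / tau_fall)                # exponential decline
    return amplitude * fall * rise + baseline


def bazin_peak_time(t0, tau_rise, tau_fall):
    """Time of maximum flux; defined only when tau_fall > 2 * tau_rise is not required,
    but tau_fall must exceed tau_rise for a maximum to exist."""
    return t0 + tau_rise * math.log(tau_fall / tau_rise - 1.0)
```

Note that t0 is not the epoch of maximum light: setting the derivative to zero gives a peak at t0 + τrise ln(τfall/τrise - 1), which is what `bazin_peak_time` returns. For epochs very far before t0 the exponentials can overflow, so a production version would need a numerically guarded form.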


Hierarchical Bayes

• Model in which parameter dependences can be constructed on multiple structured levels.

• A common example: the one-way normal model, with data yi [N], parameters θ = (θi) [N], and hyperparameters φ = (μ, τ).

• Non-centered parameterizations exchange certain dependences between parameters and hyperparameters for correlations between hyperparameters and the data.

[Figure: graphical models of the one-way normal model. Centered form: nodes φ, θi [N], yi [N]. Non-centered form: nodes φ, ν [N], y [N]. Legend: probabilistic connection, linear connection.]
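The two parameterizations of the one-way normal model can be sketched as follows. In the centered form θi is drawn directly from Normal(μ, τ); in the non-centered form a standardized variable νi is drawn from Normal(0, 1) and θi is then a deterministic linear function of it. The function names and the numerical values used in any call are assumptions for illustration only.

```python
import random


def draw_centered(mu, tau, n, rng):
    """Centered form: theta_i ~ Normal(mu, tau), sampled directly."""
    return [rng.gauss(mu, tau) for _ in range(n)]


def draw_noncentered(mu, tau, n, rng):
    """Non-centered form: nu_i ~ Normal(0, 1), then theta_i = mu + tau * nu_i.

    The sampler works with nu, whose prior does not depend on (mu, tau);
    theta is recovered through a deterministic linear connection.
    """
    nu = [rng.gauss(0.0, 1.0) for _ in range(n)]
    theta = [mu + tau * v for v in nu]
    return nu, theta
```

Both functions describe the same distribution for θ. The difference matters for the sampler: with few or noisy observations per object, sampling ν instead of θ avoids the strongly curved posterior geometry that the centered form produces when τ is small.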

Hierarchical Bayes

• Three-level model, with a top-level hyperparameter for each parameter and 5 mid-level hyperparameters for each filter

• Non-centered parameterization to remove correlations between hyperparameters

• Adopt a normal hyperprior for all location (mean) hyperparameters and a half-Cauchy distribution for all scale (variance) hyperparameters

• May then feed point estimates and test quantities into other machine learning classification algorithms

[Figure: graphical model for τrise. Nodes: τrise_h, τrise_hF [NF], τrise_hSNF [NF,NSN], τrise [NF,NSN], M[N]. Legend: deterministic connection.]
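The hyperprior choice in the bullets above (normal for location hyperparameters, half-Cauchy for scale hyperparameters) can be written as a log-density. This is a minimal sketch; the default prior widths are placeholder assumptions, not values from the slides.

```python
import math


def normal_logpdf(x, loc, scale):
    """Log-density of Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def half_cauchy_logpdf(x, scale):
    """Log-density of a half-Cauchy with the given scale; support is x >= 0."""
    if x < 0.0:
        return -math.inf
    return math.log(2.0) - math.log(math.pi * scale) - math.log1p((x / scale) ** 2)


def log_hyperprior(mu, tau, mu_loc=0.0, mu_scale=10.0, tau_scale=5.0):
    """Joint log-prior for one (location, scale) hyperparameter pair.

    mu_loc, mu_scale, and tau_scale are assumed placeholder widths.
    """
    return normal_logpdf(mu, mu_loc, mu_scale) + half_cauchy_logpdf(tau, tau_scale)
```

The half-Cauchy has heavy tails, so it leaves room for large scales while still placing finite density at zero.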

[Figure: Ia vs. non-Ia AdaBoost classification; panels: without redshift information, with redshift information]
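To illustrate how fitted point estimates could be passed to a boosted classifier, here is a small from-scratch sketch of discrete AdaBoost with decision stumps. It is a generic textbook version, not the authors' pipeline. Labels are +1 (e.g. Ia) and -1 (non-Ia), and each row of `X` would hold hypothetical per-object features such as fitted rise and fall times.

```python
import math


def fit_stump(X, y, w):
    """Find the (error, feature, threshold, polarity) stump with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[j] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def stump_predict(stump, row):
    """Predict +1 or -1 for one feature row with a fitted stump."""
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol


def adaboost_fit(X, y, n_rounds=10):
    """Fit an ensemble of weighted stumps; y must contain only +1 and -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so the log below stays finite for perfect stumps.
        err = min(max(stump[0], 1e-12), 1.0 - 1e-12)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified objects, down-weight correct ones, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

A call such as `adaboost_fit(X, y, n_rounds=10)` followed by `adaboost_predict(ensemble, row)` would classify a new object. Adding a redshift column to `X` is one way to reproduce the with/without-redshift comparison named in the figure.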


Categorical Mixture Model

• Top-level multinomial simplex parameter controlling the classes of supernovae

• Produces the full posterior probability of classification for each event observed.

• Will likely need to set more informative priors on hyperparameters to allow for convergence across the entire simplex space.

[Figure: graphical model for τrise with class structure. Nodes: τrise_h [NCL], τrise_hF [NCL,NF], τrise_hSNF [NCL,NF,NSN], τrise [NCL,NF,NSN], λ[NSN]. Legend: linear connection.]
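The per-event classification posterior in a mixture model follows from Bayes' rule: the probability of class k is proportional to the class weight times the likelihood of the event's data under class k. The sketch below computes this in log space for numerical stability. The function name and any log-likelihood values passed to it are hypothetical; using the spectroscopic sample fractions (69% Ia, 22% IIP/n, and so on) as class weights would be one possible assumption, not something the slides prescribe.

```python
import math


def class_posterior(log_weights, log_likes):
    """Posterior class probabilities from log class weights and log likelihoods.

    log_weights : log of the simplex weights, one per class
    log_likes   : log p(data | class k) for one event, one per class
    """
    log_joint = [lw + ll for lw, ll in zip(log_weights, log_likes)]
    # Log-sum-exp: subtract the maximum before exponentiating to avoid underflow.
    m = max(log_joint)
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint))
    return [math.exp(v - log_norm) for v in log_joint]
```

In the full model the weights and per-class parameters are themselves uncertain, so this calculation would be averaged over posterior draws rather than evaluated once at a point estimate.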

Online predictions and computational scale

• Run the model on the full database and save the marginal posteriors for the hyperparameters

• Use these distributions as priors for new analysis of unclassified events

• A better method would use classified samples as auxiliary data and construct the hierarchical model using both tagged and untagged data

– This may not be computationally feasible on the timescales needed for follow-up observations

• Can fix certain hyperparameters to reduce computational time, e.g. for well-understood classes such as Ia

[Figure: graphical model for τrise with class structure. Nodes: τrise_h [NCL], τrise_hF [NCL,NF], τrise_hSNF [NCL,NF,NSN], τrise [NCL,NF,NSN], λ[NSN]. Legend: linear connection.]

To Do

• Incorporate photo-z information rather than spectroscopic redshifts

• Model K-corrections as a function of photo-z and SN type

• Include host-galaxy information (color, positional offset, etc.)

• Adapt functional light curve model to be more sensitive to SN rise time and shape

– Better constrain early online predictions

• Explore Riemannian manifold HMC techniques

Summary

• Developed a joint probabilistic hierarchical Bayesian model to be fit with the PS1 MD classified SN sample

• Coded the model in Stan, a probabilistic programming language with an HMC sampler, to effectively explore the high-dimensional space

• Classification may be accomplished using a separate predictive model of the fitted parameters, or by developing a full mixture model with simplex parameters

• Stay tuned for first results!
