Pan-STARRS Photometric Classification of Supernovae using a Hierarchical Bayesian Model
George Miller, Edo Berger, Nathan Sanders
Harvard-Smithsonian Center for Astrophysics
PS1 Medium Deep Survey
• 1.8 m f/4.4 telescope with 3.2 degree FOV and 1.6 Gpix camera
• PS1 Sky Surveys: May 2010 – March 2014
• Medium Deep: 10 fields of 7 sq. deg. with a 3-day staggered cadence and grizy filters
• Full reprocessing MD.PV3 underway with full forced photometry
• Currently the best precursor and source of training data for LSST classification tools
– Similar cadence (~3 days) and sensitivity
– Similar grizy band-passes (+u for LSST)
– Wide variety of science drivers and a large distributed collaboration
Implementation
– MISSION: Develop a joint probabilistic model for the populations of different classes of transients and the subsequent observations of individual objects. Fit this model using PS1 classified (~20%) and unclassified (~80%) data, then apply it to classify and infer parameters for new objects, e.g. from LSST.
– COMPUTATIONAL METHODS: Stan
• Stan is a probabilistic programming language, written in C++ with Python and R interfaces.
• Uses the “no U-turn” sampler (NUTS), an adaptive Hamiltonian Monte Carlo (HMC) method that chooses each trajectory's length automatically as it traverses the posterior and tunes the step size during warm-up.
• Incorporates algorithms for automatic differentiation and adaptive refinement of tuning parameters.
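To make the HMC idea concrete, here is a minimal sketch of plain HMC on a one-dimensional Gaussian target. It is not Stan's implementation and it is not NUTS: the step size and number of leapfrog steps are fixed by hand, whereas NUTS chooses the trajectory length adaptively. The function name `hmc_sample`, the target, and all tuning values are assumptions made for illustration.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=20, seed=0):
    """Plain HMC for a 1-D target with a fixed trajectory length."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, alternating full steps,
        # then a closing half momentum step.
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(x_new)
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        # Metropolis correction on the change in total energy.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Hypothetical target: a Gaussian with mean 3 and standard deviation 2.
def log_prob(x):
    return -0.5 * ((x - 3.0) / 2.0) ** 2


def grad_log_prob(x):
    return -(x - 3.0) / 4.0


samples = hmc_sample(log_prob, grad_log_prob, x0=0.0, n_samples=5000)
```

In Stan the gradient comes from automatic differentiation rather than a hand-written `grad_log_prob`, and the step size is tuned during warm-up.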
PS1/MDS Spectroscopic Sample
• Sample of 514 events with spectroscopic follow-up (~150 nights on MMT, Magellan, Gemini by the Harvard PS1 team)
• 69% Ia, 22% IIP/n, 4% Ib/c, 4% ULSN, TDE, etc., 1% unclassified
[Figure: graphical model of the full hierarchical model. Nodes, with array dimensions: light-curve parameters t0, t1, β, τrise, τfall, A, σ, HG [NCL,NF,NSN] and c [NCL,NSN]; hyperparameters at the hSNF [NCL,NF,NSN], hF [NCL,NF], and h [NCL] levels (with an extra dimension of 2 for the paired t and τ hyperparameters); simplex λ [NSN]; M [N]; Flux [N].]
[Figure: individual light-curve fits using PyMC Metropolis-Hastings (MH); panels: confirmed Type IIP, confirmed Type Ia]
• Adapted from the Bazin+ 2009 ‘non-Ia’ lightcurve model
• Required a functional form sufficiently generic to fit a wide array of possible SN lightcurves
• Added a second time parameter and a linear decay function to allow for the plateau effect near and after maximum light
• Additional normal error term to encapsulate intrinsic scatter in flux and allow for overdispersion
• Additional terms for host galaxy information
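As a rough illustration of the functional form discussed above, the sketch below implements the Bazin-style curve as it is commonly written (an exponential decline multiplied by a sigmoid rise), followed by one possible plateau variant. The slides do not give the exact modified expression, so `plateau_flux` is only an assumed guess at how the second time parameter t1 and the linear decay slope β might enter; the error term and host-galaxy terms are left out, and all parameter values are hypothetical.

```python
import math


def bazin_flux(t, A, t0, tau_rise, tau_fall, B=0.0):
    """Bazin-style light curve: sigmoid rise times exponential decline, plus baseline B."""
    sigmoid_rise = 1.0 / (1.0 + math.exp(-(t - t0) / tau_rise))
    return A * math.exp(-(t - t0) / tau_fall) * sigmoid_rise + B


def plateau_flux(t, A, t0, t1, beta, tau_rise, tau_fall):
    """Assumed variant: linear decline with slope beta until t1, exponential decline after."""
    sigmoid_rise = 1.0 / (1.0 + math.exp(-(t - t0) / tau_rise))
    if t < t1:
        shape = 1.0 - beta * (t - t0)
    else:
        # Match the linear segment at t1 so the curve stays continuous.
        shape = (1.0 - beta * (t1 - t0)) * math.exp(-(t - t1) / tau_fall)
    return A * sigmoid_rise * shape


# Hypothetical parameter values, in days and arbitrary flux units.
params = dict(A=1.0, t0=0.0, t1=30.0, beta=0.005, tau_rise=3.0, tau_fall=20.0)
curve = [plateau_flux(float(t), **params) for t in range(-10, 101, 10)]
```

Matching the two branches at t1 keeps the flux continuous, which matters for gradient-based samplers such as HMC.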
Hierarchical Bayes
• Model in which parameter dependences can be constructed on multiple structured levels.
• A common example: the one-way normal model, with parameters θ = (θi) and hyperparameters φ = (μ, τ)
[Figure: graphical model with nodes φ, θi [N], and yi [N]; legend: probabilistic connection]
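Written out, the one-way normal model takes the standard form below. The measurement scatter σi, treated as known, is notation assumed here; it does not appear on the slide.

```latex
\begin{aligned}
y_i \mid \theta_i &\sim \mathcal{N}(\theta_i, \sigma_i^2), \\
\theta_i \mid \mu, \tau &\sim \mathcal{N}(\mu, \tau^2), \qquad i = 1, \dots, N .
\end{aligned}
```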
• Non-centered parameterizations exchange certain dependences between hyperparameters for correlations between hyperparameters and the data
[Figure: non-centered graphical model with nodes φ = (μ, τ), ν [N], and y [N]; legend: probabilistic connection, linear connection]
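A small sketch of the two parameterizations may help. In the centered form, θi is drawn directly from Normal(μ, τ); in the non-centered form, a standard normal νi is drawn and θi is obtained through the linear map θi = μ + τ νi. Both give the same distribution for θi; only the quantities the sampler explores differ. The function names and numerical values are assumptions for illustration.

```python
import random
import statistics


def draw_centered(mu, tau, n, rng):
    """Centered: theta_i ~ Normal(mu, tau), sampled directly."""
    return [rng.gauss(mu, tau) for _ in range(n)]


def draw_noncentered(mu, tau, n, rng):
    """Non-centered: nu_i ~ Normal(0, 1), then theta_i = mu + tau * nu_i."""
    nu = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [mu + tau * v for v in nu]


rng = random.Random(42)
centered = draw_centered(1.5, 0.3, 50_000, rng)
noncentered = draw_noncentered(1.5, 0.3, 50_000, rng)

# The two sets of draws should agree in mean and spread up to Monte Carlo noise.
print(statistics.fmean(centered), statistics.stdev(centered))
print(statistics.fmean(noncentered), statistics.stdev(noncentered))
```

In a sampler, the non-centered form lets ν be explored on a fixed unit scale regardless of the current value of τ.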
[Figure: three-level graphical model for τrise with nodes τrise h, τrise hF [NF], τrise hSNF [NF,NSN], τrise [NF,NSN], and M [N]; legend: probabilistic connection, deterministic connection]
• Three-level model, with a top-level hyperparameter for each parameter and 5 mid-level hyperparameters for each filter
• Non-centered parameterization to remove correlations between hyperparameters
• Adopt a normal hyperprior for all location (mean) hyperparameters and a half-Cauchy distribution for all scale (variance) hyperparameters
• Point estimates and test quantities may then be fed into other machine learning classification algorithms
[Figure: Ia vs. non-Ia AdaBoost classification; panels: without redshift information, with redshift information]
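The three-level structure and hyperpriors described above can be sketched as a single draw from the prior for one light-curve parameter. This is only one plausible reading of the diagram: the number of scale hyperparameters, the unit hyperprior scales, and the names `half_cauchy` and `draw_three_level` are all assumptions, not the authors' model.

```python
import math
import random


def half_cauchy(scale, rng):
    """Draw from a half-Cauchy(0, scale) by folding an inverse-CDF Cauchy draw."""
    return abs(scale * math.tan(math.pi * (rng.random() - 0.5)))


def draw_three_level(n_filters, n_sn, rng):
    """One prior draw: top-level -> per-filter -> per-filter, per-event values."""
    # Top level: normal hyperprior on the location, half-Cauchy on the scales.
    loc_top = rng.gauss(0.0, 1.0)
    scale_filter = half_cauchy(1.0, rng)   # filter-to-filter scatter
    scale_event = half_cauchy(1.0, rng)    # event-to-event scatter
    # Mid level: per-filter locations, non-centered around the top level.
    loc_filter = [loc_top + scale_filter * rng.gauss(0.0, 1.0)
                  for _ in range(n_filters)]
    # Bottom level: per-filter, per-event values, non-centered around each filter.
    return [[loc_filter[f] + scale_event * rng.gauss(0.0, 1.0)
             for _ in range(n_sn)]
            for f in range(n_filters)]


rng = random.Random(1)
values = draw_three_level(n_filters=5, n_sn=3, rng=rng)  # 5 filters as in grizy
```

The half-Cauchy has heavy tails, so occasional very large scales are expected in prior draws like this one.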
Categorical Mixture Model
• Top-level multinomial simplex parameter controlling the classes of supernovae
[Figure: graphical model for τrise in the mixture model with nodes τrise h [NCL], τrise hF [NCL,NF], τrise hSNF [NCL,NF,NSN], τrise [NCL,NF,NSN], and the simplex λ [NSN]; legend: probabilistic connection, linear connection]
• Produces full posterior probability of classification for each event observed.
• Will likely need to set more informative priors on hyperparameters to allow for convergence across the entire simplex space.
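For a single event and fixed parameter values, the classification probabilities follow from the simplex weights and the per-class likelihoods: p(class k | data) is proportional to λk times p(data | class k). The sketch below shows that arithmetic in log space; the weights and log-likelihoods are made-up numbers, and in the full model this calculation would be repeated over posterior draws rather than done once.

```python
import math


def class_posterior(log_weights, log_likelihoods):
    """Normalize lambda_k * p(data | k) over classes using log-sum-exp."""
    log_joint = [lw + ll for lw, ll in zip(log_weights, log_likelihoods)]
    largest = max(log_joint)
    log_norm = largest + math.log(sum(math.exp(v - largest) for v in log_joint))
    return [math.exp(v - log_norm) for v in log_joint]


# Hypothetical simplex weights and per-class log-likelihoods for one event.
lam = [0.7, 0.2, 0.1]
log_lik = [-120.0, -118.0, -125.0]
probs = class_posterior([math.log(w) for w in lam], log_lik)
```

Working in log space avoids underflow when the likelihoods of a full light curve are very small.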
Online predictions and computational scale
• Run model on full database and save marginal posteriors for hyperparameters
• Use these distributions as priors for new analysis of unclassified events
• Better method would use classified samples as auxiliary data and construct the hierarchical model using both tagged and untagged data
• This may not be computationally feasible on timescales needed for follow-up observations
• Can fix certain hyperparameters to reduce computational time, e.g. for well-understood classes such as Ia
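One simple way to reuse saved marginal posteriors as priors is moment matching: summarize the stored draws of each hyperparameter by a mean and standard deviation, and either use those as a normal prior or pin the hyperparameter to its posterior mean. The sketch below assumes that approach; the hyperparameter names and draws are hypothetical, and using marginals in this way discards any correlations between hyperparameters.

```python
import statistics


def normal_prior_from_draws(draws):
    """Moment-match saved posterior draws to a Normal(mean, sd) prior."""
    return statistics.fmean(draws), statistics.stdev(draws)


def build_priors(saved_draws, fixed=()):
    """Map each hyperparameter to a normal prior, or to a fixed value if listed in `fixed`."""
    priors = {}
    for name, draws in saved_draws.items():
        mean, sd = normal_prior_from_draws(draws)
        if name in fixed:
            priors[name] = ("fixed", mean)       # pinned to the posterior mean
        else:
            priors[name] = ("normal", mean, sd)  # informative prior for the new fit
    return priors


# Hypothetical saved draws for two top-level hyperparameters.
saved = {
    "tau_rise_h_Ia": [1.0, 2.0, 3.0],
    "tau_rise_h_IIP": [4.0, 6.0],
}
priors = build_priors(saved, fixed=("tau_rise_h_Ia",))
```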
To Do
• Incorporate photo-z information rather than spectroscopic redshifts
• Model K-corrections as a function of photo-z and SN type
• Include host-galaxy information (color, positional offset, etc.)
• Adapt functional light curve model to be more sensitive to SN rise time and shape
• Better constrain early online predictions
• Explore Riemannian manifold HMC techniques
Summary
• Developed a joint probabilistic Bayesian hierarchical model to be fit with the PS1 MD classified SN sample
• Coded the model in Stan, a probabilistic programming language with an HMC sampler, to explore the high-dimensional parameter space effectively
• Classification may be accomplished with a separate predictive model trained on the fitted parameters, or by developing a full mixture model with simplex parameters
• Stay tuned for first results!