Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525. SAND NO. 2019-3409 C
Using Bayesian Neural Networks for UQ of Hyperspectral Image Target Detection
AUTHORS
Daniel Ries, Braden Smith, Josh Zollweg, Lauren Hund
Outline
1. What is a neural network?
2. What is a Bayesian neural network (BNN)?
3. Estimating BNNs with variational inference
4. Hyperspectral images and Megascene
5. BNN on Megascene
6. Concluding remarks
What is a neural network?
Logistic Regression as a Neural Network
Suppose $Y_1, \ldots, Y_n$ are binary random variables, and $x_1, \ldots, x_n$ are $p$-length vectors.
If we want to estimate $P(Y_i = 1 \mid x_i)$, we could use logistic regression:
$$Y_i \stackrel{iid}{\sim} \text{Bernoulli}(\pi_i), \qquad \log\!\left(\frac{\pi_i}{1 - \pi_i}\right) = x_i'\beta$$
Logistic Regression as a Neural Network
[Diagram: inputs $x_1, \ldots, x_p$ feed a single node that computes $z = \sum_{i=1}^{p} x_i w_i + b$; the activation gives $f(z) = \pi = P(Y = 1 \mid \boldsymbol{x})$, followed by $g(\pi)$.]
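As a concrete illustration, here is a minimal NumPy sketch of the forward pass this diagram describes: a single node forms the linear combination of the inputs and pushes it through the inverse-logit. The data, weights, and names are hypothetical placeholders, not anything from the study.

```python
import numpy as np

def sigmoid(z):
    """Inverse-logit link: maps the linear predictor z to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: n observations, p features.
rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))

# Logistic regression viewed as a single-node network:
# one weight per input plus a bias, then the sigmoid activation.
w = rng.normal(size=p)
b = 0.0
z = X @ w + b      # z_i = x_i' w + b
pi = sigmoid(z)    # pi_i = P(Y_i = 1 | x_i)
```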
A Simple Neural Network
[Diagram: inputs $x_1, \ldots, x_p$ feed two hidden nodes, $z_{11} = \sum_{i=1}^{p} x_i w_{1i} + b_1$ and $z_{12} = \sum_{i=1}^{p} x_i w_{2i} + b_2$, with activations $v_{11} = f_1(z_{11})$ and $v_{12} = f_1(z_{12})$; the output node computes $z_{21} = \sum_{i=1}^{2} v_{1i} w_{3i} + b_3$ and $\pi = f_2(z_{21}) = P(Y = 1 \mid \boldsymbol{x})$, followed by $g(\pi)$.]
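A minimal NumPy sketch of this two-hidden-unit forward pass. The slide leaves the hidden activation $f_1$ generic, so tanh is used here as a placeholder, and all shapes and names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Inverse-logit: f2 on the slide."""
    return 1.0 / (1.0 + np.exp(-z))

def simple_nn_forward(x, W1, b1, w2, b2, f1=np.tanh):
    """Two-hidden-unit forward pass:
    z1 = W1 @ x + b1 gives z_11, z_12; v1 = f1(z1) gives v_11, v_12;
    z2 = w2 . v1 + b2 and pi = sigmoid(z2) = P(Y=1 | x)."""
    z1 = W1 @ x + b1      # two linear combinations of the inputs
    v1 = f1(z1)           # hidden activations
    z2 = w2 @ v1 + b2     # one linear combination of the hidden units
    return sigmoid(z2)

# Hypothetical sizes: p inputs, 2 hidden units, 1 output probability.
rng = np.random.default_rng(1)
p = 5
x = rng.normal(size=p)
pi = simple_nn_forward(x, W1=rng.normal(size=(2, p)), b1=rng.normal(size=2),
                       w2=rng.normal(size=2), b2=rng.normal())
```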
More Complicated Neural Network
[Diagram: a deeper network. Hidden layer 1 has $K_1$ nodes with $z_{1j} = \sum_{i=1}^{p} x_i w_{1ji} + b_{1j}$ and $v_{1j} = f_1(z_{1j})$; hidden layer 2 has $K_2$ nodes with $z_{2j} = \sum_{i=1}^{K_1} v_{1i} w_{2ji} + b_{2j}$; and so on through $L$ layers, ending with $z_L = \sum_i v_{(L-1)i} w_{Li} + b_L$ and $\pi = f(z_L) = P(Y = 1 \mid \boldsymbol{x})$, followed by $g(\pi)$.]
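The same pattern generalizes: each layer applies an affine map followed by an activation. Below is a minimal NumPy sketch of a generic forward pass; the widths $K_1$, $K_2$ and the activation choices are hypothetical placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_forward(x, weights, biases, activations):
    """Generic feed-forward pass: layer l computes v_l = f_l(W_l v_{l-1} + b_l)."""
    v = x
    for W, b, f in zip(weights, biases, activations):
        v = f(W @ v + b)
    return v

# Hypothetical network with hidden widths K1 = 4, K2 = 3 and a sigmoid output.
rng = np.random.default_rng(2)
p, K1, K2 = 5, 4, 3
x = rng.normal(size=p)
weights = [rng.normal(size=(K1, p)), rng.normal(size=(K2, K1)), rng.normal(size=(1, K2))]
biases = [rng.normal(size=K1), rng.normal(size=K2), rng.normal(size=1)]
pi = nn_forward(x, weights, biases, activations=[np.tanh, np.tanh, sigmoid])  # shape (1,)
```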
What is a Bayesian Neural Network?
For those of you who “do machine learning, not statistics”: BNNs are simply NNs that have an implied loss function (you don’t have to choose, and you get UQ for free!)
For those of you who “do statistics, not machine learning”: BNNs are simply a highly nonlinear model where you put a prior on your parameters (and a Bernoulli likelihood for our binary data)
For those of you who “do both”: You’re not fooling anyone, I know you associate with one or the other.
For those of you who “do neither”: You are already “resting your eyes.”
An Example BNN
$$Y_i \sim \text{Bernoulli}(\pi_i)$$
$$\pi_i = f_2(z_{21}) = f_2\!\left(\sum_{j=1}^{2} f_1\!\left(\sum_{k=1}^{p} x_k w_{jk} + b_j\right) w_{2j} + b_3\right)$$
$$w_{jk} \stackrel{iid}{\sim} \mathcal{N}(0, 1)\ \forall j, k, \qquad b_l \stackrel{iid}{\sim} \mathcal{N}(0, 1),\ l = 1, 2, 3$$
[Diagram: the same two-hidden-node network as on the “A Simple Neural Network” slide, now with priors on its weights and biases.]
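A minimal NumPy sketch of what these priors imply: sampling every weight and bias from $\mathcal{N}(0,1)$ and pushing an input through the network gives Monte Carlo draws of $\pi$ (here, draws from the prior predictive). The function name, the tanh choice for $f_1$, and the example input are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bnn_prior_predictive(x, n_draws=1000, f1=np.tanh, seed=0):
    """Monte Carlo draws of pi = P(Y=1 | x) under the N(0,1) priors above.

    Each draw samples every weight w_jk, w_2j and bias b_l from N(0,1),
    runs the two-hidden-unit forward pass, and records the resulting pi."""
    rng = np.random.default_rng(seed)
    p = x.shape[0]
    pis = np.empty(n_draws)
    for s in range(n_draws):
        W1 = rng.normal(size=(2, p))   # w_jk
        b1 = rng.normal(size=2)        # b_1, b_2
        w2 = rng.normal(size=2)        # w_2j
        b3 = rng.normal()              # b_3
        pis[s] = sigmoid(w2 @ f1(W1 @ x + b1) + b3)
    return pis

pis = bnn_prior_predictive(np.random.default_rng(3).normal(size=5))
print(pis.mean(), np.quantile(pis, [0.1, 0.9]))  # prior mean and an 80% interval
```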
Estimating (or Training) BNNs
Bayesians love to sample their posterior distributions via MCMC.
But this can be slowwww.
Enter variational inference:
• Instead of sampling, approximate the posterior with $q(\theta)$ by solving
$$\operatorname*{arg\,min}_{q^*} \mathrm{KL}\!\left(q^*(\theta) \,\|\, p(\mathbf{w}, \mathbf{b} \mid y)\right) = \operatorname*{arg\,min}_{q^*} \mathrm{KL}\!\left(q^*(\theta) \,\|\, p(\mathbf{w}, \mathbf{b}, y)\right)$$
• Put restrictions on the set of candidate distributions $q^*$
• The result is a loss function on which we can do gradient descent as usual (see the sketch below)
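A minimal sketch of that loss, assuming the restriction to a mean-field Gaussian $q(\theta)$ with $\mathcal{N}(0,1)$ priors: the KL term then has a closed form, and the expected log-likelihood is approximated with reparameterized Monte Carlo draws. The names (`neg_elbo`, `forward`, `mu`, `log_sigma`) are hypothetical, `forward` stands in for a BNN forward pass mapping a flattened parameter vector to per-observation probabilities, and in practice this loss is minimized by gradient descent in TensorFlow/Edward rather than evaluated by hand in NumPy.

```python
import numpy as np

def neg_elbo(mu, log_sigma, X, y, forward, n_mc=10, seed=0):
    """Monte Carlo estimate of the variational loss (negative ELBO) for a
    mean-field Gaussian q(theta) = N(mu, diag(sigma^2)) and N(0,1) priors:

        loss = KL(q || prior) - E_q[ log p(y | theta, X) ],

    with the expectation approximated via reparameterized draws
    theta = mu + sigma * eps, eps ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    sigma = np.exp(log_sigma)
    # Closed-form KL between N(mu, sigma^2) and the N(0,1) prior, summed over parameters.
    kl = np.sum(0.5 * (mu**2 + sigma**2 - 1.0) - log_sigma)
    # Monte Carlo estimate of the expected Bernoulli log-likelihood.
    ell = 0.0
    for _ in range(n_mc):
        theta = mu + sigma * rng.normal(size=mu.shape)
        pi = np.clip(forward(X, theta), 1e-7, 1 - 1e-7)
        ell += np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi)) / n_mc
    return kl - ell
```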
Hyperspectral Images (HSI)
• Airborne spectrometers construct an $(x, y, z)$ tensor (see the sketch below)
• $x$ and $y$ describe the spatial dimensions
• $z$ describes the spectrum at a single pixel $(x, y)$
• HSI target detection identifies specific materials by their reflectance
• The target object might be smaller than the projection of a pixel on the ground
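A small NumPy sketch of how such a cube is typically handled for pixelwise detection; the dimensions and names here are hypothetical, not Megascene's.

```python
import numpy as np

# Hypothetical HSI cube: x and y are spatial, z indexes the spectral bands.
height, width, bands = 200, 300, 128
cube = np.zeros((height, width, bands), dtype=np.float32)

# The spectrum at one pixel (x, y) is a single length-`bands` vector;
# pixelwise target detection treats each such vector as one model input.
spectrum = cube[50, 75, :]
assert spectrum.shape == (bands,)

# Flatten to a (pixels, bands) design matrix for pixelwise classification.
X = cube.reshape(-1, bands)
```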
Megascene
• Simulated HSI scene using the DIRSIG model
• Targets inserted using paint reflectance and bidirectional reflectance distribution function data from NEFDS
• 3 environments and 3 times of day, for 9 total scenes
• 125 targets placed randomly throughout each scene
• Targets range from large to subpixel (radius of 0.1 m to 4 m), with sizes distributed nearly uniformly
• Some targets only partially visible
BNN on Megascene
• Architecture: Input-10-5-2-Output (sketched below)
• Activations: Hyperbolic tangent between hidden layers, inverse logit for output
• Priors: Standard normal on all weights and biases
• Optimizer: Stochastic gradient descent with momentum
• Training data: Left half of the mid-latitude summer environment at 1200
• Test data: Right half of all environments at all times
• Software: Edward and TensorFlow
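For orientation, a hedged Keras sketch of a non-Bayesian analogue of this configuration, mirroring only the layer widths, activations, and optimizer family; the actual model placed the $\mathcal{N}(0,1)$ priors above on all weights and biases and was fit by variational inference in Edward. The band count, learning rate, and momentum value are hypothetical.

```python
import tensorflow as tf

# Non-Bayesian analogue of the Input-10-5-2-Output architecture.
bands = 128  # hypothetical number of spectral bands
model = tf.keras.Sequential([
    tf.keras.Input(shape=(bands,)),
    tf.keras.layers.Dense(10, activation='tanh'),
    tf.keras.layers.Dense(5, activation='tanh'),
    tf.keras.layers.Dense(2, activation='tanh'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # inverse-logit output
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
              loss='binary_crossentropy')
```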
Test Set Performance
[Result panels: full test set and high confidence set.]
High confidence set: 80% credible interval for $\pi \in (0, 0.2) \cup (0.8, 1)$
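A minimal NumPy sketch of how such a high-confidence subset could be formed from posterior draws of $\pi$; the array name and shape are hypothetical.

```python
import numpy as np

def high_confidence_mask(pi_draws, level=0.80):
    """Flag pixels whose `level` credible interval for pi lies entirely in
    (0, 0.2) or entirely in (0.8, 1), i.e. confidently background or target.

    `pi_draws` holds posterior draws of pi, shape (n_draws, n_pixels)."""
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(pi_draws, [alpha, 1.0 - alpha], axis=0)
    return (hi < 0.2) | (lo > 0.8)
```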
Test Set Performance
Moving Forward
There is still a lot of work to be done on BNNs
• Explore deeper and wider, more complicated (convolutional) architectures
• Full tuning of hyperparameters
• Explore effects of the prior
• Assess coverage of credible intervals
Thank you for listening!
Test Set Performance
[Figure grid: rows for the BNN and a standard NN; columns for the full test set and the high confidence set.]
Test Set Performance
Abundance vs Probability