Applied Bayesian Inference with PyMC

@MrSantoni

Which color will sell more?

[Figure: Page A and Page B, two versions of a product page selling a tea pot, differing in color]

Page A: #buy / N
Page B: #buy / N

• What if N is small?
• How large must N be to have 90% confidence?
• What if N is different on A and B?

Bayesian Inference

Probability:

Claim: we think Bayesian

Frequentist: frequency
Bayesian: belief

[Figure: no-bugs confidence after test 1, test 2, test 3]

Bayesian Inference = update your beliefs

prior belief + new evidence -> updated belief

The Developer View

Statistical Problem

def frequentist(): return 80%

def bayesian(): return [distribution over 0% .. 100%]

How to?

[Figure: distribution over 0% .. 100%]

Closed-form solution:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
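Bayes' theorem can be applied by hand when the problem is small. The sketch below ties the formula to the no-bugs confidence picture: A is "the code has no bugs" and B is "a test passes". The prior of 0.2 and the 0.5 chance that buggy code still passes a test are illustrative assumptions, not values from the talk, and `bayes_update` is a hypothetical helper name.

```python
def bayes_update(prior, p_pass_given_no_bugs=1.0, p_pass_given_bugs=0.5):
    """Return P(no bugs | test passed) via Bayes' theorem.

    The default likelihoods are illustrative assumptions.
    """
    # P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = p_pass_given_no_bugs * prior + p_pass_given_bugs * (1.0 - prior)
    # P(A|B) = P(B|A) P(A) / P(B)
    return p_pass_given_no_bugs * prior / evidence


belief = 0.2  # assumed prior belief that the code has no bugs
history = [belief]
for test in (1, 2, 3):
    belief = bayes_update(belief)  # each passed test is new evidence
    history.append(belief)
    print('after test %d: P(no bugs) = %.3f' % (test, belief))
```

Each posterior becomes the prior for the next test, so the belief rises with every passing test without ever reaching certainty.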

Realistic Cases

Toy Examples


• Perform Bayesian Inference
• Markov Chain Monte Carlo techniques
• A.k.a. Probabilistic Programming

Show me the code!

Example A/B test

Only one difference between A and B

[Figure: Page A and Page B side by side, each showing a tea pot]

Assume there is
– p_a: probability of clicking BUY when landing on A
– p_b: probability of clicking BUY when landing on B

How to compute p_a and p_b?

Page A
– N_a visitors
– C_a BUY-clicks on page A

Page B
– N_b visitors
– C_b BUY-clicks on page B

Frequentist: C_a / N_a

BUT: Observed frequency does not necessarily equal p_a

Bayesian: Infer true frequency from observed data
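To see why the observed frequency C_a / N_a need not equal p_a, a small simulation helps. This is a minimal sketch using only the standard library; the value 0.05 for the true probability and the helper name `observed_frequency` are assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

p_a = 0.05  # assumed "true" click probability, unknown in practice


def observed_frequency(p, n_visitors):
    """Simulate n_visitors Bernoulli(p) visits and return C / N."""
    clicks = sum(1 for _ in range(n_visitors) if random.random() < p)
    return clicks / float(n_visitors)


for n in (10, 100, 1000, 100000):
    print('N = %6d  C/N = %.4f' % (n, observed_frequency(p_a, n)))
```

With few visitors the ratio can land far from 0.05; it only settles near the true value as N grows.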


Bayesian Workflow

1. Define prior
2. Fit to observations
3. Get posteriors

import numpy as np
from pymc import Uniform, rbernoulli, Bernoulli, MCMC
from matplotlib import pyplot as plt

# Simulate N visits to page A with a known click probability
p_A_true = 0.05
N = 1500
occurrences = rbernoulli(p_A_true, N)

print 'Click-BUY:'
print occurrences.sum()
print 'Observed frequency:'
print occurrences.sum() / float(N)

Click-BUY:
68
Observed frequency:
0.0453333333333

Clicking BUY

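Bernoulli distribution

$$P(\mathrm{click}) = \begin{cases} p & \mathrm{click} = 1 \\ 1 - p & \mathrm{click} = 0 \end{cases}$$

[Figure: bar chart of the Bernoulli probabilities for click=1 and click=0]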

p_A = Uniform('p_A', lower=0, upper=1)

[Figure: uniform prior of p_A over 0 to 1]

print p_A.random()
print p_A.value

array(0.906086144982998)
array(0.906086144982998)

print p_A.random()
print p_A.value

array(0.285313846133313)
array(0.285313846133313)

p_A = Uniform('p_A', lower=0, upper=1)

obs = Bernoulli('obs', p_A, value=occurrences, observed=True)

mcmc = MCMC([p_A, obs])
mcmc.sample(20000, 1000)  # 20000 iterations, the first 1000 discarded as burn-in

print mcmc.trace('p_A')[:]

[------- 20% ] 4053 of 20000 complete in 0.5 sec
[------------- 36% ] 7315 of 20000 complete in 1.0 sec
[-----------------53% ] 10627 of 20000 complete in 1.5 sec
[-----------------69%------ ] 13939 of 20000 complete in 2.0 sec
[-----------------81%----------- ] 16376 of 20000 complete in 2.5 sec
[-----------------96%---------------- ] 19342 of 20000 complete in 3.0 sec
[-----------------100%-----------------] 20000 of 20000 complete in 3.1 sec
[ 0.04656576 0.04656576 0.04656576 ..., 0.03803667 0.03803667 0.03803667]

plt.figure(figsize=(8, 7))
plt.hist(mcmc.trace('p_A')[:], bins=35, histtype='stepfilled', normed=True)
plt.xlabel('Probability of clicking BUY')
plt.ylabel('Density')
plt.vlines(p_A_true, 0, 90, linestyle='--', label='True p_A')
plt.legend()
plt.savefig('p_A_hist_N_%s.png' % N)
plt.show()

Between which X and Y does p_A lie with 90% confidence?

p_A_samples = mcmc.trace('p_A')[:]
lower_bound = np.percentile(p_A_samples, 5)
upper_bound = np.percentile(p_A_samples, 95)

print 'There is 90%% probability that p_A is between %s and %s' % (lower_bound, upper_bound)

There is 90% probability that p_A is between 0.0373019596856 and 0.0548052806892
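As a cross-check on the MCMC interval, this particular model also has a closed-form posterior. A Uniform(0, 1) prior is a Beta(1, 1), and with Bernoulli observations the posterior is Beta(1 + C, 1 + N - C); this is a standard conjugacy result rather than something the talk relies on. The sketch below draws from that posterior with the standard library, and the `percentile` helper is a simplified stand-in for `np.percentile`.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

N = 1500  # visitors, as in the run above
C = 68    # BUY clicks observed in that run

# Uniform(0, 1) prior is Beta(1, 1), so the posterior is Beta(1 + C, 1 + N - C).
alpha = 1 + C
beta = 1 + N - C

samples = sorted(random.betavariate(alpha, beta) for _ in range(20000))


def percentile(sorted_values, q):
    """Simple percentile by index into an already sorted list (q in 0..100)."""
    index = int(round(q / 100.0 * (len(sorted_values) - 1)))
    return sorted_values[index]


lower_bound = percentile(samples, 5)
upper_bound = percentile(samples, 95)
print('90%% interval for p_A: %.4f to %.4f' % (lower_bound, upper_bound))
```

The bounds should land close to the MCMC ones, which is a useful sanity check before moving to models with no closed form.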

What if N_a is lower?

import numpy as np
from pymc import Uniform, rbernoulli, Bernoulli, MCMC
from matplotlib import pyplot as plt

# Same simulation, now with only 50 visitors
p_A_true = 0.05
N = 50
occurrences = rbernoulli(p_A_true, N)

print 'Click-BUY:'
print occurrences.sum()
print 'Observed frequency:'
print occurrences.sum() / float(N)

Click-BUY:
2
Observed frequency:
0.04

mcmc = MCMC([p_A, obs])
mcmc.sample(20000, 1000)

print mcmc.trace('p_A')[:]

[----- 14% ] 2874 of 20000 complete in 0.5 sec
[----------- 30% ] 6035 of 20000 complete in 1.0 sec
[-----------------47% ] 9440 of 20000 complete in 1.5 sec
[-----------------63%---- ] 12775 of 20000 complete in 2.0 sec
[-----------------81%---------- ] 16203 of 20000 complete in 2.5 sec
[-----------------100%-----------------] 20000 of 20000 complete in 3.0 sec
[ 0.06240723 0.06240723 0.06240723 ..., 0.01864419 0.01864419 0.01864419]

plt.figure(figsize=(8, 7))
plt.hist(mcmc.trace('p_A')[:], bins=35, histtype='stepfilled', normed=True)
plt.xlabel('Probability of clicking BUY')
plt.ylabel('Density')
plt.vlines(p_A_true, 0, 90, linestyle='--', label='True p_A')
plt.legend()
plt.savefig('p_A_hist_N_%s.png' % N)
plt.show()

Between which X and Y does p_A lie with 90% confidence?

p_A_samples = mcmc.trace('p_A')[:]
lower_bound = np.percentile(p_A_samples, 5)
upper_bound = np.percentile(p_A_samples, 95)

print 'There is 90%% probability that p_A is between %s and %s' % (lower_bound, upper_bound)

There is 90% probability that p_A is between 0.0160966147705 and 0.114655284797

[Figure: posterior histograms of p_A for N_a = 1500 and N_a = 50]
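The effect of N on the width of the posterior can also be reproduced without MCMC by evaluating prior times likelihood on a grid of candidate values of p. This is a sketch of a grid approximation, a different technique from the one PyMC uses; `grid_posterior` and `interval_90` are hypothetical helper names, and the click counts are the ones printed in the two runs above.

```python
import math


def grid_posterior(clicks, visitors, grid_size=2000):
    """Posterior of p on a grid, for a Uniform(0, 1) prior and Bernoulli data."""
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    # log-likelihood avoids underflow for large N; the flat prior adds a constant
    log_post = [clicks * math.log(p) + (visitors - clicks) * math.log(1.0 - p)
                for p in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


def interval_90(grid, probs):
    """Return the 5th and 95th percentiles of the gridded posterior."""
    cumulative, lower, upper = 0.0, None, None
    for p, w in zip(grid, probs):
        cumulative += w
        if lower is None and cumulative >= 0.05:
            lower = p
        if upper is None and cumulative >= 0.95:
            upper = p
    return lower, upper


# Click counts reported in the two runs above: 68 of 1500 and 2 of 50.
for clicks, visitors in ((68, 1500), (2, 50)):
    low, high = interval_90(*grid_posterior(clicks, visitors))
    print('N_a = %4d  90%% interval: %.4f to %.4f  (width %.4f)'
          % (visitors, low, high, high - low))
```

A grid works here because there is a single unknown; with several parameters the grid grows too quickly, which is where sampling methods become attractive.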

Does the red have a larger probability of being clicked?

[Figure: Page A and Page B]

from pymc import Uniform, rbernoulli, Bernoulli, MCMC, deterministic
from matplotlib import pyplot as plt

p_A_true = 0.05
p_B_true = 0.04
N_A = 1500
N_B = 750

occurrences_A = rbernoulli(p_A_true, N_A)
occurrences_B = rbernoulli(p_B_true, N_B)

print 'Observed frequency:'
print 'A'
print occurrences_A.sum() / float(N_A)
print 'B'
print occurrences_B.sum() / float(N_B)

Observed frequency:
A
0.0533333333333
B
0.0413333333333

p_A = Uniform('p_A', lower=0, upper=1)
p_B = Uniform('p_B', lower=0, upper=1)

@deterministic
def delta(p_A=p_A, p_B=p_B):
    return p_A - p_B

obs_A = Bernoulli('obs_A', p_A, value=occurrences_A, observed=True)
obs_B = Bernoulli('obs_B', p_B, value=occurrences_B, observed=True)

mcmc = MCMC([p_A, p_B, obs_A, obs_B, delta])
mcmc.sample(25000, 5000)

[----- 14% ] 3561 of 25000 complete in 0.5 sec
[--------- 25% ] 6332 of 25000 complete in 1.0 sec
[------------ 33% ] 8454 of 25000 complete in 1.5 sec
[--------------- 41% ] 10499 of 25000 complete in 2.0 sec
[-----------------50% ] 12602 of 25000 complete in 2.5 sec
[-----------------59%-- ] 14780 of 25000 complete in 3.0 sec
[-----------------67%----- ] 16883 of 25000 complete in 3.5 sec
[-----------------75%-------- ] 18954 of 25000 complete in 4.0 sec
[-----------------83%----------- ] 20877 of 25000 complete in 4.5 sec
[-----------------91%-------------- ] 22924 of 25000 complete in 5.0 sec
[-----------------100%-----------------] 25000 of 25000 complete in 5.5 sec

p_A_samples = mcmc.trace('p_A')[:]
p_B_samples = mcmc.trace('p_B')[:]
delta_samples = mcmc.trace('delta')[:]

plt.subplot(3,1,1)
plt.xlim(0, 0.1)
plt.hist(p_A_samples, bins=35, histtype='stepfilled', normed=True, color='blue', label='Posterior of p_A')
plt.vlines(p_A_true, 0, 90, linestyle='--', label='True p_A (unknown)')
plt.xlabel('Probability of clicking BUY via A')
plt.legend()

plt.subplot(3,1,2)
plt.xlim(0, 0.1)
plt.hist(p_B_samples, bins=35, histtype='stepfilled', normed=True, color='green', label='Posterior of p_B')
plt.vlines(p_B_true, 0, 90, linestyle='--', label='True p_B (unknown)')
plt.xlabel('Probability of clicking BUY via B')
plt.legend()

plt.subplot(3,1,3)
plt.xlim(0, 0.1)
plt.hist(delta_samples, bins=35, histtype='stepfilled', normed=True, color='red', label='Posterior of delta')
plt.vlines(p_A_true - p_B_true, 0, 90, linestyle='--', label='True delta (unknown)')
plt.xlabel('p_A - p_B')
plt.legend()

plt.savefig('A_and_B.png')
plt.show()

p_A > p_B
How confident are we?

print 'Probability that p_A > p_B:'
print (delta_samples > 0).mean()

Probability that p_A > p_B:
0.8919
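The same question can be approached without MCMC, because each page's posterior under a uniform prior is a Beta distribution (a standard conjugacy result, used here as an assumption rather than the talk's method). The sketch below draws p_A and p_B from their Beta posteriors and counts how often p_A is larger. The click counts are the ones implied by the observed frequencies above.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Counts implied by the observed frequencies above:
# 0.0533... * 1500 = 80 clicks on A, 0.0413... * 750 = 31 clicks on B.
N_A, C_A = 1500, 80
N_B, C_B = 750, 31

n_samples = 50000
wins = 0
for _ in range(n_samples):
    p_a = random.betavariate(1 + C_A, 1 + N_A - C_A)  # draw from posterior of p_A
    p_b = random.betavariate(1 + C_B, 1 + N_B - C_B)  # draw from posterior of p_B
    if p_a - p_b > 0:  # same event as delta > 0
        wins += 1

prob_a_better = wins / float(n_samples)
print('Probability that p_A > p_B: %.3f' % prob_a_better)
```

The fraction of draws with p_A above p_B plays the same role as `(delta_samples > 0).mean()` in the PyMC version.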

N_A = 1500, N_B = 750

N_A = 1500, N_B = 200

print 'Probability that p_A > p_B:'
print (delta_samples > 0).mean()

Probability that p_A > p_B:
0.73455

mcmc = MCMC([p_A, p_B, obs_A, obs_B, delta])
mcmc.sample(25000, 5000)

Posterior P(p_A, p_B, delta | obs_A, obs_B) as samples

25000 iterations, 5000 burn-in

Metropolis-Hastings algorithm
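To give a feel for what happens inside the sampler, here is a minimal random-walk Metropolis sketch for the single-page model (uniform prior, Bernoulli clicks). It is an illustration written with the standard library only, not PyMC's implementation, whose step methods are more elaborate. The step size, starting point and function names are assumptions.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is repeatable

N, C = 1500, 68  # visitors and BUY clicks from the first run above


def log_posterior(p):
    """Unnormalised log posterior: flat prior on (0, 1) times Bernoulli likelihood."""
    if p <= 0.0 or p >= 1.0:
        return float('-inf')  # outside the support of the prior
    return C * math.log(p) + (N - C) * math.log(1.0 - p)


def metropolis(n_iter, burn, step=0.01, start=0.5):
    """Random-walk Metropolis sampler; returns the samples kept after burn-in."""
    current = start
    current_lp = log_posterior(current)
    trace = []
    for i in range(n_iter):
        proposal = current + random.gauss(0.0, step)  # symmetric proposal
        proposal_lp = log_posterior(proposal)
        # accept with probability min(1, posterior ratio)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if random.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= burn:
            trace.append(current)  # a rejected proposal repeats the current value
    return trace


trace = metropolis(20000, 1000)
print('posterior mean of p_A: %.4f' % (sum(trace) / len(trace)))
```

Rejected proposals leave the chain where it is, which is why runs of identical values appear in a printed trace. The burn-in discards the early iterations in which the chain is still travelling from its starting point to the bulk of the posterior.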

Open the black box

mcmc = MCMC([p_A, p_B, obs_A, obs_B, delta])
mcmc.sample(25000, 5000)

from pymc.Matplot import plot as mcplot

mcplot(mcmc)

• Easy to interpret results
  – confidence, no p-values!
• No crazy math
• Computationally expensive

Thank you

@MrSantoni
marcosantoni@hotmail.it

Serie A 13/14

Date        HomeTeam    AwayTeam    FTHG  FTAG  FTR  HTHG  HTAG  HTR
24/08/2013  Sampdoria   Juventus    0     1     A    0     0     D
24/08/2013  Verona      Milan       2     1     H    1     1     D
25/08/2013  Cagliari    Atalanta    2     1     H    1     1     D
25/08/2013  Inter       Genoa       2     0     H    0     0     D
25/08/2013  Lazio       Udinese     2     1     H    2     0     H
25/08/2013  Livorno     Roma        0     2     A    0     0     D
25/08/2013  Napoli      Bologna     3     0     H    2     0     H
25/08/2013  Parma       Chievo      0     0     D    0     0     D
25/08/2013  Torino      Sassuolo    2     0     H    1     0     H
26/08/2013  Fiorentina  Catania     2     1     H    2     1     H
31/08/2013  Chievo      Napoli      2     4     A    2     2     D
31/08/2013  Juventus    Lazio       4     1     H    2     1     H
01/09/2013  Atalanta    Torino      2     0     H    0     0     D
01/09/2013  Bologna     Sampdoria   2     2     D    1     1     D
01/09/2013  Catania     Inter       0     3     A    0     1     A
01/09/2013  Genoa       Fiorentina  2     5     A    0     3     A
01/09/2013  Milan       Cagliari    3     1     H    2     1     H
01/09/2013  Roma        Verona      3     0     H    0     0     D
01/09/2013  Sassuolo    Livorno     1     4     A    0     1     A
01/09/2013  Udinese     Parma       3     1     H    1     0     H
14/09/2013  Inter       Juventus    1     1     D    0     0     D
14/09/2013  Napoli      Atalanta    2     0     H    0     0     D
14/09/2013  Torino      Milan       2     2     D    0     0     D
15/09/2013  Fiorentina  Cagliari    1     1     D    0     0     D

https://datahub.io/dataset/italian-football-data-serie-a-b

Win-rate

Did it change?

Bayesian Workflow

1. Define Prior
2. Fit to observations
3. Get Posteriors

Winning a Match

Bernoulli distribution

$$P(w) = \begin{cases} p & w = 1 \\ 1 - p & w = 0 \end{cases}$$

[Figure: bar chart of the Bernoulli probabilities for Win (w=1) and Lose (w=0)]

𝑝 : switchpoint?

Model the switchpoint

$$p = \begin{cases} p_1 & t < \tau \\ p_2 & t \geq \tau \end{cases}$$

Goal -> infer $p_1$, $p_2$, $\tau$

Bayesian Workflow

1. Define Prior
2. Fit to observations
3. Get Posteriors

Let’s model this

• Goal: infer the unknown p1, p2, TAU
• First step of Bayesian inference: assign a prior probability to the different possible values of p
• What would be a good prior for p1, p2? Use uniform:
  – p1 ~ Uniform(0,1)
  – p2 ~ Uniform(0,1)
  – TAU ~ DiscreteUniform(1, 38)
    • P(TAU=k)=1/38 for all k
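One way to get a feel for these priors is to run the model forwards: draw p1, p2 and TAU from their priors and simulate a season. This is a standard-library sketch of the generative story only, with no inference; the variable names are illustrative, and the convention that p1 applies to matches with index below TAU is an assumption.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

NUM_MATCHES = 38  # matches in the season, matching DiscreteUniform(1, 38)

# Draw one set of parameters from the priors listed above.
p1 = random.uniform(0.0, 1.0)         # p1 ~ Uniform(0, 1)
p2 = random.uniform(0.0, 1.0)         # p2 ~ Uniform(0, 1)
tau = random.randint(1, NUM_MATCHES)  # TAU ~ DiscreteUniform(1, 38), both ends included

# Win probability per match: p1 before the switchpoint, p2 from it onwards.
p_per_match = [p1 if t < tau else p2 for t in range(NUM_MATCHES)]

# Simulate one season of wins (1) and non-wins (0).
wins = [1 if random.random() < p else 0 for p in p_per_match]

print('p1 = %.2f, p2 = %.2f, tau = %d' % (p1, p2, tau))
print(wins)
```

Inference runs this story in reverse: given the observed sequence of wins, which values of p1, p2 and TAU are plausible?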

import numpy as np
from pymc import Uniform, DiscreteUniform, deterministic, Bernoulli, Model, MCMC

p_1 = Uniform('p_1', lower=0, upper=1)
p_2 = Uniform('p_2', lower=0, upper=1)
tau = DiscreteUniform('tau', lower=1, upper=38)

print 'Random output: ', tau.random(), tau.random(), tau.random()

Random output: 14 24 33

@deterministic
def p_(tau=tau, p_1=p_1, p_2=p_2, num_matches=38):
    # concatenate p_1 and p_2 based on tau
    out = np.empty(num_matches)
    out[:tau] = p_1   # matches before the switchpoint
    out[tau:] = p_2   # matches from the switchpoint onwards
    return out

Load Data

import pandas as pd

df = pd.read_csv('serie_a.csv', parse_dates=['Date'], date_parser=parse_date)

matches = df[(df.HomeTeam == 'Milan') | (df.AwayTeam == 'Milan')]
matches = matches.set_index(['Date'])
matches = compute_extra_columns(matches, team)
# some pandas manipulations occur here
matches['Win'] = …  # 1 if Milan won, 0 otherwise
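The talk leaves out how the Win column is computed. One plausible way, sketched here on a few rows copied from the Serie A table, is to combine the full-time result column FTR (H for a home win, A for an away win, D for a draw) with the side Milan played on. The helper `team_win` is hypothetical and uses the standard library instead of pandas; counting draws as 0 follows the "1 if Milan won, 0 otherwise" comment.

```python
import csv
import io

# A few rows copied from the Serie A 13/14 table shown earlier.
SAMPLE = """Date,HomeTeam,AwayTeam,FTHG,FTAG,FTR
24/08/2013,Verona,Milan,2,1,H
01/09/2013,Milan,Cagliari,3,1,H
14/09/2013,Torino,Milan,2,2,D
15/09/2013,Fiorentina,Cagliari,1,1,D
"""


def team_win(row, team):
    """Return 1 if `team` won the match, 0 otherwise (draws count as 0)."""
    if row['HomeTeam'] == team:
        return 1 if row['FTR'] == 'H' else 0
    if row['AwayTeam'] == team:
        return 1 if row['FTR'] == 'A' else 0
    raise ValueError('%s did not play in this match' % team)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
# Keep only the matches Milan played, home or away.
milan = [r for r in rows if 'Milan' in (r['HomeTeam'], r['AwayTeam'])]
win = [team_win(r, 'Milan') for r in milan]
print(win)  # [0, 1, 0]
```

The resulting list of 0s and 1s is the kind of sequence the Bernoulli observation in the model expects.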

Fit the Model

observed_matches = Bernoulli('obs', p=p_, value=matches[['Win']], observed=True)

model = Model([observed_matches, p_1, p_2, tau])
mcmc = MCMC(model)
mcmc.sample(40000, 10000)

p_1_samples = mcmc.trace('p_1')[:]
p_2_samples = mcmc.trace('p_2')[:]
tau_samples = mcmc.trace('tau')[:]

print p_1_samples[:10]
print p_2_samples[:10]
print tau_samples[:10]

[ 0.42067236 0.42067236 0.42067236 0.43900391 0.43900391 0.43900391 0.43900391 0.43900391 0.43900391 0.43900391]
[ 0.49213381 0.49213381 0.49213381 0.56072562 0.79863176 0.79863176 0.67416932 0.68382528 0.6069458 0.60062698]
[10 10 24 35 35 35 35 27 27 27]

plt.figure(figsize=(14.5, 10))

ax = plt.subplot(311)
ax.set_autoscaley_on(False)
plt.hist(p_1_samples, histtype='stepfilled', alpha=0.85, label='posterior of p_1', color='#A60628', normed=True, bins=30)
plt.legend(loc='upper left')

ax = plt.subplot(312)
plt.hist(p_2_samples, histtype='stepfilled', alpha=0.85, label='posterior of p_2', color='#7A68A6', normed=True, bins=30)
plt.legend(loc='upper left')

ax = plt.subplot(313)
plt.hist(tau_samples, histtype='stepfilled', alpha=0.85, label='posterior of tau', color='#467821', normed=True, bins=30)
plt.legend(loc='upper left')

plt.show()

Expected Win Probability

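num_matches = 38
N = tau_samples.shape[0]
expected_p_per_match = np.zeros(num_matches)
for match in range(num_matches):
    # True for the posterior samples in which this match falls before the switchpoint
    ix = match < tau_samples
    # use p_1 where the match is before tau, p_2 otherwise
    p_samples_match = np.concatenate([p_1_samples[ix], p_2_samples[~ix]])
    # the median of the samples is used as the expected win probability
    expected_p_per_match[match] = np.percentile(p_samples_match, 50)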

Compute Confidence Bounds

lower_p_per_match = np.zeros(num_matches)
upper_p_per_match = np.zeros(num_matches)
for match in range(num_matches):
    ix = match < tau_samples
    p_samples_match = np.concatenate([p_1_samples[ix], p_2_samples[~ix]])
    # 5th and 95th percentiles bound the central 90% of the posterior samples
    lower_p_per_match[match] = np.percentile(p_samples_match, 5)
    upper_p_per_match[match] = np.percentile(p_samples_match, 95)

Bayesian inference returns a distribution. What have we gained? We see the uncertainty in our estimates: the wider the distribution, the less certain our posterior belief should be.
