Christian Holm Christensen
Bootstrap
Estimate statistical uncertainties
Version 0.1 - May 2019 (English)
Abstract
Bootstrapping is a method to estimate the statistical uncertainty of some quantity when straightforward error propagation is not feasible. In this note, we will investigate the technique of bootstrapping through examples using Python. This document is available in many formats at https://cholmcc.gitlab.io/nbi-python
Niels Bohr Institutet
Contents

1 Introduction
2 The method
3 Implementation
4 Intermezzo - some helpers
5 Example
6 Simulating
7 Confidence intervals
   7.1 Normal confidence interval
   7.2 Quantile confidence interval
   7.3 Pivotal confidence interval
8 Another example
9 Another (simpler) approach - Jackknife
   9.1 Example - LSAT versus GPA
   9.2 Example - generated data
10 When not to do bootstrap or jackknife
   10.1 A simple estimator
   10.2 A more complicated example
   10.3 Estimates from full event data
11 Summary
1 Introduction
The technique of bootstrapping is a way to estimate the statistical uncertainty of some quantity. It is most often used when the variance of the quantity (or, more formally, estimator) is not feasible to calculate directly from the data. Some examples are

• The median of a variable
• The correlation coefficient between two variables

or more complicated quantities such as the azimuthal anisotropic flow calculated from so-called Q-cumulants (see e.g. here). That is, if we are estimating a simple quantity like the mean of a sample, we would not use the bootstrap method, since the variance of the mean
$$\mathrm{Var}[\bar{x}] = \frac{\mathrm{Var}[x]}{N},$$

with sample size N, is easily computed from the sample directly.
2 The method
The method of bootstrapping was invented by Bradley Efron (see for example L. Wasserman, All of Statistics, Chapter 8) and goes roughly like this:
• Suppose we are interested in the quantity T calculated over the data X (more formally, T is a statistic). We estimate T via the estimator $\hat{T}$ over the sample $X_1, X_2, \ldots, X_N$ of size N. We are interested in estimating the variance $\mathrm{Var}[\hat{T}(X)]$
• First, estimate $\hat{T}$ over our sample X
• Secondly, for some number of iterations B, do
  – Select, at random with replacement, N samples from the original sample
  – Calculate the estimate $\hat{T}$ over this sample
• Finally, calculate the variance of $\hat{T}$ estimated over the B generated samples.
The underlying reasoning hinges on the law of large numbers: averages over a sufficiently large number of independent, identically distributed draws converge to their expectation. Thus, by making a large number B of simulations, we can approximate the variance of the original estimator by the variance over the simulations.

Each simulation is performed by sampling the original sample $X_1, X_2, \ldots, X_N$ exactly N times with replacement. By with replacement we mean that the probability to draw $X_i$ is exactly 1/N for every one of the N draws in the simulation. Thus, in a simulation sample the multiplicity of any $X_i$ is anywhere from 0 to N.
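To make "with replacement" concrete, here is a small sketch (ours, not the note's) that counts how often each original point appears in a single simulated sample, using the standard library `collections.Counter`:

```python
import random
from collections import Counter

random.seed(42)
data = list(range(10))
# One bootstrap sample: N draws with replacement from the original N points.
sample = random.choices(data, k=len(data))
counts = Counter(sample)
# Each point can occur anywhere from 0 to N times; the counts always sum to N.
print(counts)
print(sum(counts.values()))  # == len(data)
```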
3 Implementation
We can implement a general solution to the bootstrap method in Python. The first thing we need is the ability to make our simulation samples. Here, we can use the standard function random.choices. To see this, let us pick as the sample the numbers between 0 and 9 (inclusive), and make some simulation samples

import random

random.seed(123456)
data = list(range(10))
for _ in range(5):
    print(random.choices(data, k=len(data)))
[8, 7, 0, 1, 0, 6, 0, 2, 1, 2]
[4, 1, 0, 9, 3, 6, 3, 8, 7, 4]
[4, 8, 5, 5, 8, 0, 7, 0, 5, 0]
[8, 7, 5, 4, 7, 3, 3, 2, 1, 8]
[2, 2, 1, 9, 4, 3, 9, 0, 4, 1]
Secondly, our solution will need to accept some estimator function T to operate on our simulation samples. We will simply take that as an argument in the form of a callable. The final input to our solution is the choice of the number of simulations B. Since we may want to calculate other statistics than the variance on our bootstrap sample, we will return the entire list of B estimates of T over all simulations. Thus, our solution becomes

def bootstrap(data, estimator, size=1000, *args):
    """Perform the bootstrap simulation run.

    Parameters
    ----------
    data :
        The data to analyse.  This can be any indexable object.  That is,
        we must be able to do

        >>> v = data[i]

    estimator : callable
        This function is evaluated repeatedly over bootstrap simulation
        samples (of the same type as the data argument) to calculate the
        estimator.  It must accept a single argument of the same type as
        data.  Additional arguments can be passed in the args argument.
    size : int, positive
        The number of bootstrap simulations.  This number should be
        large (>1000).
    *args : tuple
        Additional arguments to pass to the estimator function

    Returns
    -------
    value : generator
        The estimator function evaluated over size bootstrap simulations.
        One can calculate the variance of this list to get the estimate
        of the estimator variance
    """
    from random import choices

    def _inner(data, estimator):
        """Generate a simulation sample and evaluate the estimator"""
        return estimator(choices(data, k=len(data)), *args)

    return (_inner(data, estimator) for _ in range(size))
Thus, to calculate the bootstrap estimate of the variance of an estimator, we simply pass in our indexable data and our estimator function, and get back a generator (which we can evaluate immediately using list, if needed) on which we can calculate the variance.
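As a usage sketch, let us estimate the uncertainty on the median, one of the motivating examples from the introduction. The `median` helper and the inline re-statement of the bootstrap function are ours (repeated so the snippet runs on its own):

```python
import random

def bootstrap(data, estimator, size=1000):
    # Minimal re-statement of the bootstrap function defined above,
    # so that this sketch is self-contained.
    def _inner():
        return estimator(random.choices(data, k=len(data)))
    return (_inner() for _ in range(size))

def median(x):
    # Middle order statistic (average of the two middle values for even n)
    s = sorted(x)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

random.seed(123456)
data = [random.normalvariate(0, 1) for _ in range(100)]
boot = list(bootstrap(data, median))
m = sum(boot) / len(boot)
var = sum((b - m)**2 for b in boot) / len(boot)
print('median = {:+.3f} +/- {:.3f}'.format(median(data), var**0.5))
```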
4 Intermezzo - some helpers
Below we want to calculate the variance and quantiles of samples, so we will define a few helper functions. The first one will return the mean and the variance of a sample

def meanVar(x, ddof=0):
    n = len(x)
    m = sum(x) / n
    v = sum((xx - m)**2 for xx in x) / (n - ddof)
    return m, v
Again, we could have used NumPy for this, but for the sake of illustration we code it up ourselves. Let us make a sample ∼ U(0, 1) and calculate the mean (0.5) and variance (1/12):

m, v = meanVar([random.random() for _ in range(1000)])
print('{:.3f} +/- {:.3f} (expect {:.3f} and {:.3f})'.format(m, v, .5, 1/12))
0.500 +/- 0.082 (expect 0.500 and 0.083)
The next function will calculate the α quantile of a sample. Essentially, what we need to do is order the data and return the element at index αN, where N is the number of samples.

def quantile(x, alpha, key=None):
    return sorted(x, key=key)[int(alpha*len(x))]
Let us try this on a sample ∼ N(0, 1)
x = [random.normalvariate(0, 1) for _ in range(100)]
print(' 5% quantile: {:+.3f}'.format(quantile(x, .05)))
print('50% quantile: {:+.3f}'.format(quantile(x, .50)))
print('95% quantile: {:+.3f}'.format(quantile(x, .95)))

 5% quantile: -1.807
50% quantile: -0.103
95% quantile: +2.103
5 Example
The following example is due to Bradley Efron (reproduced in L. Wasserman, All of Statistics, Chapter 8). A law school is interested in the correlation between LSAT (Law School Achievement Test) and GPA (Grade Point Average) scores. That is

$$\theta = \frac{\sum_i (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sqrt{\sum_i (Y_i - \bar{Y})^2}\sqrt{\sum_i (Z_i - \bar{Z})^2}},$$

where Y is the LSAT score and Z the GPA score. First, let us get some data to work on.

lsat = [576, 635, 558, 578, 666, 580, 555, 661, 651, 605, 653, 575, 545, 572, 594]
gpa  = [3.39, 3.30, 2.81, 3.03, 3.44, 3.07, 3.00, 3.43, 3.36, 3.13, 3.12, 2.74, 2.76, 2.88, 2.96]
Here, we have 15 samples of correlated LSAT and GPA scores. We need a function to calculate the correlation θ

import math

def corr(y, z):
    ym, yv = meanVar(y)
    zm, zv = meanVar(z)
    num = sum((yi - ym)*(zi - zm) for yi, zi in zip(y, z)) / len(y)
    den = math.sqrt(yv) * math.sqrt(zv)
    theta = num / den
    return theta
We could have used NumPy here to perform this calculation more easily, but for the sake of illustration we write it out. Now, our general bootstrap function expects the callable to take a single data argument, so we will wrap corr in another function below. Let us write a function that calculates θ and estimates the standard deviation of θ using the bootstrap method.

def corrLsatGpa(lsat, gpa, b=1000):
    def est(data):
        """Wrapper"""
        y = [lsat for lsat, _ in data]
        z = [gpa for _, gpa in data]
        return corr(y, z)

    theta = corr(lsat, gpa)                              # The estimate
    data = [[lsat, gpa] for lsat, gpa in zip(lsat, gpa)] # Restructure
    boot = list(bootstrap(data, est, b))                 # Get the bootstrap estimates
    bm, bv = meanVar(boot)
    return theta, math.sqrt(bv), boot
The function above returns

• $\hat\theta$, the estimate of the correlation,
• $\widehat{\mathrm{se}}(\hat\theta) = \sqrt{\mathrm{Var}_{\mathrm{boot}}[\hat\theta]}$, the bootstrap estimate of the standard deviation, and
• the estimates of $\hat\theta$ over the bootstrap simulations.

The last return value is mainly in the interest of visualising the simulation. Let us run the example and plot

• The correlation of the LSAT and GPA scores
• The bootstrap estimates of $\hat\theta$ together with the estimate
Let’s make a function to plot results

import matplotlib.pyplot as plt

def plot1(ax, data, theta, se, label):
    b, _, _ = ax.hist(data, density=True, alpha=.5, label=label, histtype='step')
    t = max(b)
    ax.plot([theta, theta], [0, t], '--r', label='Estimate')
    ax.fill_between([theta-se, theta+se], [t, t], alpha=.5,
                    label=r'$\widehat{\mathrm{se}}$', color='y')
    ax.set_xlabel(r'$\hat\theta$')
    ax.legend()
    print('{:10s} {:.5f} +/- {:.5f}'.format(label, theta, se))

theta, std, boot = corrLsatGpa(lsat, gpa)
print('Correlation between LSAT and GPA: {:.3f} +/- {:.3f}'.format(theta, std))

fig, ax = plt.subplots(ncols=2, figsize=(10,6))
ax[0].plot(lsat, gpa, 'o')
ax[0].set_xlabel('LSAT')
ax[0].set_ylabel('GPA')
ax[0].set_title('The data')
plot1(ax[1], boot, theta, std, 'Bootstrap')
fig.tight_layout();
Correlation between LSAT and GPA: 0.776 +/- 0.132
Bootstrap   0.77637 +/- 0.13242
(Figure 1)
6 Simulating
It is worth noting that the method of bootstrapping is based on the law of large numbers. That is what necessitates that we perform a relatively large number of simulations to get an estimate of the variance of our estimator.
Figure 1: Left: GPA versus LSAT data. Right: Bootstrap estimate of uncertainty
To see this, let us run the above example with a varying number of steps ranging from 3 to 10000 and then plot the estimated standard deviation as a function of the number of steps.

bs = [3, 6, 10, 30, 60, 100, 300, 600, 1000, 3000, 6000, 10000]
ob = []
os = []
for b in bs:
    ob.append([b]*10)
    os.append([corrLsatGpa(lsat, gpa, b)[1] for _ in range(10)])

plt.figure(figsize=(10,6))
plt.scatter(ob, os)
plt.xscale('log')
plt.xlabel('$B$')
plt.ylabel(r'$\widehat{se}(\hat\theta)$')
plt.title('Uncertainty estimate as a function of number of simulations');
(Figure 2)
The exact shape of the curve depends on the state of the random number generator used by random.choices, but in general we see that the estimate of $\widehat{\mathrm{se}}(\hat\theta)$ does not stabilise until B is sufficiently large. Thus, we must ensure a sufficiently large number of simulations when applying the bootstrap method, or our estimate of the variance of the estimator is itself highly uncertain.
7 Confidence intervals
We can estimate confidence intervals from our bootstrap sample in three ways.
7.1 Normal confidence interval
In this method, we assume that the estimator is roughly normal, and we can give the standard 2σ confidence limits
Figure 2: Uncertainty estimate se(θ) versus number of simulations B
$$\hat\theta - 2\,\widehat{\mathrm{se}}(\hat\theta),\quad \hat\theta + 2\,\widehat{\mathrm{se}}(\hat\theta).$$
Let us code this up in a function.
def bootstrapNormalCL(theta, boot, z=2):
    """Calculate the normal confidence limits on the estimate
    theta and bootstrap sample boot.  z specifies the number
    of standard errors.

    Parameters
    ----------
    theta : value
        Estimate
    boot : data
        Bootstrap sample
    z : factor
        Number of standard errors

    Returns
    -------
    low, high : tuple
        Confidence interval
    """
    _, var = meanVar(boot)
    se = math.sqrt(var)
    return theta - z * se, theta + z * se
Let us calculate the confidence interval for the LSAT versus GPA example above
nlim = bootstrapNormalCL(theta, boot)
print('Confidence limits (normal): {:.3f},{:.3f}'.format(*nlim))
Confidence limits (normal): 0.512,1.041
7.2 Quantile confidence interval
An alternative, which does not assume $\hat\theta$ is roughly normal, but will tend to underestimate the confidence range, is to calculate the α and 1 − α quantiles. That is, we quote the confidence limits as

$$Q_\alpha(\hat\theta),\quad Q_{1-\alpha}(\hat\theta),$$
where $Q_\alpha$ is the α quantile of the bootstrap samples. Again, we will code this up in a function.

def bootstrapQuantileCL(theta, boot, alpha=0.05):
    """Calculate the quantile confidence limits on the estimate
    theta and the bootstrap sample boot, where alpha is the
    percentile below and above.

    Parameters
    ----------
    theta : value
        Estimate
    boot : data
        Bootstrap sample
    alpha : percentage
        Percentage below and above the confidence limits

    Returns
    -------
    low, high : tuple
        Confidence interval
    """
    return quantile(boot, alpha), quantile(boot, 1 - alpha)
Let us, again, calculate the 5% and 95% confidence limits on the LSAT versus GPA example above

qlim = bootstrapQuantileCL(theta, boot, 0.05)
print('Confidence limits (quantile): {:.3f},{:.3f}'.format(*qlim))
Confidence limits (quantile): 0.531,0.950
7.3 Pivotal confidence interval
This method uses the estimate $\hat\theta$ and the α quantiles of the bootstrap simulations, and gives the confidence limits as

$$2\hat\theta - Q_{1-\alpha}(\hat\theta),\quad 2\hat\theta - Q_\alpha(\hat\theta),$$
where $Q_\alpha$ is the α quantile of the bootstrap samples. We code this up in a function.

def bootstrapPivotCL(theta, boot, alpha=0.05):
    """Calculate the pivotal confidence limits on the estimate
    theta and the bootstrap sample boot, where alpha is the
    percentile below and above.

    Parameters
    ----------
    theta : value
        Estimate
    boot : data
        Bootstrap sample
    alpha : percentage
        Percentage below and above the confidence limits

    Returns
    -------
    low, high : tuple
        Confidence interval
    """
    return 2*theta - quantile(boot, 1 - alpha), 2*theta - quantile(boot, alpha)

Figure 3: Comparison of confidence interval estimates from Bootstrap
And, we use this on the example above

plim = bootstrapPivotCL(theta, boot, 0.05)
print('Confidence limits (pivot): {:.3f},{:.3f}'.format(*plim))

Confidence limits (pivot): 0.603,1.022

For comparison we will plot these limits together

plt.figure(figsize=(8,4))
plt.plot([theta, theta], [-.5, 2.5], '--r', label='Estimate')
for i, l in enumerate([nlim, qlim, plim]):
    plt.plot(l, [i, i], '-', lw=3)
plt.yticks([0, 1, 2], [r'Normal ($2\sigma$)',
                       'Quantile (5-95)%',
                       'Pivot (5-95)%'])
plt.xlabel(r'$\hat\theta$')
plt.legend();
(Figure 3)
We note that the normal and pivot confidence limits exceed 1 on the high end, which indicates that these two estimates tend to overestimate the size of the confidence interval. The quantile confidence limits, on the other hand, are probably on the low side, but do reflect the distribution of the bootstrap sample in this example.
8 Another example
This example comes from the exercises of L. Wasserman, All of Statistics, Chapter 8. We have a sample of 100 observations $X \sim N(5, 1)$, and we are interested in the statistic $\theta = e^\mu$, for which we will use the estimator $\hat\theta = e^{\bar X}$. We will use the bootstrap method to calculate the standard uncertainty and 95% confidence limits on $\hat\theta$. First, let us make our sample and calculate our estimator
Figure 4: Bootstrap estimate of uncertainty on $e^{\bar X}$, as well as confidence limits
data = [random.normalvariate(5, 1) for _ in range(100)]
theta = math.exp(sum(data)/len(data))
print('hat(theta) = {:.2f}'.format(theta))

hat(theta) = 169.10
Next, we generate our bootstrap sample, calculate the standard uncertainty and confidence limits using all three methods above, and plot them with the distribution of $e^X$ as well as the bootstrap distribution.

boot = list(bootstrap(data, lambda d: math.exp(sum(d)/len(d))))
_, var = meanVar(boot)

plt.figure(figsize=(8,4))
plt.hist([math.exp(x) for x in data], 25, alpha=.5, density=True, label='Data')
plot1(plt.gca(), boot, theta, math.sqrt(var), 'Bootstrap')

cln = ['Normal', 'Quantile', 'Pivot']
clm = [bootstrapNormalCL, bootstrapQuantileCL, bootstrapPivotCL]
for n, m, y in zip(cln, clm, [.03, .032, .034]):
    l = m(theta, boot)
    plt.plot(l, [y, y], '-', lw=4, label=n)
    print('{:10s} confidence limits: {:.2f} - {:.2f}'.format(n, *l))

plt.legend();

Bootstrap   169.10419 +/- 16.40896
Normal     confidence limits: 136.29 - 201.92
Quantile   confidence limits: 143.82 - 197.33
Pivot      confidence limits: 140.88 - 194.39
(Figure 4)
We immediately see that the bootstrap sample is much more narrowly centred around the estimate, and the width of the distribution reflects well the expected variance

$$\mathrm{Var}[\hat\theta] \approx \left(\frac{\partial\theta}{\partial\bar x}\right)^2 \mathrm{Var}[\bar x] = e^{2\bar x}\,\mathrm{Var}[\bar x] = e^{2\bar x}\,\frac{\mathrm{Var}[x]}{N},$$
which, taking the square root, evaluates to
mx, vx = meanVar(data)
print('{:.3f}'.format(math.sqrt(math.exp(mx)**2 * vx / len(data))))
15.820
9 Another (simpler) approach - Jackknife
This approach was developed by Maurice Quenouille (see the appendix to L. Wasserman, All of Statistics, Chapter 8) and predates the bootstrap method. The idea is again to use the observed data to simulate variations in the sample, and then estimate the sample variance from these simulations.
• Suppose we are interested in the quantity T calculated over the data X (more formally, T is a statistic). We estimate T via the estimator $\hat{T}$ over the sample $X_1, X_2, \ldots, X_N$ of size N. We are interested in estimating the variance $\mathrm{Var}[\hat{T}(X)]$
• First, estimate $\hat{T}$ over our sample X
• Secondly, for N iterations, we do
  – For the ith iteration, calculate the estimate $\hat{T}$ leaving out the ith data point. That is, we take the sample $X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_N$ and calculate the estimate on that sample
• Finally, calculate the variance of $\hat{T}$ estimated over the N generated samples, given by

$$\mathrm{Var}[\hat{T}] = \frac{N-1}{N}\sum_{i=1}^{N}\left(\hat{T}_i - \bar{T}\right)^2,$$

where $\hat{T}_i$ is the estimate calculated over the ith jackknife sample and $\bar{T}$ is the mean of the estimates calculated over all jackknife samples.
We can code this up in a general function. As before, we expect an indexable data set and a function to calculate the estimator.

def jackknife(data, estimator, *args):
    """Generate the jackknife samples and evaluate the estimator
    over these.

    Parameters
    ----------
    data :
        The data to calculate the jackknife samples over
    estimator : callable
        The function to calculate the estimator

    Returns
    -------
    jack :
        The estimator calculated over all jackknife samples
    """
    def _inner(data, estimator, i):
        return estimator((data[j] for j in range(len(data)) if j != i), *args)

    return (_inner(data, estimator, i) for i in range(len(data)))
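As a quick check of the variance formula above (our sketch, with the leave-one-out step written inline), note that for the sample mean the jackknife variance reduces exactly to the familiar $s^2/N$, with $s^2$ the unbiased sample variance:

```python
import random

random.seed(7)
data = [random.normalvariate(0, 1) for _ in range(50)]
n = len(data)

# Leave-one-out means, i.e. the jackknife estimates for T = mean
loo = [(sum(data) - x) / (n - 1) for x in data]
lbar = sum(loo) / n
var_jack = (n - 1) / n * sum((t - lbar)**2 for t in loo)

# The usual estimate: unbiased sample variance divided by n
m = sum(data) / n
s2 = sum((x - m)**2 for x in data) / (n - 1)
print(var_jack, s2 / n)  # identical up to floating-point rounding
```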
Figure 5: Jackknife estimate of uncertainty on LSAT versus GPA correlation
9.1 Example - LSAT versus GPA
Let us apply this method to our example above of the correlation between LSAT and GPA

def jkLsatGpa(lsat, gpa):
    def est(data):
        """Wrapper"""
        d = list(data)
        y = [lsat for lsat, _ in d]
        z = [gpa for _, gpa in d]
        return corr(y, z)

    theta = corr(lsat, gpa)                              # The estimate
    data = [[lsat, gpa] for lsat, gpa in zip(lsat, gpa)] # Restructure
    jk = list(jackknife(data, est))                      # Get the jackknife estimates
    jm, jv = meanVar(jk)
    return theta, math.sqrt(jv*(len(data)-1)), jk
We run this example and compare to the previous bootstrap result of 0.776 ± 0.132
theta, std, jk = jkLsatGpa(lsat, gpa)
print("LSAT versus GPA correlation: {:.3f} +/- {:.3f}".format(theta, std))

plt.figure()
plot1(plt.gca(), jk, theta, std, 'Jackknife')

LSAT versus GPA correlation: 0.776 +/- 0.143
Jackknife   0.77637 +/- 0.14252
(Figure 5)
9.2 Example - generated data
We use the jackknife method on our generated data from above. First, we calculate our estimate $\hat\theta = e^{\bar X}$ of the sample $X \sim N(5, 1)$
Figure 6: Jackknife estimate of uncertainty on eX
theta = math.exp(sum(data)/len(data))

which is clearly the same as before, and then we perform our jackknife analysis to find the variance. We plot the result as before

def est(data):
    d = list(data)
    return math.exp(sum(d)/len(d))

jk = list(jackknife(data, est))
_, var = meanVar(jk)
std = math.sqrt(var*(len(data)-1))
plot1(plt.gca(), jk, theta, std, 'Jackknife')
Jackknife 169.10419 +/- 15.88762
(Figure 6)
Clearly, the jackknife method does not produce as wide simulated distributions as the bootstrap method does, and consequently its estimates of the variance are less reliable. If possible, one should opt for the bootstrap method over the jackknife method.
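A further, well-known caveat (not covered in this note, but worth flagging): the jackknife can fail outright for non-smooth statistics such as the median, because the leave-one-out estimates take only a couple of distinct values. A minimal sketch, using our own `median` helper:

```python
import random

random.seed(3)
data = sorted(random.normalvariate(0, 1) for _ in range(100))
n = len(data)

def median(x):
    s = sorted(x)
    m = len(s)
    return s[m // 2] if m % 2 else (s[m // 2 - 1] + s[m // 2]) / 2

# Leave-one-out medians: for even n they collapse onto the two central
# order statistics, so the jackknife "spread" is far too coarse.
loo = [median(data[:i] + data[i+1:]) for i in range(n)]
print(len(set(loo)))  # 2
```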
10 When not to do bootstrap or jackknife
10.1 A simple estimator
Suppose we have analysed millions of events $\{E_1, \ldots\}$ for a particular observable X. We have split our events $E_i$ into N sub-samples

$$\bigcup_i^N S_i = \{E_1, \ldots\},$$

and calculated X in each of these sub-samples. We thus have the sample

$$\{X_1, \ldots, X_N\}.$$

All the observations $X_i$ are independent, identically distributed (iid) random variables, in that

$$\forall i, j \in \{1, \ldots, N\} \land i \neq j : S_j \cap S_i = \emptyset,$$

and the events $E_i$ are assumed to be equivalent in some sense of that word. We want to estimate θ and its variance. Our estimator is then the mean of the N samples
$$\hat\theta = \frac{1}{N}\sum_{i=1}^{N} X_i,$$
and we will use the bootstrap and jackknife methods for estimating the variance and standard uncertainty. Here, we will choose N = 10 and $X \sim N(0, 1)$ without loss of generality. Thus, we expect to find

$$\hat\theta = 0 \pm \frac{1}{\sqrt{10}} \approx 0 \pm 0.32.$$
Let us generate the data

data = [random.normalvariate(0, 1) for _ in range(10)]

We can of course calculate the mean and the variance directly from this sample to obtain the sample mean and standard uncertainty

m, v = meanVar(data, 1)
e = math.sqrt(v/len(data))
mes = '{:10s} mean = {:.3f} and variance = {:.3f} -> {:.3f} +/- {:.3f}'
print(mes.format('Sample', m, v, m, e))
Sample mean = 0.070 and variance = 0.530 -> 0.070 +/- 0.230
Let us now use the bootstrap method to perform the calculation

def est(data):
    d = list(data)
    return sum(d)/len(d)

boot = list(bootstrap(data, est))
meanb, varb = meanVar(boot)
eb = math.sqrt(varb)
print(mes.format('Bootstrap', m, varb, m, eb))
Bootstrap mean = 0.070 and variance = 0.048 -> 0.070 +/- 0.220
And finally we use the jackknife method

jk = list(jackknife(data, est))
meanj, varj = meanVar(jk)
ej = math.sqrt(varj*(len(data)-1))
print(mes.format('Jackknife', m, varj, m, ej))
Jackknife mean = 0.070 and variance = 0.006 -> 0.070 +/- 0.230
Let us plot the various samples

fig, ax = plt.subplots(ncols=3, figsize=(10,6), sharex=True)
plot1(ax[0], data, m, e, 'Direct')
plot1(ax[1], boot, m, eb, 'Bootstrap')
plot1(ax[2], jk, m, ej, 'Jackknife')
fig.tight_layout()
Figure 7: Estimates of uncertainty se(θ) for a simple θ. Left: direct evaluation, middle: Bootstrap, right: Jackknife
Direct      0.07026 +/- 0.23028
Bootstrap   0.07026 +/- 0.21978
Jackknife   0.07026 +/- 0.23028
(Figure 7)
As is clear from the results above, it makes little sense to use the bootstrap or jackknife methods forestimating the variance if the estimator in question is a simple estimator such as the mean.
10.2 A more complicated example
Suppose, again, we are analysing millions of events which we may split into some number N of sub-samples. For each sub-sample i we calculate some quantity from which we will derive a complicated estimator $\hat\theta_i$. This could for example be

$$\hat\theta_i = \frac{-a + 2b}{c^3},$$

where a, b, and c are calculated over the sub-samples. The final estimator over the sub-samples is then the average

$$\hat\theta = \frac{1}{N}\sum_i \hat\theta_i.$$
Let us try to simulate this case. We will generate 1000 events with

• $a \sim N(1, 1)$
• $b \sim N(5, 1)$
• $c \sim N(3, 1)$
from which we will select N = 10 sub-samples and calculate the means of a, b, and c.
events = [(random.normalvariate(1, 1),
           random.normalvariate(5, 1),
           random.normalvariate(3, 1))
          for _ in range(1000)]

data = list(zip(*[events[i::len(events)//10]
                  for i in range(len(events)//10)]))

data = [(sum(a for a, _, _ in sub)/len(sub),
         sum(b for _, b, _ in sub)/len(sub),
         sum(c for _, _, c in sub)/len(sub))
        for sub in data]
Let us define the estimator function, which calculates the average over $\hat\theta_i$, and evaluate it on the 10 sub-samples

def est(data):
    def _inner(a, b, c):
        return (-a + 2*b) / c**3
    d = list(data)
    return sum(_inner(a, b, c) for a, b, c in d)/len(d)

theta = est(data)
print('Estimator {}'.format(theta))
Estimator 0.3319428026358412
Let us use the bootstrap and jackknife methods to estimate the variance of $\hat\theta$, as well as the direct estimate from the N sub-sample results

dirc = [(-a + 2*b)/c**3 for a, b, c in data]
dmean, dvar = meanVar(dirc)
dstd = math.sqrt(dvar / len(dirc))

boot = list(bootstrap(data, est))
bmean, bvar = meanVar(boot)
bstd = math.sqrt(bvar)

jack = list(jackknife(data, est))
jmean, jvar = meanVar(jack)
jstd = math.sqrt(jvar * (len(data)-1))

fig, ax = plt.subplots(ncols=3, figsize=(10,6), sharex=True)
plot1(ax[0], dirc, theta, dstd, 'Sub-samples')
plot1(ax[1], boot, theta, bstd, 'Bootstrap')
plot1(ax[2], jack, theta, jstd, 'Jackknife')
fig.tight_layout()
Sub-samples 0.33194 +/- 0.00882
Bootstrap   0.33194 +/- 0.00842
Jackknife   0.33194 +/- 0.00929
(Figure 8)
Again, we see that the bootstrap and jackknife methods do not provide significant advantages over direct calculation of the variance from the N sub-samples. This is, of course, because the final estimator is a simple average over the sub-samples.
10.3 Estimates from full event data
We will continue the example above, where we will however store a, b, and c as calculated in each event, and our final estimator becomes
Figure 8: Estimates of uncertainty se(θ) for a complicated θ. Left: direct evaluation, middle: Bootstrap, right: Jackknife
$$\hat\theta = \frac{-a + 2b}{c^3}.$$
We will thus perform the bootstrap analysis by sampling new events from our empirical distributions of a, b, and c, and calculate the estimator value for each of those samples. Note that in this case it is not easy to calculate the variance directly from the data, so we will refrain from doing so. We use the events generated above for our estimate and variance estimates, but first we need a function to calculate the mean of a, b, and c over all events.

def est(data):
    d = list(data)
    a = sum(aa for aa, _, _ in d) / len(d)
    b = sum(bb for _, bb, _ in d) / len(d)
    c = sum(cc for _, _, cc in d) / len(d)
    return (-a + 2*b) / c**3
Let us calculate the estimate

theta = est(events)
print('Estimator: {}'.format(theta))
Estimator: 0.3304604423307452
Regular propagation of uncertainties, including covariances, gives

$$\begin{aligned}
\mathrm{Var}[\hat\theta] &= \left(-\frac{1}{c^3}\right)^2\frac{\mathrm{Var}[a]}{N} + \left(\frac{2}{c^3}\right)^2\frac{\mathrm{Var}[b]}{N} + \left(\frac{-3(-a+2b)}{c^4}\right)^2\frac{\mathrm{Var}[c]}{N}\\
&\quad + 2\left[\left(-\frac{1}{c^3}\right)\frac{2}{c^3}\,\frac{\mathrm{Cov}[a,b]}{N} + \left(-\frac{1}{c^3}\right)\frac{-3(-a+2b)}{c^4}\,\frac{\mathrm{Cov}[a,c]}{N} + \frac{2}{c^3}\,\frac{-3(-a+2b)}{c^4}\,\frac{\mathrm{Cov}[b,c]}{N}\right]\\
&= \frac{1}{c^6 N}\left(\mathrm{Var}[a] + 4\left(\mathrm{Var}[b] - \mathrm{Cov}[a,b]\right) + \frac{3(-a+2b)}{c}\left[\frac{3(-a+2b)}{c}\,\mathrm{Var}[c] + 2\,\mathrm{Cov}[a,c] - 4\,\mathrm{Cov}[b,c]\right]\right),
\end{aligned}$$
which we can calculate directly on the data
Figure 9: Estimates of uncertainty se(θ) with samples from full event data. Left: Bootstrap, right:Jackknife
n = len(data)
meana, vara = meanVar([aa for aa, _, _ in data])
meanb, varb = meanVar([bb for _, bb, _ in data])
meanc, varc = meanVar([cc for _, _, cc in data])
covab = sum((aa - meana)*(bb - meanb) for aa, bb, _ in data)/n
covac = sum((aa - meana)*(cc - meanc) for aa, _, cc in data)/n
covbc = sum((bb - meanb)*(cc - meanc) for _, bb, cc in data)/n
tmp = 3*(-meana + 2*meanb)/meanc
dvar = 1/meanc**6/n * (vara + 4*(varb - covab) + tmp*(tmp*varc + 2*covac - 4*covbc))
dstd = math.sqrt(dvar)
print('{:10s} {:.5f} +/- {:.5f}'.format('Direct', theta, dstd))
Direct 0.33046 +/- 0.00884
We perform the bootstrap and jackknife analyses

boot = list(bootstrap(events, est))
jack = list(jackknife(events, est))
bmean, bvar = meanVar(boot)
jmean, jvar = meanVar(jack)
bstd = math.sqrt(bvar)
jstd = math.sqrt(jvar * (len(events)-1))

fig, ax = plt.subplots(ncols=2, sharex=True, figsize=(10,6))
print('{:10s} {:.5f} +/- {:.5f}'.format('Direct', theta, dstd))
plot1(ax[0], boot, theta, bstd, 'Bootstrap')
plot1(ax[1], jack, theta, jstd, 'Jackknife')
fig.tight_layout()
Direct      0.33046 +/- 0.00884
Bootstrap   0.33046 +/- 0.01103
Jackknife   0.33046 +/- 0.01103
(Figure 9)
The three estimates all agree to one significant digit, and we see that even in this case we do better by evaluating the uncertainties directly.
11 Summary
The bootstrap and jackknife methods for estimating the variance of an estimator are powerful tools, but they are not always the appropriate choice. Here are some key take-aways:

• Bootstrap should be preferred over jackknife
• Bootstrap and jackknife should only be applied if it is not possible to estimate the estimator variance through regular propagation of uncertainties or the like
• If bootstrapping is used, one should save the variables that go into the final calculation of the estimator, so that one can reliably perform the simulations.