Lab 3b: Distribution of the Mean

Outline
• Distribution of the mean: Normal (Gaussian)
  – Central Limit Theorem
• “Justification” of the mean as the best estimate
• “Justification” of the error propagation formula
  – Propagation of errors
  – Standard deviation of the mean

Related courses: 640:104 Elementary Combinatorics and Probability; 960:211-212 Statistics I, II
Finite number of repeated measurements

$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$

The average value is the best estimate of the true mean, while the square of the standard deviation is the best estimate of the true variance:

$\sigma_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$

The same set of data also provides the best estimate of the variance of the mean:

$\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{N}, \qquad \sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$
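These three estimates are easy to compute directly. A minimal sketch in Python/NumPy, using a small set of made-up measurement values for illustration:

```python
import numpy as np

# A hypothetical set of N = 7 repeated measurements (values made up for illustration)
x = np.array([9.8, 10.1, 10.0, 9.9, 10.3, 9.7, 10.2])
N = len(x)

xbar = x.sum() / N                       # best estimate of the true mean
var = ((x - xbar) ** 2).sum() / (N - 1)  # best estimate of the true variance (note N - 1)
sigma = np.sqrt(var)                     # standard deviation of a single measurement
sigma_mean = sigma / np.sqrt(N)          # standard deviation of the mean

print(xbar, sigma, sigma_mean)           # 10.0, ~0.216, ~0.082
```

The same `sigma` comes from `np.std(x, ddof=1)`; the `ddof=1` argument selects the N − 1 denominator.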
The Central Limit Theorem

The mean is a sum of N independent random variables:

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_N}{N}$

In probability theory, the central limit theorem (CLT) states that, under fairly general conditions, the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed:

$P_N(\bar{x}) \xrightarrow{\;N\to\infty\;} G(\bar{x})$

The PDF of the mean is the PDF of the sum of N independent random variables.

*In practice, N just needs to be large enough.
Probability Distribution Function of the mean

An example of the CLT: [figure: histograms of $P(x_1)$, $P(x_1+x_2)$, $P(x_1+x_2+x_3)$, and $P(x_1+x_2+x_3+x_4)$, showing the distribution of the sum approaching a Gaussian]

For this example N > 4 is “enough”.
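The same convergence can be checked numerically. A minimal sketch (the uniform distribution and the sample sizes are my choice, not from the lab): for a Gaussian, ~68.3% of samples fall within one standard deviation of the mean, so that fraction measures how Gaussian the sum of N uniform variables has become.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fraction of samples within one standard deviation of the mean;
# for a true Gaussian this fraction is ~0.683.
fracs = {}
for N in (1, 2, 4, 8):
    s = rng.uniform(size=(200_000, N)).sum(axis=1)   # sum of N uniform variables
    fracs[N] = np.mean(np.abs(s - s.mean()) < s.std())
    print(N, round(fracs[N], 3))
```

For N = 1 the fraction is ~0.577 (flat distribution); it climbs toward 0.683 as N grows.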
More examples of CLT

http://www.statisticalengineering.com/central_limit_theorem.html

E.g. N measurements $x_i$ (i = 1, 2, …, N) of a random variable x can be divided into m subsets of $n_j$ measurements each (j = 1, 2, …, m), with

$N = n_1 + n_2 + \cdots + n_m$

Define the mean of each subset as a new measurement of a new random variable y:

$y_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_i \approx \bar{x}$

If the $n_j$ are large enough, the PDF of the $y_j$ is approximately a Normal (Gaussian) one.

The significant consequence of the CLT: almost any repeated measurement of independent random variables can be effectively described by a Normal (Gaussian) distribution.
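A minimal sketch of this subset construction (the exponential distribution, subset size, and skewness check are my choices for illustration): even for a strongly asymmetric x, the subset means $y_j$ are nearly symmetric, as a Gaussian requires.

```python
import numpy as np

rng = np.random.default_rng(6)

# N measurements of an exponentially distributed (very non-Gaussian) variable x,
# divided into m subsets of n_j = 50 each, so N = m * n_j
m, nj = 2_000, 50
x = rng.exponential(1.0, size=(m, nj))

y = x.mean(axis=1)   # subset means y_j: new "measurements" of a new variable y

# Sample skewness of the y_j.  The exponential itself has skewness 2;
# the CLT predicts the subset means have skewness ~ 2/sqrt(50) ≈ 0.28.
skew = np.mean(((y - y.mean()) / y.std()) ** 3)
print(round(skew, 2))
```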
“Justification” of the mean as the best estimate

Assume all the measured values follow a normal distribution $G_{X,\sigma}(x)$ with unknown parameters X and σ:

$G_{X,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-X)^2/2\sigma^2}$

The probability of obtaining value $x_1$ in a small interval $dx_1$ is:

$\mathrm{Prob}(x_1 \text{ between } x_1 \text{ and } x_1 + dx_1) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_1-X)^2/2\sigma^2}\, dx_1 \;\propto\; \frac{1}{\sigma}\, e^{-(x_1-X)^2/2\sigma^2}$

Similarly, the probability of obtaining value $x_i$ (i = 1, …, N) in a small interval $dx_i$ is:

$\mathrm{Prob}(x_i \text{ between } x_i \text{ and } x_i + dx_i) \;\propto\; \frac{1}{\sigma}\, e^{-(x_i-X)^2/2\sigma^2}$

(The factors $dx_i/\sqrt{2\pi}$ are constants and are dropped from the proportionality.)
Justification of the mean as best estimate (cont.)

The probability of N measurements with values $x_1, x_2, \ldots, x_N$:

$\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N) = \mathrm{Prob}(x_1)\,\mathrm{Prob}(x_2)\cdots\mathrm{Prob}(x_N) \;\propto\; \frac{1}{\sigma^N}\, e^{-\sum_{i=1}^{N}(x_i-X)^2/2\sigma^2}$

Here $x_1, x_2, \ldots, x_N$ are known because they are measured values; X and σ, on the other hand, are unknown parameters. We want to find the best estimate of them from $\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N)$. A reasonable procedure is the principle of maximum likelihood: maximizing $\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N)$ with respect to X is equivalent to minimizing $\sum_{i=1}^{N}(x_i-X)^2/2\sigma^2$.

Setting the derivative with respect to X to zero:

$\sum_{i=1}^{N}(x_i - X) = 0 \quad\Rightarrow\quad \text{best estimate of } X = \frac{1}{N}\sum_{i=1}^{N} x_i = \bar{x}$
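The maximum-likelihood result can be verified numerically. A minimal sketch (the measurement values and the grid of candidate X are made up for illustration): scan the negative log-likelihood over candidate values of X and confirm the minimum lands at the sample mean.

```python
import numpy as np

# Hypothetical measurements (made-up values)
x = np.array([2.1, 1.9, 2.4, 2.0, 1.8])

def neg_log_likelihood(X, sigma=1.0):
    # Up to constants, -ln Prob = sum_i (x_i - X)^2 / (2 sigma^2)
    return np.sum((x - X) ** 2) / (2 * sigma ** 2)

# Scan candidate X values; the minimum lands at the sample mean
candidates = np.linspace(1.0, 3.0, 2001)
best = candidates[np.argmin([neg_log_likelihood(X) for X in candidates])]
print(best, x.mean())   # both 2.04
```

Note the value of σ does not matter here: it only rescales the quantity being minimized.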
Best estimate of variance

Similarly, we can maximize $\mathrm{Prob}_{X,\sigma}(x_1, \ldots, x_N)$ with respect to σ to get the best estimate of σ:

$\frac{\partial}{\partial\sigma}\left[\frac{1}{\sigma^N}\, e^{-\sum_i (x_i-X)^2/2\sigma^2}\right] = -\frac{N}{\sigma^{N+1}}\, e^{-\sum_i (x_i-X)^2/2\sigma^2} + \frac{1}{\sigma^N}\,\frac{\sum_i (x_i-X)^2}{\sigma^3}\, e^{-\sum_i (x_i-X)^2/2\sigma^2} = 0$

$\Rightarrow\quad N = \frac{1}{\sigma^2}\sum_{i=1}^{N}(x_i - X)^2$

$\Rightarrow\quad \text{best estimate of } \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - X)^2$

However, we don’t really know X. In practice, we have to replace X with the mean value $\bar{x}$, which reduces the accuracy of the estimate. To account for this, we replace N with N − 1:

$\text{best estimate of } \sigma^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2$
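The need for N − 1 can be seen by simulation. A minimal sketch (the true variance, sample size, and number of trials are my choices): generate many datasets with a known σ², and compare the average of the two estimators.

```python
import numpy as np

rng = np.random.default_rng(1)
true_sigma2 = 4.0            # simulate data with known variance sigma^2 = 4
N, trials = 5, 200_000

data = rng.normal(0.0, np.sqrt(true_sigma2), size=(trials, N))
xbar = data.mean(axis=1, keepdims=True)
ss = ((data - xbar) ** 2).sum(axis=1)    # sum of squared deviations from the sample mean

biased = (ss / N).mean()         # ~ 3.2 = (N-1)/N * sigma^2: systematically low
unbiased = (ss / (N - 1)).mean() # ~ 4.0: correct on average
print(biased, unbiased)
```

Dividing by N underestimates σ² by exactly the factor (N − 1)/N, because $\bar{x}$ itself was fitted to the same data.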
Justification of error propagation formula (I)

Measured quantity plus a constant: $q = x + A$
Measured quantity times a fixed number: $q = Bx$
Sum of two measured quantities: $q = x + y$

For the sum $q = x + y$:

$P(q) = \iint P(x)\,P(y)\,\delta(q - x - y)\,dx\,dy = \int P(x)\,P(q - x)\,dx$

Assuming X = Y = 0:

$P(x) = \frac{e^{-x^2/2\sigma_x^2}}{\sigma_x\sqrt{2\pi}}, \qquad P(y) = \frac{e^{-y^2/2\sigma_y^2}}{\sigma_y\sqrt{2\pi}}$

For the simple cases:

$q = x + A: \quad \bar{q} = X + A,\ \ \sigma_q = \sigma_x$
$q = Bx: \quad \bar{q} = BX,\ \ \sigma_q = |B|\,\sigma_x$
Justification of error propagation formula (II)

Sum of two measured quantities, $q = x + y$:

$P(q) = \int P(x)\,P(q - x)\,dx = \int \frac{e^{-x^2/2\sigma_x^2}}{\sigma_x\sqrt{2\pi}}\,\frac{e^{-(q-x)^2/2\sigma_y^2}}{\sigma_y\sqrt{2\pi}}\,dx$

Completing the square in x and carrying out the Gaussian integral gives

$P(q) = \frac{1}{\sqrt{2\pi\left(\sigma_x^2 + \sigma_y^2\right)}}\, e^{-q^2/2(\sigma_x^2 + \sigma_y^2)}$

i.e. q is again normally distributed, with

$\sigma_q^2 = \sigma_x^2 + \sigma_y^2$
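A quick Monte Carlo check of the quadrature rule (the particular σ values are my choice): the spread of x + y is $\sqrt{\sigma_x^2+\sigma_y^2}$, not $\sigma_x+\sigma_y$.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma_x, sigma_y = 0.3, 0.4

x = rng.normal(0.0, sigma_x, 1_000_000)
y = rng.normal(0.0, sigma_y, 1_000_000)
q = x + y

# Quadrature sum: sqrt(0.3^2 + 0.4^2) = 0.5
print(q.std())   # ~ 0.5, not 0.3 + 0.4 = 0.7
```

The errors partially cancel: x is sometimes high when y is low, so the combined uncertainty grows more slowly than a plain sum.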
Justification of error propagation formula (III)

The general case: $q = q(x)$.

Assume $\sigma_x$ is small enough that we are concerned only with values of x close to X. Using a Taylor expansion:

$q(x) \approx q(X) + \frac{\partial q}{\partial x}(x - X) = \underbrace{\left[q(X) - \frac{\partial q}{\partial x}\,X\right]}_{\text{constant}} + \frac{\partial q}{\partial x}\, x$

Combining the constant-shift and fixed-multiplier rules:

$\sigma_q = \left|\frac{\partial q}{\partial x}\right| \sigma_x$
Justification of error propagation formula (IV)

The more general case: $q = q(x, y)$.

Assume $\sigma_x$ and $\sigma_y$ are small enough that we are concerned only with values of x close to X and of y close to Y. Using a Taylor expansion:

$q(x, y) \approx q(X, Y) + \frac{\partial q}{\partial x}(x - X) + \frac{\partial q}{\partial y}(y - Y) = \underbrace{\left[q(X, Y) - \frac{\partial q}{\partial x}\,X - \frac{\partial q}{\partial y}\,Y\right]}_{\text{constant}} + \frac{\partial q}{\partial x}\, x + \frac{\partial q}{\partial y}\, y$

Combining the rules above:

$\sigma_q = \sqrt{\left(\frac{\partial q}{\partial x}\,\sigma_x\right)^2 + \left(\frac{\partial q}{\partial y}\,\sigma_y\right)^2}$
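A minimal sketch testing the general formula by Monte Carlo (the function q = xy and the central values and σ are my choices; σ is kept small so the Taylor approximation holds):

```python
import numpy as np

rng = np.random.default_rng(3)
X, Y = 10.0, 5.0
sigma_x, sigma_y = 0.1, 0.05   # small, so the linear (Taylor) approximation is good

x = rng.normal(X, sigma_x, 1_000_000)
y = rng.normal(Y, sigma_y, 1_000_000)
q = x * y                      # example: q(x, y) = x * y

# Propagation formula: (dq/dx) = y, (dq/dy) = x, evaluated at (X, Y):
# sigma_q^2 = Y^2 sigma_x^2 + X^2 sigma_y^2
predicted = np.sqrt(Y**2 * sigma_x**2 + X**2 * sigma_y**2)
print(predicted, q.std())      # both ~ 0.707
```

The agreement degrades if the σ are made comparable to X and Y, because the dropped higher-order Taylor terms then matter.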
Standard deviation of the mean

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_N}{N}$

Assume the $x_i$ are normally distributed and all have the same X and $\sigma_x$. Applying the error propagation formula with $\partial\bar{x}/\partial x_i = 1/N$:

$\sigma_{\bar{x}}^2 = \sum_{i=1}^{N}\left(\frac{\partial\bar{x}}{\partial x_i}\,\sigma_x\right)^2 = N\cdot\frac{\sigma_x^2}{N^2} = \frac{\sigma_x^2}{N}$

$\Rightarrow\quad \sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$
Lab 3b: test the CLT

1. One run of decay rate (R = N/Δt) measurement
   a. Sampling rate: 0.5 second/sample (Δt)
   b. Run time: 30 seconds (# of measurements: n = 60)
   c. Calculate the mean and standard deviation; plot a histogram.
2. Repeat the 30-sec run 20 times (a set of runs)
   a. Remember to save the data.
   b. Plot histograms of the 21 means with proper bin widths.
   c. Make sure you answer all the questions.
3. Sample size (n = T/Δt) dependence
   – Keep the same configuration; vary the run time.
   – Record the mean and standard deviation for each n; plot them vs. log n with error bars given by the standard deviation of the mean.

* “Origin” (OriginLab®) is more convenient than Matlab.

http://www.physics.rutgers.edu/rcem/~ahogan/326.html
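Before doing the lab, the whole procedure can be rehearsed in simulation. A minimal sketch, assuming the counts in each Δt interval are Poisson-distributed; the true rate of 8 counts/s is a made-up value, not the lab's source:

```python
import numpy as np

rng = np.random.default_rng(5)
rate, dt = 8.0, 0.5            # hypothetical true decay rate (counts/s) and sampling interval

for T in (30, 120, 480):       # run time in seconds
    n = int(T / dt)            # sample size n = T / dt
    counts = rng.poisson(rate * dt, size=n)   # Poisson counts per interval
    R = counts / dt            # measured rates R = N / dt
    # Mean rate and its standard error sigma / sqrt(n)
    print(n, R.mean(), R.std(ddof=1) / np.sqrt(n))
```

As the run time grows, the mean rate stays near the true rate while its standard error shrinks like $1/\sqrt{n}$, which is the trend step 3 asks you to plot.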