Chapter 3 Selected Basic Concepts in Statistics
- Expected Value, Variance, Standard Deviation
- Numerical summaries of selected statistics
- Sampling distributions
Slide 3
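Expected Value
Weighted average of the possible values of y, with the probabilities p(y) as weights: E(y) = Σ y p(y), summed over all possible values of y
Not the value of y you expect; a long-run average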
Slide 4
E(y) Example 1
Toss a fair die once. Let y be the number of dots on the upper face.
y      1    2    3    4    5    6
p(y)   1/6  1/6  1/6  1/6  1/6  1/6
E(y) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5
Slide 5
E(y) Example 2: Green Mountain Lottery
Choose 3 digits between 0 and 9. Repeats allowed, order of digits counts. If your 3-digit number is selected, you win $500. Let y be your winnings (assume ticket cost $0).
y      $0     $500
p(y)   0.999  0.001
E(y) = $0(0.999) + $500(0.001) = $0.50
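The two expected values above can be checked with a few lines of Python. This is a minimal sketch: the helper name `expected_value` and the dictionary layout (value mapped to probability) are illustrative choices, not part of the slides.

```python
from fractions import Fraction

def expected_value(dist):
    """Return E(y) = sum of y * p(y) for a dict mapping value -> probability."""
    if sum(dist.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(y * p for y, p in dist.items())

# Example 1: fair die, each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Example 2: lottery ticket, win $500 with probability 1/1000.
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}

print(expected_value(die))      # 7/2, i.e. 3.5
print(expected_value(lottery))  # 1/2, i.e. $0.50
```

Exact fractions are used so that the results match the hand calculation without rounding error.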
Slide 6
US Roulette Wheel and Table
- The roulette wheel has alternating black and red slots numbered 1 through 36.
- There are also 2 green slots numbered 0 and 00.
- A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is...
- If you bet $1 on the winning number, you receive $36, so your winnings are $35
American Roulette 0 - 00 (The European version has only one 0.)
Slide 7
US Roulette Wheel: Expected Value of a $1 bet on a single number
- Let y be your winnings resulting from a $1 bet on a single number; y has 2 possible values
y      -1     35
p(y)   37/38  1/38
- E(y) = -1(37/38) + 35(1/38) = -2/38 ≈ -.05
- So on average the house wins about 5 cents on every such bet. A fair game would have E(y) = 0.
- The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in
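To illustrate the "long-run average" reading of E(y), the sketch below computes the exact expected winnings and compares them with a simulated average over many spins. The function name, the seed, and the number of simulated bets are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction

# Exact expected winnings of a $1 single-number bet:
# lose $1 on 37 of the 38 slots, win $35 on the remaining one.
exact = -1 * Fraction(37, 38) + 35 * Fraction(1, 38)
print(exact, float(exact))  # -1/19, about -0.0526

def simulate_single_number_bets(n_bets, seed=0):
    """Average winnings per $1 bet over n_bets simulated spins."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_bets):
        slot = rng.randrange(38)          # 38 equally likely slots
        total += 35 if slot == 0 else -1  # slot 0 stands in for the chosen number
    return total / n_bets

long_run_average = simulate_single_number_bets(200_000)
print(long_run_average)  # should land near the exact value, not exactly on it
```

A single bet never returns -$0.05; only the average over many bets settles near that figure.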
Slide 8
Slide 9
Variance and Standard Deviation
- Measure spread around the middle, where the middle is measured by μ = E(y)
- V(y) = σ² = E[(y - μ)²] = Σ (y - μ)² p(y)
- SD(y) = σ = √V(y)
Slide 10
Variance Example
Toss a fair die once. Let y be the number of dots on the upper face.
y      1    2    3    4    5    6
p(y)   1/6  1/6  1/6  1/6  1/6  1/6
Recall μ = 3.5
Slide 11
V(y) Example 2: Green Mountain Lottery
y      $0     $500
p(y)   0.999  0.001
Recall μ = $.50
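The variance calculations for both examples can be sketched as follows, using the usual definition V(y) = Σ (y - μ)² p(y). The helper name `mean_and_variance` is an illustrative choice.

```python
import math
from fractions import Fraction

def mean_and_variance(dist):
    """Return (mu, V) for a dict mapping value -> probability."""
    mu = sum(y * p for y, p in dist.items())
    var = sum((y - mu) ** 2 * p for y, p in dist.items())
    return mu, var

die = {face: Fraction(1, 6) for face in range(1, 7)}
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}

die_mu, die_var = mean_and_variance(die)
lot_mu, lot_var = mean_and_variance(lottery)

# Die: mu = 7/2, V = 35/12 (about 2.92), SD about 1.71
print(die_mu, die_var, math.sqrt(die_var))
# Lottery: mu = 1/2, V = 999/4 (249.75), SD about 15.80
print(lot_mu, lot_var, math.sqrt(lot_var))
```

The lottery has a small mean but a large standard deviation, because almost all of the probability sits far from the rare $500 outcome.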
Slide 12
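Estimators for μ, σ², σ
- ȳ = (1/n) Σ y_i: sample mean, estimates μ
- s² = Σ (y_i - ȳ)² / (n - 1): average squared deviation from the middle, estimates σ²
- s = √s²: estimates σ
- Automate these calculations
- Examples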
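One way to automate these calculations is Python's standard `statistics` module. The sketch below uses a small made-up data set (the values are hypothetical) and checks the library result against the n - 1 formula computed by hand.

```python
import statistics

data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # hypothetical sample, for illustration only

y_bar = statistics.mean(data)     # sample mean
s2 = statistics.variance(data)    # sample variance, divides by n - 1
s = statistics.stdev(data)        # sample standard deviation

# The same variance computed directly from the definition.
n = len(data)
s2_by_hand = sum((y - y_bar) ** 2 for y in data) / (n - 1)

print(y_bar, s2, s, s2_by_hand)
```

Note that `statistics.variance` and `statistics.stdev` use the n - 1 divisor, while `statistics.pvariance` and `statistics.pstdev` divide by n.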
Slide 13
Linear Transformations of Random Variables and Sample Statistics
- Random variable y with E(y) and V(y)
- Linear transformation y* = a + by: what are E(y*) and V(y*) in terms of the original E(y) and V(y)?
- Data y_1, y_2, ..., y_n with mean ȳ and standard deviation s
- Linear transformation y* = a + by; new data y_1*, y_2*, ..., y_n*: what are ȳ* and s* in terms of ȳ and s?
Slide 14
Linear Transformations
Rules for E(y*), V(y*) and SD(y*)
- E(y*) = E(a + by) = a + bE(y)
- V(y*) = V(a + by) = b²V(y)
- SD(y*) = SD(a + by) = |b|SD(y)
Rules for ȳ*, s*², and s*
- ȳ* = a + bȳ
- s*² = b²s²
- s* = |b|s
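The sample-statistic rules can be checked numerically. In this sketch the data and the constants a and b are hypothetical; b is negative on purpose, to show why the standard deviation scales by |b| rather than b.

```python
import math
import statistics

y = [0, 1, 0, 2, 0, 1, 3, 0]   # hypothetical data
a, b = 10.0, -2.5               # hypothetical constants; b < 0 on purpose

y_star = [a + b * v for v in y]  # transformed data y* = a + b*y

y_bar, s = statistics.mean(y), statistics.stdev(y)
y_star_bar, s_star = statistics.mean(y_star), statistics.stdev(y_star)

print(math.isclose(y_star_bar, a + b * y_bar))   # mean: shifted and scaled
print(math.isclose(s_star ** 2, b ** 2 * s ** 2))  # variance: scaled by b squared
print(math.isclose(s_star, abs(b) * s))          # SD: scaled by |b|, shift has no effect
```

Adding the constant a moves every observation by the same amount, so it changes the mean but not the spread.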
Slide 15
Expected Value and Standard Deviation of Linear Transformation a + by
Let y = number of repairs a new computer needs each year. Suppose E(y) = 0.20 and SD(y) = 0.55.
The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair. What are the mean and standard deviation of the yearly cost of the service contract?
Cost = $100 + $25y
E(cost) = E($100 + $25y) = $100 + $25E(y) = $100 + $25*0.20 = $100 + $5 = $105
SD(cost) = SD($100 + $25y) = SD($25y) = $25*SD(y) = $25*0.55 = $13.75
Slide 16
Addition and Subtraction Rules for Random Variables
- E(X+Y) = E(X) + E(Y)
- E(X-Y) = E(X) - E(Y)
- When X and Y are independent random variables:
1. Var(X+Y) = Var(X) + Var(Y)
2. SD(X+Y) = √(Var(X) + Var(Y)). SDs do not add: SD(X+Y) ≠ SD(X) + SD(Y)
3. Var(X-Y) = Var(X) + Var(Y)
4. SD(X-Y) = √(Var(X) + Var(Y)). SDs do not subtract: SD(X-Y) ≠ SD(X) - SD(Y), and SD(X-Y) ≠ SD(X) + SD(Y)
Slide 17
Example: rvs NOT independent
- X = number of hours a randomly selected student from our class slept between noon yesterday and noon today.
- Y = number of hours the same randomly selected student from our class was awake between noon yesterday and noon today. Y = 24 - X.
- What are the expected value and variance of the total hours that a student is asleep and awake between noon yesterday and noon today?
- Total hours that a student is asleep and awake between noon yesterday and noon today = X+Y
- E(X+Y) = E(X+24-X) = E(24) = 24
- Var(X+Y) = Var(X+24-X) = Var(24) = 0.
- We don't add Var(X) and Var(Y) since X and Y are not independent.
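A quick simulation makes the point concrete. The sleep hours below are generated from a uniform distribution between 4 and 10 hours, which is purely a hypothetical choice; any distribution for X gives the same conclusion because Y is defined as 24 - X.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical sleep hours X; awake hours Y are fully determined by X.
x = [rng.uniform(4, 10) for _ in range(1000)]
y = [24 - xi for xi in x]
total = [xi + yi for xi, yi in zip(x, y)]

var_total = statistics.pvariance(total)                      # essentially 0
naive_sum = statistics.pvariance(x) + statistics.pvariance(y)  # clearly not 0

print(var_total, naive_sum)
```

Adding the two variances would badly overstate the spread of X+Y here, because the rule Var(X+Y) = Var(X) + Var(Y) needs independence.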
Slide 18
Pythagorean Theorem of Statistics for Independent X and Y
[Figure: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X+Y); the squares on the sides have areas a² = Var(X), b² = Var(Y), c² = Var(X+Y)]
a² + b² = c²: Var(X) + Var(Y) = Var(X+Y)
a + b ≠ c: SD(X) + SD(Y) ≠ SD(X+Y)
Slide 19
Pythagorean Theorem of Statistics for Independent X and Y
[Figure: right triangle with SD(X) = 3, SD(Y) = 4, SD(X+Y) = 5; the squares on the sides have areas Var(X) = 9, Var(Y) = 16, Var(X+Y) = 25 = 9 + 16]
3² + 4² = 5²: Var(X) + Var(Y) = Var(X+Y)
3 + 4 ≠ 5: SD(X) + SD(Y) ≠ SD(X+Y)
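The 3-4-5 picture can be reproduced by simulation. This sketch assumes, for illustration only, that X and Y are independent normal variables with standard deviations 3 and 4; the normal shape is not required by the rule, only independence is.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 50_000

# Independent draws: SD(X) = 3, SD(Y) = 4 (normal shape is an assumption).
x = [rng.gauss(0, 3) for _ in range(n)]
y = [rng.gauss(0, 4) for _ in range(n)]
s = [xi + yi for xi, yi in zip(x, y)]

var_x, var_y, var_sum = (statistics.pvariance(v) for v in (x, y, s))
sd_sum = statistics.pstdev(s)

print(var_x, var_y, var_sum)  # roughly 9, 16, 25
print(sd_sum)                 # roughly 5, not 3 + 4 = 7
```

The simulated values are close to, but not exactly, 9, 16, and 25, since they are estimates from a finite number of draws.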
Slide 20
Example: meal plans
- Regular plan: X = daily amount spent
- E(X) = $13.50, SD(X) = $7
- Expected value and stan. dev. of total spent in 2 consecutive days? (assume independent)
- E(X_1 + X_2) = E(X_1) + E(X_2) = $13.50 + $13.50 = $27
- SD(X_1 + X_2) ≠ SD(X_1) + SD(X_2) = $7 + $7 = $14
- SD(X_1 + X_2) = √(Var(X_1) + Var(X_2)) = √(49 + 49) = √98 ≈ $9.90
Slide 21
Example: meal plans (cont.)
- Jumbo plan for football players: Y = daily amount spent
- E(Y) = $24.75, SD(Y) = $9.50
- Amount by which football players' spending exceeds regular student spending is Y-X
- E(Y-X) = E(Y) - E(X) = $24.75 - $13.50 = $11.25
- SD(Y-X) ≠ SD(Y) - SD(X) = $9.50 - $7 = $2.50
- Assuming X and Y are independent, SD(Y-X) = √(Var(Y) + Var(X)) = √(90.25 + 49) = √139.25 ≈ $11.80
Slide 22
For random variables, X+X ≠ 2X
- Let X be the annual payout on a life insurance policy. From mortality tables E(X) = $200 and SD(X) = $3,867.
1) If the payout amounts are doubled, what are the new expected value and standard deviation?
Double payout is 2X.
E(2X) = 2E(X) = 2*$200 = $400
SD(2X) = 2SD(X) = 2*$3,867 = $7,734
2) Suppose insurance policies are sold to 2 people. The annual payouts are X_1 and X_2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout?
E(X_1 + X_2) = E(X_1) + E(X_2) = $200 + $200 = $400
SD(X_1 + X_2) = √(Var(X_1) + Var(X_2)) = √(3,867² + 3,867²) ≈ $5,468.76
The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people: the expected payouts are equal, but the standard deviations ($7,734 vs. about $5,469) are not.
Slide 23
Estimator of population mean
- ȳ will vary from sample to sample
- What are the characteristics of this sample-to-sample behavior?
Slide 24
Numerical Summary of Sampling Distribution of ȳ
E(ȳ) = μ
Unbiased: a statistic is unbiased if it has expected value equal to the population parameter.
Slide 25
Numerical Summary of Sampling Distribution of ȳ
Slide 26
Standard Error
Standard error: square root of the estimated variance of a statistic
An important building block for statistical inference
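As one concrete instance of this definition, the sketch below computes the standard error of a sample mean. It assumes the estimated variance of ȳ is s²/n, which holds for sampling with replacement or from a very large population; a finite population correction is not included. The data values and the function name are hypothetical.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Square root of the estimated variance of the sample mean, s^2 / n.

    Assumes sampling with replacement (no finite population correction).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations to estimate variance")
    return math.sqrt(statistics.variance(sample) / n)

sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # hypothetical data
se = standard_error_of_mean(sample)
print(se)  # s^2 = 4.8 and n = 6, so se = sqrt(0.8), about 0.894
```

The word "estimated" matters: the true variance of ȳ depends on the unknown σ², so s² is substituted for it.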
Slide 27
Shape?
- We have numerical summaries of the sampling distribution of ȳ
- What about the shape of the sampling distribution of ȳ?
Slide 28
THE CENTRAL LIMIT THEOREM
The "World is Normal" Theorem
Slide 29
The Central Limit Theorem (for the sample mean ȳ)
- If a random sample of n observations is selected from a population (any population), then when n is sufficiently large, the sampling distribution of ȳ will be approximately normal. (The larger the sample size, the better will be the normal approximation to the sampling distribution of ȳ.)
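A small simulation can make the theorem tangible. The sketch below draws repeated samples from a strongly skewed population (exponential with mean 1 and SD 1, a choice made only for illustration) and checks that the sample means behave roughly like a normal distribution centered at 1 with SD 1/√n. The sample size, the number of replications, and the seed are arbitrary.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, SD 1, skewed)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

n = 50
means = [sample_mean(n) for _ in range(5000)]

center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.pstdev(means)  # should be near 1 / sqrt(n)
sd_theory = 1 / n ** 0.5

# For a normal distribution about 95% of values fall within 1.96 SDs of the mean.
within = sum(abs(m - 1.0) <= 1.96 * sd_theory for m in means) / len(means)

print(center, spread, sd_theory, within)
```

Rerunning with a smaller n (say 5) should give a visibly poorer match, in line with the remark that larger samples give a better normal approximation.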
Slide 30
The Importance of the Central Limit Theorem
- When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is approximately normal, with mean μ and standard deviation σ/√n.
Shape of population is irrelevant
Slide 31
Estimating the population total
Slide 32
Expected value
Slide 33
Estimating the population total Variance, standard deviation,
standard error
Slide 34
Finite population case
Example: sampling w/ replacement to estimate pop. total τ
Slide 35
Finite population case
Example: sampling w/ replacement to estimate pop. total τ
Sample   Prob. of sample   τ̂      V̂(τ̂)
{1, 2}   .02               15     25.0
{1, 3}   .08               35/4   1.5625
{1, 4}   .08               10     0
{2, 3}   .08               55/4   39.0625
{2, 4}   .08               15     25.0
{3, 4}   .32               35/4   1.5625
{1, 1}   .01               10     0
{2, 2}   .01               20     0
{3, 3}   .16               15/2   0
{4, 4}   .16               10     0
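The table entries can be reproduced by enumerating every possible sample. The sketch below rests on two assumptions inferred from the table rather than stated on the slide: the selection probabilities are .1, .1, .4, .4 for the items 1, 2, 3, 4 (consistent with the probabilities .01 and .16 for the repeated samples), and the estimator of the total is the average of y_i divided by its selection probability, with the variance estimator shown in the code. All function and variable names are illustrative.

```python
import math
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4]
# Assumed selection probabilities, inferred from the table.
prob = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(4, 10), 4: Fraction(4, 10)}
n = 2
tau = sum(values)  # true population total, 10

def tau_hat(sample):
    """Estimate of the total: average of y / (selection probability of y)."""
    return sum(Fraction(y) / prob[y] for y in sample) / len(sample)

def v_hat(sample):
    """Estimated variance of tau_hat from one sample of size m >= 2."""
    m, t = len(sample), tau_hat(sample)
    return sum((Fraction(y) / prob[y] - t) ** 2 for y in sample) / (m * (m - 1))

expected_tau_hat = Fraction(0)
var_tau_hat = Fraction(0)
expected_v_hat = Fraction(0)
for sample in product(values, repeat=n):        # ordered draws, with replacement
    p = math.prod(prob[y] for y in sample)      # probability of this ordered sample
    expected_tau_hat += p * tau_hat(sample)
    var_tau_hat += p * (tau_hat(sample) - tau) ** 2
    expected_v_hat += p * v_hat(sample)

print(tau_hat((1, 3)), v_hat((1, 3)))           # 35/4 and 25/16 (1.5625)
print(expected_tau_hat, var_tau_hat, expected_v_hat)
```

Under these assumptions the expected value of the estimator equals the true total of 10, and the expected value of the variance estimator equals the true variance of the estimator, so both are unbiased in this example.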
Slide 36
Finite population case
Example: sampling w/ replacement to estimate pop. total τ
From the table:
Slide 37
Finite population case
Example: sampling w/ replacement to estimate pop. total τ
Slide 38
Finite population case
Example: sampling w/ replacement to estimate pop. total τ
Example Summary
Slide 39
Finite population case Sampling w/ replacement to estimate pop.
total In general
Slide 40
Finite population case Sampling w/ replacement to estimate pop.
total
Slide 41
Finite population case
Sampling w/ replacement to estimate pop. total
In reality, we do not know the value of y_i for every item in the population, BUT we can choose the selection probability for item i proportional to a known measurement highly correlated with y_i.
Slide 42
Finite population case
Sampling w/ replacement to estimate pop. total
Example: want to estimate the total number of job openings in a city by sampling industrial firms.
- Many small firms employ few workers
- A few large firms employ many workers
- Large firms influence the number of job openings
- Large firms should have a greater chance of being in the sample, to improve the estimate of total openings
Firms can be sampled with probabilities proportional to each firm's total work force, which should be correlated with the firm's job openings.
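Sampling with probabilities proportional to work force can be sketched as follows. The firm data are entirely hypothetical, and the estimator (the average of each sampled firm's openings divided by its selection probability) is a standard choice for with-replacement sampling that is assumed here rather than quoted from the slides. The function name is illustrative.

```python
import random

def pps_total_estimate(y, size, n, rng):
    """Draw n units with replacement, with probability proportional to size,
    and return the estimate of sum(y): the mean of y_i / (selection prob. of i)."""
    total_size = sum(size)
    delta = [s / total_size for s in size]              # selection probabilities
    drawn = rng.choices(range(len(y)), weights=size, k=n)
    return sum(y[i] / delta[i] for i in drawn) / n

# Hypothetical city: four small firms and two large ones.
workforce = [20, 35, 50, 80, 1200, 2500]
openings = [1, 2, 2, 5, 60, 130]

rng = random.Random(7)  # fixed seed so the run is repeatable
estimate = pps_total_estimate(openings, workforce, n=3, rng=rng)
print(estimate, sum(openings))  # estimate vs. true total of 200
```

If openings were exactly proportional to work force, every sample would return the true total; the closer the two are to proportional, the less the estimate varies from sample to sample.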
Slide 43
Finite population case
Sampling without replacement to estimate pop. total
Thus far we have assumed a population that does not change when the first item is selected; that is, we sampled with replacement. When sampling without replacement this is not true.
Example: population {1, 2, 3, 4}; n = 2, suppose equally likely.
Prob. of selecting 3 on first draw is 1/4.
Prob. of selecting 3 on second draw depends on first draw (probability is 0 or 1/3).
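The dependence between draws can be verified by listing all ordered samples of size 2 taken without replacement from {1, 2, 3, 4}. The function name below is an illustrative choice.

```python
from fractions import Fraction
from itertools import permutations

population = [1, 2, 3, 4]
# All 12 ordered pairs of distinct items; each is equally likely.
ordered_samples = list(permutations(population, 2))

def prob_second_is_3_given_first(first):
    """P(second draw = 3 | first draw = first) when sampling without replacement."""
    matching = [s for s in ordered_samples if s[0] == first]
    hits = [s for s in matching if s[1] == 3]
    return Fraction(len(hits), len(matching))

for first in population:
    print(first, prob_second_is_3_given_first(first))  # 1/3 unless first is 3, then 0

# Unconditionally, the second draw is still 3 with probability 1/4.
marginal = Fraction(sum(s[1] == 3 for s in ordered_samples), len(ordered_samples))
print(marginal)
```

The second draw has the same marginal probability as the first, but the two draws are no longer independent.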
Slide 44
Finite population case Sampling without replacement to estimate
pop. total Worksheet