Chapter 3: Selected Basic Concepts in Statistics

Expected Value, Variance, Standard Deviation
Numerical summaries of selected statistics
Sampling distributions

Uploaded by: fritz


Expected Value

Weighted average of the possible values of y, with the probabilities p(y) as the weights:

\[ E(y) = \sum_{y} y \, p(y) \]

Not the value of y you "expect"; a long-run average.

E(y) Example 1

Toss a fair die once. Let y be the number of dots on the upper face.

y      1     2     3     4     5     6
p(y)   1/6   1/6   1/6   1/6   1/6   1/6

E(y) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5

E(y) Example 2: Green Mountain Lottery

Choose 3 digits between 0 and 9. Repeats are allowed, and the order of the digits counts. If your 3-digit number is selected, you win $500. Let y be your winnings (assume the ticket costs $0).

y      $0      $500
p(y)   0.999   0.001

E(y) = $0(0.999) + $500(0.001) = $0.50
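The two expected values above are weighted averages, so they can be checked in a few lines. This is a minimal sketch, assuming each distribution is stored as a dictionary from value to probability; the function name `expected_value` is made up for illustration.

```python
from fractions import Fraction


def expected_value(dist):
    """Return E(y) = sum of y * p(y) for a dict mapping value -> probability."""
    if sum(dist.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(y * p for y, p in dist.items())


# Example 1: one toss of a fair die.
die = {y: Fraction(1, 6) for y in range(1, 7)}

# Example 2: lottery winnings in dollars (ticket cost assumed to be $0, as in the text).
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}

print(float(expected_value(die)))      # 3.5
print(float(expected_value(lottery)))  # 0.5
```

Exact fractions are used so the probabilities sum to exactly 1 and no rounding creeps into the check.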

US Roulette Wheel and Table

The roulette wheel has alternating black and red slots numbered 1 through 36. There are also 2 green slots numbered 0 and 00. A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is, if you bet $1 on the winning number, you receive $36, so your winnings are $35.

[Figure: American roulette wheel with 0 and 00. (The European version has only one 0.)]

US Roulette Wheel: Expected Value of a $1 Bet on a Single Number

Let y be your winnings resulting from a $1 bet on a single number; y has 2 possible values.

y      -1      35
p(y)   37/38   1/38

E(y) = -1(37/38) + 35(1/38) = -2/38 ≈ -0.05

So on average the house wins about 5 cents on every such bet. A fair game would have E(y) = 0.

The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in
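The "long-run average" reading of E(y) can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: every one of the 38 slots is equally likely, slot 0 stands in for the number that was bet on, and the function name `average_winnings` and the seed are arbitrary choices.

```python
import random
from fractions import Fraction

# Exact expected winnings of a $1 bet on a single number.
p_win = Fraction(1, 38)
exact = -1 * (1 - p_win) + 35 * p_win  # equals -1/19, about -0.0526


def average_winnings(n_bets, seed=0):
    """Simulate n_bets independent $1 single-number bets; return the mean winnings."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_bets):
        # Slot 0 plays the role of "our" number among the 38 equally likely slots.
        total += 35 if rng.randrange(38) == 0 else -1
    return total / n_bets


print(float(exact))
print(average_winnings(500_000))
```

With a few hundred thousand bets the simulated average should sit close to the exact value, while any single bet is still either -1 or 35.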

Variance and Standard Deviation

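Measure spread around the middle, where the middle is measured by \( \mu = E(y) \):

\[ V(y) = \sigma^2 = \sum_{y} (y - \mu)^2 \, p(y), \qquad SD(y) = \sigma = \sqrt{V(y)} \]

Variance Example

Toss a fair die once. Let y be the number of dots on the upper face.

y      1     2     3     4     5     6
p(y)   1/6   1/6   1/6   1/6   1/6   1/6

Recall \( \mu = 3.5 \).

V(y) Example 2: Green Mountain Lottery

y      $0      $500
p(y)   0.999   0.001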
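The variances for the die and the lottery can be worked out directly from the two tables. A minimal sketch, assuming the same value-to-probability dictionaries as a representation; the helper names `mean` and `variance` are illustrative.

```python
from fractions import Fraction
from math import sqrt


def mean(dist):
    """E(y) for a dict mapping value -> probability."""
    return sum(y * p for y, p in dist.items())


def variance(dist):
    """V(y) = sum of (y - mu)^2 * p(y), the probability-weighted squared deviation."""
    mu = mean(dist)
    return sum((y - mu) ** 2 * p for y, p in dist.items())


die = {y: Fraction(1, 6) for y in range(1, 7)}
lottery = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}

print(variance(die))             # 35/12
print(float(variance(lottery)))  # 249.75
print(sqrt(variance(die)), sqrt(variance(lottery)))  # the two standard deviations
```

The lottery has a small mean ($0.50) but a standard deviation of nearly $16, because the rare $500 prize lies far from the middle.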

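Recall \( \mu = 0.50 \).

Estimators for \( \mu \), \( \sigma^2 \), \( \sigma \)

The sample mean \( \bar{y} \), the sample variance \( s^2 \), and the sample standard deviation s.

\( s^2 \): average squared deviation from the middle

Automate these calculations

Examples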
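One way to automate these calculations is Python's standard `statistics` module. The data values below are made up for illustration, and the sketch assumes the usual n - 1 divisor for the sample variance, which is what `statistics.variance` uses.

```python
import statistics

# Hypothetical sample, used only to illustrate the calculations.
y = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

y_bar = statistics.mean(y)    # estimator of mu
s2 = statistics.variance(y)   # estimator of sigma^2 (divides by n - 1)
s = statistics.stdev(y)       # estimator of sigma

# The same sample variance by hand: squared deviations from the sample mean.
n = len(y)
s2_by_hand = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

print(y_bar, s2, s, s2_by_hand)
```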

Linear Transformations of Random Variables and Sample Statistics

Random variable y with E(y) and V(y):
Linear transformation y* = a + by. What are E(y*) and V(y*) in terms of the original E(y) and V(y)?

Data \( y_1, y_2, \ldots, y_n \) with mean \( \bar{y} \) and standard deviation s:
Linear transformation y* = a + by gives new data \( y_1^*, y_2^*, \ldots, y_n^* \). What are \( \bar{y}^* \) and \( s^* \) in terms of \( \bar{y} \) and s?

Linear Transformations: Rules for E(y*), V(y*), and SD(y*)

\( E(y^*) = E(a + by) = a + bE(y) \)

\( V(y^*) = V(a + by) = b^2 V(y) \)

\( SD(y^*) = SD(a + by) = |b| \, SD(y) \)

Linear Transformations: Rules for \( \bar{y}^* \), \( s^{*2} \), and \( s^* \)

\( \bar{y}^* = a + b\bar{y} \)

\( s^{*2} = b^2 s^2 \)

\( s^* = |b| \, s \)

Expected Value and Standard Deviation of Linear Transformation a + by

Let y = number of repairs a new computer needs each year. Suppose E(y) = 0.20 and SD(y) = 0.55.
The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair.
What are the mean and standard deviation of the yearly cost of the service contract?

Cost = $100 + $25y
E(cost) = E($100 + $25y) = $100 + $25E(y) = $100 + $25(0.20) = $100 + $5 = $105
SD(cost) = SD($100 + $25y) = SD($25y) = $25 SD(y) = $25(0.55) = $13.75
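The rules can be applied mechanically and checked against transformed data. A small sketch: the helper `transform_summary` and the repair counts are hypothetical, and the absolute value on b matters only when b is negative.

```python
import statistics


def transform_summary(mean_y, sd_y, a, b):
    """Mean and SD of y* = a + b*y, given the mean and SD of y."""
    return a + b * mean_y, abs(b) * sd_y


# Service contract from the text: cost = 100 + 25 * y.
mean_cost, sd_cost = transform_summary(0.20, 0.55, a=100, b=25)
print(f"E(cost) = ${mean_cost:.2f}, SD(cost) = ${sd_cost:.2f}")

# Same rules for sample statistics, on hypothetical repair counts.
y = [0, 1, 0, 2, 0, 0, 1]
cost = [100 + 25 * yi for yi in y]
rule_mean, rule_sd = transform_summary(statistics.mean(y), statistics.stdev(y), 100, 25)
print(statistics.mean(cost), rule_mean)   # direct calculation vs. rule
print(statistics.stdev(cost), rule_sd)
```

The added constant a shifts the mean but drops out of the standard deviation, since shifting every value leaves the spread unchanged.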

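Addition and Subtraction Rules for Random Variables

E(X + Y) = E(X) + E(Y); E(X - Y) = E(X) - E(Y)

When X and Y are independent random variables:

\( Var(X + Y) = Var(X) + Var(Y) \)
\( SD(X + Y) = \sqrt{Var(X) + Var(Y)} \)
SDs do not add: \( SD(X + Y) \neq SD(X) + SD(Y) \)

\( Var(X - Y) = Var(X) + Var(Y) \)
\( SD(X - Y) = \sqrt{Var(X) + Var(Y)} \)
SDs do not subtract: \( SD(X - Y) \neq SD(X) - SD(Y) \), and \( SD(X - Y) \neq SD(X) + SD(Y) \)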
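These rules can be verified exactly on a small case. The sketch below assumes X and Y are two independent fair dice (an example chosen here for illustration, not taken from the slides) and enumerates all 36 equally likely pairs.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

faces = range(1, 7)
p_pair = Fraction(1, 36)  # independence: every (x, y) pair is equally likely


def var(values_and_probs):
    """Variance of a discrete distribution given as a list of (value, probability)."""
    mu = sum(v * q for v, q in values_and_probs)
    return sum((v - mu) ** 2 * q for v, q in values_and_probs)


var_x = var([(x, Fraction(1, 6)) for x in faces])
var_sum = var([(x + y, p_pair) for x, y in product(faces, faces)])
var_diff = var([(x - y, p_pair) for x, y in product(faces, faces)])

print(var_x, var_sum, var_diff)          # 35/12 35/6 35/6
print(sqrt(var_sum), 2 * sqrt(var_x))    # SD(X + Y) is smaller than SD(X) + SD(Y)
```

Both the sum and the difference have variance Var(X) + Var(Y); subtracting Y does not subtract its variability.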

Example: random variables NOT independent

X = number of hours a randomly selected student from our class slept between noon yesterday and noon today.
Y = number of hours the same randomly selected student from our class was awake between noon yesterday and noon today. Y = 24 - X.

What are the expected value and variance of the total hours that a student is asleep and awake between noon yesterday and noon today?

Total hours that a student is asleep and awake between noon yesterday and noon today = X + Y
E(X + Y) = E(X + 24 - X) = E(24) = 24
Var(X + Y) = Var(X + 24 - X) = Var(24) = 0

We don't add Var(X) and Var(Y) since X and Y are not independent.

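Pythagorean Theorem of Statistics for Independent X and Y

[Figure: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X + Y); the squares on the sides have areas \( a^2 = Var(X) \), \( b^2 = Var(Y) \), and \( c^2 = a^2 + b^2 = Var(X + Y) \).]

\( a^2 + b^2 = c^2 \): \( Var(X) + Var(Y) = Var(X + Y) \)
\( a + b \neq c \): \( SD(X) + SD(Y) \neq SD(X + Y) \)

Example with SD(X) = 3 and SD(Y) = 4, so that SD(X + Y) = 5:

\( 3^2 + 4^2 = 5^2 \), that is, 9 + 16 = 25: \( Var(X) + Var(Y) = Var(X + Y) \)
\( 3 + 4 \neq 5 \): \( SD(X) + SD(Y) \neq SD(X + Y) \)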

Example: meal plans

Regular plan: X = daily amount spent; E(X) = $13.50, SD(X) = $7.
What are the expected value and standard deviation of the total spent in 2 consecutive days? (Assume the two days are independent.)

E(X1 + X2) = E(X1) + E(X2) = $13.50 + $13.50 = $27
SD(X1 + X2) ≠ SD(X1) + SD(X2) = $7 + $7 = $14
\( SD(X_1 + X_2) = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{49 + 49} \approx 9.90 \), about $9.90

Example: meal plans (cont.)

Jumbo plan for football players: Y = daily amount spent; E(Y) = $24.75, SD(Y) = $9.50.
The amount by which football players' spending exceeds regular student spending is Y - X.

E(Y - X) = E(Y) - E(X) = $24.75 - $13.50 = $11.25
SD(Y - X) ≠ SD(Y) - SD(X) = $9.50 - $7 = $2.50
Treating X and Y as independent, \( SD(Y - X) = \sqrt{Var(Y) + Var(X)} = \sqrt{90.25 + 49} \approx 11.80 \), about $11.80

For random variables, X + X ≠ 2X

Let X be the annual payout on a life insurance policy. From mortality tables, E(X) = $200 and SD(X) = $3,867.
If the payout amounts are doubled, what are the new expected value and standard deviation?

The doubled payout is 2X.
E(2X) = 2E(X) = 2($200) = $400
SD(2X) = 2SD(X) = 2($3,867) = $7,734

Suppose insurance policies are sold to 2 people. The annual payouts are X1 and X2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout?

E(X1 + X2) = E(X1) + E(X2) = $200 + $200 = $400
\( SD(X_1 + X_2) = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{3867^2 + 3867^2} \approx 5468.76 \), about $5,468.76
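The standard deviations in the meal plan and insurance examples all come from the same step: add the variances, then take the square root. A minimal sketch, assuming independence in every case; the helper name `sd_of_sum_or_difference` is made up for illustration.

```python
from math import sqrt


def sd_of_sum_or_difference(sd_a, sd_b):
    """SD of A + B (or A - B) for independent A and B: variances add, SDs do not."""
    return sqrt(sd_a ** 2 + sd_b ** 2)


sd_two_days = sd_of_sum_or_difference(7.00, 7.00)        # regular plan, 2 days
sd_excess = sd_of_sum_or_difference(9.50, 7.00)          # jumbo minus regular
sd_double_payout = 2 * 3867.0                            # one policy, payout doubled
sd_two_policies = sd_of_sum_or_difference(3867.0, 3867.0)  # two independent policies

print(f"{sd_two_days:.2f}")       # 9.90
print(f"{sd_excess:.2f}")         # 11.80
print(f"{sd_double_payout:.2f}")  # 7734.00
print(f"{sd_two_policies:.2f}")   # 5468.76
```

Both insurance scenarios have the same expected payout of $400, but the doubled policy has the larger standard deviation.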

The risk to the insurance company when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.

Estimator of population mean

\( \bar{y} \) will vary from sample to sample. What are the characteristics of this sample-to-sample behavior?

Numerical Summary of Sampling Distribution of \( \bar{y} \)

Unbiased: a statistic is unbiased if it has expected value equal to the population parameter.

Standard Error

Standard error: the square root of the estimated variance of a statistic. It is an important building block for statistical inference.

Shape?

We have numerical summaries of the sampling distribution of \( \bar{y} \). What about the shape of the sampling distribution of \( \bar{y} \)?

THE CENTRAL LIMIT THEOREM
The "World is Normal" Theorem

The Central Limit Theorem (for the sample mean \( \bar{y} \))

If a random sample of n observations is selected from a population (any population), then when n is sufficiently large, the sampling distribution of \( \bar{y} \) will be approximately normal. (The larger the sample size, the better the normal approximation to the sampling distribution of \( \bar{y} \).)

The Importance of the Central Limit Theorem

When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is
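A simulation gives a feel for the sample-to-sample behavior of the sample mean. The sketch below is an illustration under assumptions chosen here, not taken from the slides: the population is Exponential with mean 1 and SD 1 (strongly right-skewed), samples have size n = 50, and the center and spread of the simulated means are compared with the standard results \( \mu \) and \( \sigma/\sqrt{n} \).

```python
import random
import statistics
from math import sqrt


def sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size n from an Exponential(1) population
    (right-skewed, mean 1, SD 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


n = 50
means = sample_means(n, reps=5000)

center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)   # should be near sigma / sqrt(n)
theory_sd = 1 / sqrt(n)

# For a normal distribution, roughly 68% of values fall within 1 SD of the center.
share_within_1sd = sum(abs(m - 1) <= theory_sd for m in means) / len(means)

print(center, spread, theory_sd, share_within_1sd)
```

A histogram of `means` would look close to bell-shaped even though the population itself is far from normal; rerunning with a smaller n makes the skew of the population show through.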

Shape of population is irrelevant

Estimating the population total

Estimating the population total: Expected value

Estimating the population total: Variance, standard deviation, standard error

Finite population case
Example: sampling w/ replacement to estimate \( \tau \)

Sample    Prob of Sample   \( \hat{\tau} \)   \( \hat{V}(\hat{\tau}) \)
{1, 2}    .02              15                 25.0
{1, 3}    .08              35/4               1.5625
{1, 4}    .08              10                 0
{2, 3}    .08              55/4               39.0625
{2, 4}    .08              15                 25.0
{3, 4}    .32              35/4               1.5625
{1, 1}    .01              10                 0
{2, 2}    .01              20                 0
{3, 3}    .16              15/2               0
{4, 4}    .16              10                 0

From the table:

\( E(\hat{\tau}) = \sum (\text{prob of sample}) \, \hat{\tau} = 10 \), which equals the population total 1 + 2 + 3 + 4, and \( V(\hat{\tau}) = \sum (\text{prob of sample}) \, (\hat{\tau} - 10)^2 = 6.25 \).

Example Summary
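The table can be rebuilt by enumerating every ordered pair of draws. This is a sketch under assumptions inferred from the table rather than stated in the transcript: the population is {1, 2, 3, 4}, the per-draw selection probabilities are 0.1, 0.1, 0.4, 0.4 (these reproduce the "Prob of Sample" column), the estimator is the average of y divided by its selection probability, and its variance estimate is the sample variance of those ratios divided by n. The names `delta`, `tau_hat`, and `v_hat` are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Per-draw selection probabilities, inferred from the table (assumption).
delta = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(4, 10), 4: Fraction(4, 10)}
n = 2
tau = sum(delta.keys())  # population total: 1 + 2 + 3 + 4 = 10


def tau_hat(sample):
    """Estimate of the total: average of y / (selection probability of y)."""
    return sum(y / delta[y] for y in sample) / len(sample)


def v_hat(sample):
    """Estimated variance of tau_hat from a single sample."""
    m = len(sample)
    t = tau_hat(sample)
    return sum((y / delta[y] - t) ** 2 for y in sample) / (m * (m - 1))


e_tau_hat = v_tau_hat = e_v_hat = 0
for sample in product(delta, repeat=n):           # ordered draws, with replacement
    prob = prod(delta[y] for y in sample)
    e_tau_hat += prob * tau_hat(sample)
    v_tau_hat += prob * (tau_hat(sample) - tau) ** 2
    e_v_hat += prob * v_hat(sample)

print(e_tau_hat, v_tau_hat, e_v_hat)  # 10 25/4 25/4
```

The table lists unordered samples, so a row such as {1, 2} combines the ordered draws (1, 2) and (2, 1), which is why its probability is 2(0.1)(0.1) = .02.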

Finite population case
Sampling w/ replacement to estimate pop. total: in general

In reality, we do not know the value of \( y_i \) for every item in the population.

BUT we can choose the selection probability of item i to be proportional to a known measurement highly correlated with \( y_i \).
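The effect of the choice of selection probabilities can be seen on the small population {1, 2, 3, 4}. The sketch below uses the standard variance formula for the with-replacement estimator, (1/n) times the sum of \( \delta_i (y_i/\delta_i - \tau)^2 \), where \( \delta_i \) denotes the selection probability of item i (a symbol assumed here). The auxiliary measurement `x` is hypothetical.

```python
from fractions import Fraction


def variance_of_total_estimator(y, delta, n):
    """Exact variance of the with-replacement estimator of the population total
    when item i is selected with probability delta[i] on each of n draws."""
    tau = sum(y)
    return sum(d * (yi / d - tau) ** 2 for yi, d in zip(y, delta)) / n


def proportional_to(sizes):
    """Selection probabilities proportional to a known size measurement."""
    total = sum(sizes)
    return [Fraction(s, total) for s in sizes]


y = [1, 2, 3, 4]
x = [1, 1, 4, 4]   # hypothetical known measurement, correlated with y

equal = variance_of_total_estimator(y, proportional_to([1, 1, 1, 1]), n=2)
by_x = variance_of_total_estimator(y, proportional_to(x), n=2)
by_y = variance_of_total_estimator(y, proportional_to(y), n=2)

print(equal, by_x, by_y)  # 10 25/4 0
```

Probabilities exactly proportional to y would give zero variance, but that requires knowing every y; a correlated measurement x captures part of the gain.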

Finite population case
Sampling without replacement to estimate pop. total

Thus far we have assumed a population that does not change when the first item is selected; that is, we sampled with replacement. When sampling without replacement this is not true.

Example: population {1, 2, 3, 4}; n = 2; suppose all items are equally likely to be selected.
Prob. of selecting 3 on the first draw is 1/4.
Prob. of selecting 3 on the second draw depends on the first draw (the probability is 0 or 1/3).
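The dependence between draws can be checked by listing every ordered sample of size 2 drawn without replacement. A minimal sketch, assuming all 12 ordered samples are equally likely; the function name `p_second_given_first` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

population = [1, 2, 3, 4]
ordered_samples = list(permutations(population, 2))  # 12 ordered draws, no repeats
p_each = Fraction(1, len(ordered_samples))

# Probability that 3 is selected on the first draw.
p_first_is_3 = sum(p_each for first, _ in ordered_samples if first == 3)


def p_second_given_first(second, first):
    """P(second draw = `second` | first draw = `first`) without replacement."""
    matching = [s for s in ordered_samples if s[0] == first]
    hits = sum(1 for s in matching if s[1] == second)
    return Fraction(hits, len(matching))


print(p_first_is_3)                # 1/4
print(p_second_given_first(3, 3))  # 0
print(p_second_given_first(3, 1))  # 1/3
```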

Worksheet

End of Chapter 3