Statistical inference: confidence intervals and hypothesis testing

Objective

The objective of this session is to cover inferential statistics, sampling theory, estimation and confidence intervals, and hypothesis testing.

Statistical analysis

Descriptive: calculate various types of descriptive statistics in order to summarize certain qualities of the data.

Inferential: use information gained from the descriptive statistics of sample data to generalize to the characteristics of the whole population.

Inferential statistics: application

There are 2 broad areas.

Estimation: create confidence intervals to estimate the true population parameter.

Hypothesis testing: test the hypothesis that the population parameter has a specified value or lies in a specified range.

Population and sample

Population: mean $\mu$, standard deviation $\sigma$

Sample: mean $\bar{x}$, standard deviation $s$

Sampling theory

When working with samples of data, we have to rely on sampling theory to give us the probability distribution pertaining to a particular sample statistic.

This probability distribution is known as the sampling distribution

Sampling distributions

Assume there is a population of size N = 4 (individuals A, B, C, and D). The random variable X is the age of an individual, measured in years. Values of X: 18, 20, 22, 24.

Summary measures for the population distribution:

$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$

$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{5} \approx 2.236$

[Figure: population distribution P(X) for A (18), B (20), C (22), D (24)]
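The population summary measures for this four-person example can be checked with a short sketch. It uses only Python's standard `statistics` module; the variable names are illustrative and not from the slides.

```python
from statistics import mean, pstdev

ages = [18, 20, 22, 24]  # values of X for individuals A, B, C, D

mu = mean(ages)       # population mean
sigma = pstdev(ages)  # population standard deviation (divides by N, not N - 1)

print(f"mu = {mu}, sigma = {sigma:.3f}")
```

`pstdev` is used rather than `stdev` because the four values are treated as the whole population, not as a sample.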

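Sampling distributions

Summary measures of the sampling distribution

Properties of summary measures:

Sampling distribution of the sample arithmetic mean: the mean of the sample means equals the population mean, $\mu_{\bar{x}} = \mu$.

Standard deviation of the sample means (the standard error): $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ when sampling with replacement.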
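One way to see these two properties is to enumerate every possible sample from the four-value population and summarize the resulting sample means. The sketch below assumes samples of size n = 2 drawn with replacement; that sample size is an assumption made for illustration, since the slides do not state one.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [18, 20, 22, 24]
n = 2  # sample size: an assumption for illustration, not given in the slides

# All ordered samples of size n drawn with replacement (4**2 = 16 of them)
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)   # should match the population mean
se_of_means = pstdev(sample_means)   # should match sigma / sqrt(n)

print(mean_of_means, mean(population))
print(se_of_means, pstdev(population) / sqrt(n))
```

Each printed pair should agree, which is the numerical counterpart of the two properties stated above.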

Estimation and confidence intervals

Estimation of the population parameters uses point estimates and confidence intervals (interval estimators).

Confidence intervals can be built for means and for variances.

Is the sample large or small?

Confidence intervals for means: large samples (n >= 30)

For large samples, apply the Z-distribution (standard normal).

For a normally distributed variable, 95% of the observations lie within plus or minus 1.96 standard deviations of the mean.

The confidence interval is given as

$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$

where $s / \sqrt{n}$ is the standard error (SE) of the sample mean and $z_{\alpha/2} = 1.96$ for 95% confidence.

[Figure: probability distribution showing the 95% confidence interval from -1.96 SE to +1.96 SE, with 2.5% in each tail]

Thus, we can state that the sample mean will lie within plus or minus 1.96 standard errors of the population mean 95% of the time.

Confidence intervals for means: large samples (n >= 30)

Example: we have data on 60 monthly observations of the returns to the SET 100 index. The sample mean monthly return is 1.125% with a standard deviation of 2.5%. What is the 95% confidence interval for the mean?

Confidence intervals for means: large samples (n >= 30)

Example (contd)

The standard error is calculated as

$SE = \frac{s}{\sqrt{n}} = \frac{2.5}{\sqrt{60}} \approx 0.3227$

The confidence interval would be

$1.125 \pm 1.96 \times 0.3227 = (0.4924, 1.7576)$

The probability statement would be

$P(0.4924 \le \mu \le 1.7576) = 0.95$

How does the analyst use this information?
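The SET 100 example can be reproduced with a minimal sketch that uses the standard library's `NormalDist` to obtain the z critical value instead of a printed table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 60, 1.125, 2.5  # observations, sample mean (%), sample std dev (%)
confidence = 0.95

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail, about 1.96
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

se = s / sqrt(n)  # standard error of the sample mean
lower, upper = x_bar - z * se, x_bar + z * se

print(f"SE = {se:.4f}, 95% CI = ({lower:.4f}%, {upper:.4f}%)")
```

Changing `confidence` to 0.90 or 0.99 gives the corresponding narrower or wider interval.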

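Confidence intervals for means: small samples (n < 30)

For small samples, apply the t-distribution with $n - 1$ degrees of freedom.

[Figure: probability distribution]

The confidence interval becomes

$\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$

The probability statement pertaining to this confidence interval is

$P\left(\bar{x} - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}\right) = 1 - \alpha$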

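Confidence intervals for means

Example: from 20 observations, the sample mean is calculated as 4.5%. The sample standard deviation is 5%. At the 95% level of confidence, with 19 degrees of freedom the critical value is $t_{0.025,\,19} = 2.093$.

The confidence interval is

$4.5 \pm 2.093 \times \frac{5}{\sqrt{20}} = 4.5 \pm 2.34 = (2.16, 6.84)$

The probability statement is

$P(2.16 \le \mu \le 6.84) = 0.95$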
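The small-sample example follows the same steps with a t critical value. The sketch below takes the value 2.093 from a standard t table (19 degrees of freedom, 2.5% in each tail) rather than computing it, because Python's standard library has no t-distribution.

```python
from math import sqrt

n, x_bar, s = 20, 4.5, 5.0  # observations, sample mean (%), sample std dev (%)
t_crit = 2.093              # t table value: df = n - 1 = 19, 2.5% in each tail

se = s / sqrt(n)            # standard error of the sample mean
margin = t_crit * se
lower, upper = x_bar - margin, x_bar + margin

print(f"95% CI = ({lower:.2f}%, {upper:.2f}%)")
```

The interval is wider than a z-based one would be for the same data, reflecting the extra uncertainty in a small sample.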

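Confidence intervals for variances

Apply a chi-square ($\chi^2$) distribution with $n - 1$ degrees of freedom.

The confidence interval is given as

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$

where $\chi^2_{\alpha/2}$ leaves $\alpha/2$ in the upper tail and $\chi^2_{1-\alpha/2}$ leaves $\alpha/2$ in the lower tail.

The probability statement pertaining to this confidence interval is

$P\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right) = 1 - \alpha$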

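Confidence intervals for variances

Example: from a sample of 30 monthly observations, the variance of the FTSE 100 index is 0.0225. With n - 1 = 29 degrees of freedom and 2.5% in each tail, the table values are $\chi^2_{0.025} = 45.722$ and $\chi^2_{0.975} = 16.047$.

The confidence interval is

$\frac{29 \times 0.0225}{45.722} \le \sigma^2 \le \frac{29 \times 0.0225}{16.047}$, that is, $0.0143 \le \sigma^2 \le 0.0407$

The probability statement is

$P(0.0143 \le \sigma^2 \le 0.0407) = 0.95$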
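The variance interval for the FTSE 100 example can be sketched as follows. The two chi-square values are taken from a standard table for 29 degrees of freedom (an input assumed here, not computed), and the variable names are illustrative.

```python
n, s2 = 30, 0.0225  # observations, sample variance
df = n - 1          # degrees of freedom

# Chi-square table values for df = 29 with 2.5% in each tail
chi2_upper_tail = 45.722  # leaves 2.5% in the upper tail
chi2_lower_tail = 16.047  # leaves 2.5% in the lower tail

# Dividing by the larger chi-square value gives the lower bound, and vice versa
lower = df * s2 / chi2_upper_tail
upper = df * s2 / chi2_lower_tail

print(f"95% CI for the variance = ({lower:.4f}, {upper:.4f})")
```

Note that the interval is not symmetric around the sample variance, because the chi-square distribution is skewed.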

Hypothesis testing

A hypothesis is an assumption about the value of a population parameter of the probability distribution under consideration.

There are 2 broad approaches: the classical approach and the p-value approach.

Hypothesis testing

When testing, 2 hypotheses are established: the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).

The exact formulation of the hypotheses depends upon what we are trying to establish. For example, if we wish to know whether or not a population parameter, $\theta$, has a value of $\theta_0$, the hypotheses would be

$H_0: \theta = \theta_0$ versus $H_1: \theta \ne \theta_0$

If we wish to know whether or not a population parameter, $\theta$, is greater than a given figure $\theta_0$, the hypotheses would then be

$H_0: \theta \le \theta_0$ versus $H_1: \theta > \theta_0$

And if we wish to know whether or not a population parameter is less than a given figure $\theta_0$, the hypotheses would then be

$H_0: \theta \ge \theta_0$ versus $H_1: \theta < \theta_0$

The standardized test statistic

In hypothesis testing we have to standardize the test statistic so that a meaningful comparison can be made with the standard normal (z) distribution, the t-distribution, or the chi-square ($\chi^2$) distribution.

The hypothesis test may be a one-tailed test or a two-tailed test, and it may concern the mean or the variance.

Hypothesis test of the population mean: two-tailed test of the mean

Set up the hypotheses as

$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$

Decide on the level of significance for the test (10%, 5%, 1%, etc.), which puts 5%, 2.5%, or 0.5% in each tail. Set the value of $\mu_0$ in the null hypothesis. Identify the appropriate critical value of z (or t) from the tables, reflecting the percentages in the tails according to the level of significance chosen.

Compute the standardized test statistic $z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ and apply the following decision rule:

Accept (do not reject) $H_0$ if $-z_{\alpha/2} \le z \le z_{\alpha/2}$

Reject $H_0$ otherwise

Hypothesis test of the population mean

Example: consider a test of whether or not the mean of a portfolio manager's monthly returns, 2.3%, is statistically significantly different from the industry average of 2.4%. The sample has 36 observations with a standard deviation of 1.7%.
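A minimal sketch of the two-tailed test for the portfolio manager example is given below. It assumes a 5% level of significance, which the example does not state, and the helper name `z_statistic` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, s, n):
    """Standardized test statistic for a large-sample test of the mean."""
    return (x_bar - mu0) / (s / sqrt(n))

alpha = 0.05                                # assumed level of significance
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96, 2.5% in each tail

z = z_statistic(x_bar=2.3, mu0=2.4, s=1.7, n=36)
reject_h0 = abs(z) > z_crit                 # two-tailed decision rule

print(f"z = {z:.4f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic falls well inside the acceptance region, so the manager's mean return is not significantly different from the industry average.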

Hypothesis test of the population mean

Example: an analyst claims that the average annual rate of return generated by a technical stock selection service is 15% and recommends that his firm use the service as an input for its research product. The analyst's supervisor is skeptical of this claim and decides to test its accuracy by randomly selecting 16 stocks covered by the service and computing the rate of return that would have been earned by following the service's recommendations with regard to them over the previous 10-year period. The results of this sample are as follows:

The average annual rate of return produced by following the service's advice on the 16 sample stocks over the past 10 years was 11%.

The standard deviation of these sample results was 9%.

Determine whether the analyst's claim should be accepted or rejected at the 5% level of significance.
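Because the sample has only 16 observations, the analyst example calls for a t test. The sketch below treats it as a two-tailed test of $H_0: \mu = 15$ (an assumption about how the question is framed) and takes the critical value 2.131 from a standard t table for 15 degrees of freedom with 2.5% in each tail.

```python
from math import sqrt

n, x_bar, s, mu0 = 16, 11.0, 9.0, 15.0  # sample size, sample mean, std dev, claimed mean (%)

t_stat = (x_bar - mu0) / (s / sqrt(n))  # standardized test statistic
t_crit = 2.131                          # t table value: df = 15, 2.5% in each tail

reject_h0 = abs(t_stat) > t_crit        # two-tailed decision rule

print(f"t = {t_stat:.4f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

With these inputs the statistic lies inside the acceptance region, so the sample does not give enough evidence to reject the 15% claim at the 5% level.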

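Hypothesis test of the population mean: one-tailed test of the mean (right-tailed test)

Set up the hypotheses as

$H_0: \mu \le \mu_0$ versus $H_1: \mu > \mu_0$

Apply the following decision rule:

Accept (do not reject) $H_0$ if $z \le z_{\alpha}$

Reject $H_0$ if $z > z_{\alpha}$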

Hypothesis test of the population mean

Example: suppose we wish to test whether the mean monthly return on the FTSE 100 index for a given period is more than 1.2%. From 60 observations we calculate the mean as 1.25% and the standard deviation as 2.5%.

Example: suppose we wish to test whether the mean monthly return on the S&P 500 index is less than 1.30%. Assume that the mean return from 75 observations is 1.18%, with a standard deviation of 2.2%.
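The two one-tailed examples can be worked side by side. This sketch assumes a 5% level of significance, which neither example states, so the single-tail critical value is about 1.645; the helper name `z_statistic` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, s, n):
    """Standardized test statistic for a large-sample test of the mean."""
    return (x_bar - mu0) / (s / sqrt(n))

alpha = 0.05                              # assumed level of significance
z_alpha = NormalDist().inv_cdf(1 - alpha)  # about 1.645, all 5% in one tail

# Right-tailed test (FTSE 100): H0: mu <= 1.2, H1: mu > 1.2
z_right = z_statistic(1.25, 1.20, 2.5, 60)
reject_right = z_right > z_alpha

# Left-tailed test (S&P 500): H0: mu >= 1.30, H1: mu < 1.30
z_left = z_statistic(1.18, 1.30, 2.2, 75)
reject_left = z_left < -z_alpha

print(f"right-tailed: z = {z_right:.4f}, reject H0: {reject_right}")
print(f"left-tailed:  z = {z_left:.4f}, reject H0: {reject_left}")
```

In both cases the statistic does not reach the critical value, so under the assumed 5% level neither null hypothesis is rejected.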

Hypothesis test of the population mean: summary of decision rules

Two-tailed test: accept (do not reject) $H_0$ if $-z_{\alpha/2} \le z \le z_{\alpha/2}$; reject $H_0$ otherwise.

One-tailed test: accept (do not reject) $H_0$ if $z \le z_{\alpha}$; reject $H_0$ if $z > z_{\alpha}$.

Is this one-tailed rule for a left-tailed or a right-tailed test? How about the other?

Hypothesis testing of the variance: two-tailed test

The standardized test statistic for the population variance is

$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$

This standardized test statistic has a chi-square ($\chi^2$) distribution with $n - 1$ degrees of freedom.

Hypothesis testing of the variance

Example: suppose we wish to test whether the variance of share B is below 25. The sample variance is 23 and the number of observations is 40.
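A rough sketch of the share B example follows. It treats the question as a left-tailed test ($H_0: \sigma^2 \ge 25$, $H_1: \sigma^2 < 25$) at an assumed 5% level of significance. Because the standard library has no chi-square distribution, the critical value is estimated with the Wilson-Hilferty approximation, which is a substitute for a table lookup and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

n, s2, sigma0_sq = 40, 23.0, 25.0  # observations, sample variance, hypothesized variance
df = n - 1

chi2_stat = df * s2 / sigma0_sq    # standardized test statistic

def chi2_quantile_approx(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile at lower-tail probability p."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3

alpha = 0.05                               # assumed level of significance
chi2_crit = chi2_quantile_approx(alpha, df)  # approximate lower-tail critical value

reject_h0 = chi2_stat < chi2_crit          # left-tailed decision rule

print(f"chi2 = {chi2_stat:.2f}, approx critical value = {chi2_crit:.2f}, reject H0: {reject_h0}")
```

For an exact critical value, a printed chi-square table or a statistics library would replace `chi2_quantile_approx`.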

The p-value method of hypothesis testing

The p-value is the lowest level of significance at which the null hypothesis is rejected.

If the p-value $\ge$ the level of significance ($\alpha$), accept (do not reject) the null hypothesis.

If the p-value < the level of significance ($\alpha$), reject the null hypothesis.

Calculating the p-value

Suppose we wish to find an investment that gives a return of at least 13.2%. Assume that the mean annualized monthly return of a given bond index is 14.4% and the sample standard deviation of those returns is 2.915%. There were 30 observations and the returns are normally distributed.

The test statistic is:

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{14.4 - 13.2}{2.915 / \sqrt{30}} \approx 2.255$

With 29 degrees of freedom, a t-value of 2.045 leaves 2.5% in the tail and a t-value of 2.462 leaves 1% in the tail.

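Calculating the p-value

Calculate the p-value by interpolation. The test statistic lies about halfway (0.50) between the two tabulated t-values, so

P-value = 0.025 - (0.50 x (0.025 - 0.01)) = 0.0175 = 1.75%

P-value (1.75%) < $\alpha$ (5%), thus reject the null hypothesis.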
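The interpolation can be written out as a short sketch. It uses the two t table entries quoted for 29 degrees of freedom and interpolates linearly between their tail probabilities, which is an approximation since the tail probability is not exactly linear in t.

```python
from math import sqrt

x_bar, mu0, s, n = 14.4, 13.2, 2.915, 30  # sample mean, target return, std dev (%), observations

t_stat = (x_bar - mu0) / (s / sqrt(n))    # standardized test statistic

# t table entries for df = 29: (t-value, probability left in the upper tail)
t_low, p_low = 2.045, 0.025
t_high, p_high = 2.462, 0.010

# Fraction of the way from t_low to t_high at which the test statistic falls
fraction = (t_stat - t_low) / (t_high - t_low)
p_value = p_low - fraction * (p_low - p_high)

alpha = 0.05
print(f"t = {t_stat:.3f}, fraction = {fraction:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The unrounded fraction is slightly above 0.50, so the computed p-value differs from 1.75% only in the fourth decimal place.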

Conclusion

This session covered the meaning of statistical inference, sampling theory, and the application of statistical inference: estimation, confidence intervals, and hypothesis testing.