
Stat Final


Question 01:

Given that,

n = 15

a) The mean for the number of cars per household is,

Mean = ∑xi / n

= (1+3+0+2+0+2+2+0+2+2+3+2+1+1+2) / 15

= 23 / 15 = 1.533

The mean for the household disposable income (data in thousands of dollars) is,

Mean = ∑xi / n

= (100+100+30+50+30+30+100+30+100+50+100+50+50+30+50) / 15

= 900 / 15 = 60, that is, $60000

To find the median, arrange the data from lowest to highest:

Cars: 0,0,0,1,1,1,2,2,2,2,2,2,2,3,3.

Household income (thousands of dollars): 30,30,30,30,30,50,50,50,50,50,100,100,100,100,100

The median number of cars is 2 (the 8th of the 15 ordered values).

The median household disposable income is $50000.

The mode for the number of cars is 2, which is reported by 7 households, more than any other value.

The household disposable income has no single mode: $30000, $50000, and $100000 each occur 5 times.

Having found the mean, median, and mode, I think the median does the best job of describing the central tendency of each variable. It marks the middle of the ordered data, and most of the observations in each series lie close to it.
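The three measures above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration, and income is entered in thousands of dollars as in the worked answer.

```python
import statistics

# Data copied from the worked answer (income in thousands of dollars).
cars = [1, 3, 0, 2, 0, 2, 2, 0, 2, 2, 3, 2, 1, 1, 2]
income = [100, 100, 30, 50, 30, 30, 100, 30, 100, 50, 100, 50, 50, 30, 50]

cars_mean = statistics.mean(cars)          # 23 / 15
cars_median = statistics.median(cars)      # 8th of the 15 sorted values
cars_modes = statistics.multimode(cars)    # every value tied for most frequent

income_mean = statistics.mean(income)      # 900 / 15
income_median = statistics.median(income)
income_modes = statistics.multimode(income)

print(f"cars:   mean={cars_mean:.3f} median={cars_median} modes={cars_modes}")
print(f"income: mean={income_mean} median={income_median} modes={sorted(income_modes)}")
```

`multimode` is used rather than `mode` because the income data has three values tied for most frequent, and `multimode` returns all of them.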

b) Range:

To calculate the range I find the highest and lowest values. For the number of cars the highest and lowest values are 3 and 0. For household income the highest and lowest values are $100000 and $30000. So the ranges are

Cars = 3 − 0 = 3

Household income = $100000 − $30000 = $70000
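As a quick check, the range is the maximum minus the minimum of each series. A minimal sketch, with income again in thousands of dollars and variable names chosen for illustration:

```python
# Data copied from the worked answer (income in thousands of dollars).
cars = [1, 3, 0, 2, 0, 2, 2, 0, 2, 2, 3, 2, 1, 1, 2]
income = [100, 100, 30, 50, 30, 30, 100, 30, 100, 50, 100, 50, 50, 30, 50]

# Range = highest value minus lowest value.
cars_range = max(cars) - min(cars)
income_range = max(income) - min(income)

print(f"cars range = {cars_range}, income range = {income_range} (thousand dollars)")
```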

Variance and standard deviation:

Here, for the number of cars, x̄ = 1.533

Now,

Number of cars

xi      x̄        (xi − x̄)   (xi − x̄)²
1       1.533    -0.533     0.284089
3       1.533     1.467     2.152089
0       1.533    -1.533     2.350089
2       1.533     0.467     0.218089
0       1.533    -1.533     2.350089
2       1.533     0.467     0.218089
2       1.533     0.467     0.218089
0       1.533    -1.533     2.350089
2       1.533     0.467     0.218089
2       1.533     0.467     0.218089
3       1.533     1.467     2.152089
2       1.533     0.467     0.218089
1       1.533    -0.533     0.284089
1       1.533    -0.533     0.284089
2       1.533     0.467     0.218089
Total
23      22.995    0.005     13.73334

Variance, s² = ∑(xi − x̄)² / (n − 1)

= 13.733 / (15 − 1)

= 13.733 / 14

= 0.9809

Standard deviation, s = √s²

= √0.9809

= 0.990
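The deviation table and the division by n − 1 can be reproduced step by step. The following is a sketch under the same data; it uses the unrounded mean, so the sum of squares differs from the table only in the fifth decimal place.

```python
import math

cars = [1, 3, 0, 2, 0, 2, 2, 0, 2, 2, 3, 2, 1, 1, 2]

n = len(cars)
mean = sum(cars) / n                      # x-bar
deviations = [x - mean for x in cars]     # (xi - x-bar) column
sum_sq = sum(d * d for d in deviations)   # sum of the squared-deviation column

variance = sum_sq / (n - 1)               # sample variance, divisor n - 1
std_dev = math.sqrt(variance)

print(f"sum of squares = {sum_sq:.4f}")
print(f"variance = {variance:.4f}, standard deviation = {std_dev:.3f}")
```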

Here, for household income, x̄ = $60000 (entered as 60, in thousands of dollars).

Now,

Household income (thousands of dollars)

xi      x̄      (xi − x̄)   (xi − x̄)²
100     60      40        1600
100     60      40        1600
30      60     -30         900
50      60     -10         100
30      60     -30         900
30      60     -30         900
100     60      40        1600
30      60     -30         900
100     60      40        1600
50      60     -10         100
100     60      40        1600
50      60     -10         100
50      60     -10         100
30      60     -30         900
50      60     -10         100
Total
900     900      0       13000

Variance, s² = ∑(xi − x̄)² / (n − 1)

= 13000 / (15 − 1)

= 13000 / 14

= 928.57

Standard deviation, s = √s²

= √928.57

= $30.47 (thousand)
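The income figures can be cross-checked with the standard library, which applies the same n − 1 divisor. A minimal sketch, with income in thousands of dollars:

```python
import statistics

income = [100, 100, 30, 50, 30, 30, 100, 30, 100, 50, 100, 50, 50, 30, 50]

# statistics.variance and statistics.stdev are the sample versions (divisor n - 1).
income_var = statistics.variance(income)   # 13000 / 14
income_sd = statistics.stdev(income)

print(f"variance = {income_var:.2f}, standard deviation = {income_sd:.2f} (thousand dollars)")
```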

Because the sample is small (n = 15) and the population standard deviation is unknown, the 95% confidence intervals are based on the t distribution.

Now,

df = 15 – 1

= 14

The 95% confidence interval for the population mean number of cars is

x̄ ± t(α/2) * s / √n

= 1.533 ± t(0.05/2) * 0.99 / √15

= 1.533 ± 2.145 * 0.2556169 [t value from the t table, df = 14]

= 2.081 (upper bound) or 0.985 (lower bound)

Again,

The 95% confidence interval for the population mean household income (in thousands of dollars) is

x̄ ± t(α/2) * s / √n

= 60 ± t(0.05/2) * 30.47 / √15

= 60 ± 2.145 * 7.867320171 [t value from the t table, df = 14]

= 76.88 (upper bound) or 43.12 (lower bound)
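Both intervals follow the same pattern, mean ± t * s / √n, which a small helper makes explicit. This is a sketch only: the function name `t_interval` is hypothetical, and the critical value 2.145 is hard-coded from the t table (df = 14) as in the worked answer rather than computed.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

T_CRIT_95_DF14 = 2.145  # two-tailed 95% value for df = 14, read from a t table

cars_ci = t_interval(1.533, 0.99, 15, T_CRIT_95_DF14)
income_ci = t_interval(60, 30.47, 15, T_CRIT_95_DF14)  # thousands of dollars

print(f"cars:   {cars_ci[0]:.3f} to {cars_ci[1]:.3f}")
print(f"income: {income_ci[0]:.2f} to {income_ci[1]:.2f}")
```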


Broader study:

Given that,

Sample mean of disposable income, x̄ = $60000

Sample size, n = 196

Population mean, µ = 42500

Standard deviation, σ = $ 3000

Here the population standard deviation is given, so I can use the z statistic to find the confidence interval. The z statistic can also be used to test the hypothesis that the mean level of income in the Denver suburb of Genesee, Colorado, is the same as that of the population.

The 95% confidence interval is

x̄ ± z(α/2) * σ / √n

= $60000 ± z(0.05/2) * $3000 / √196

= $60000 ± 1.96 * 214.2857 [z value from the z table]

= 60420 (upper bound) or 59580 (lower bound)
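The z-based interval can be reproduced without a printed table. A minimal sketch using `statistics.NormalDist` from the Python standard library to obtain the critical value; the variable names are chosen for illustration.

```python
import math
from statistics import NormalDist

x_bar, sigma, n = 60000, 3000, 196

# Two-tailed 95% critical value of the standard normal (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

margin = z_crit * sigma / math.sqrt(n)   # 1.96 * 214.2857
lower, upper = x_bar - margin, x_bar + margin

print(f"z = {z_crit:.2f}, interval = {lower:.0f} to {upper:.0f}")
```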

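The hypothesis to be tested is that the mean income for the Denver area equals that of the overall population, H0: µ = $42500, with σ = $3000.

Now, z = (x̄ − µ) / (σ / √n)

= ($60000 − $42500) / ($3000 / √196)

= 17500 / 214.2857

= 81.67

This far exceeds the critical value of 1.96 at the 5% significance level, so the null hypothesis can be rejected.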
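The test statistic and the rejection decision can be sketched as follows. The 5% two-tailed significance level is an assumption made here for illustration; the worked answer does not state the level it used for the test.

```python
import math
from statistics import NormalDist

x_bar, mu_0, sigma, n = 60000, 42500, 3000, 196

# z = (sample mean - hypothesized mean) / standard error of the mean
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Two-tailed p-value; for a z this large it underflows to 0.0 in floating point.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Assumed 5% two-tailed test: reject H0 when |z| exceeds about 1.96.
reject_h0 = abs(z) > NormalDist().inv_cdf(0.975)

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```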


Question 02:

A) Given that,

Sales = $20.065 + $6.062 R&D

R2 = 99.8%

SEE= 233.75

F= 8460.40

The regression model with sales revenue as the dependent variable (Y) and R&D expenditure as the independent variable (X) yields Sales = $20.065 + $6.062 R&D.

If R&D expenditure is 0, estimated sales are $20.065. The estimated coefficient for R&D expenditure is $6.062, so every $1 increase in R&D expenditure is associated with a $6.062 increase in total sales.

Now, R2 = 99.8% is the coefficient of determination. It indicates that variation in R&D expenditure explains 99.8% of the variation in the dependent variable Y (sales revenue).

Note that F = 8460.40 implies that the variation in R&D spending explains a statistically significant share of the total variation in firm sales. This suggests that R&D expenditures are a key determinant of sales in the computer software industry, as one might expect.

The standard error of the Y estimate, SEE = $233.75, is the average amount of error incurred in estimating the level of sales for any given level of R&D expenditure. If the errors are normally distributed about the regression equation, as would be true when large samples are analyzed, there is a 95% probability that observations of the dependent variable will lie within the range from Yi – (1.96 * SEE) to Yi + (1.96 * SEE), or within roughly two standard errors of the estimate. The probability is 99% that observations of the dependent variable will lie within the range from Yi – (2.576 * SEE) to Yi + (2.576 * SEE), or within roughly three standard errors of the estimate.

For this regression the t statistic applies, with df = 15 – 2 = 13. The t value is 2.160 at the 95% confidence level and 3.012 at the 99% confidence level. That means actual sales Yt can be expected to lie in the range from Yi – (2.160 * 233.75) to Yi + (2.160 * 233.75), or from Yi – 504.90 to Yi + 504.90, with 95% confidence. For the 99% confidence level the range is from Yi – (3.012 * 233.75) to Yi + (3.012 * 233.75), or from Yi – 704.055 to Yi + 704.055.
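To make the interval arithmetic concrete, the sketch below evaluates the fitted sales equation at one R&D level and attaches the ± t * SEE bands. The function names and the R&D value of 100 are hypothetical, chosen only for illustration. The band is the simple approximation used in the answer; it ignores the extra widening that an exact prediction interval has away from the mean of X.

```python
# Fitted model and statistics as reported in the answer.
INTERCEPT, SLOPE, SEE = 20.065, 6.062, 233.75
T_95_DF13, T_99_DF13 = 2.160, 3.012   # t-table values for df = 15 - 2 = 13

def predict_sales(rd):
    """Fitted sales for a given R&D expenditure."""
    return INTERCEPT + SLOPE * rd

def approx_band(y_hat, t_crit, see=SEE):
    """Approximate range y_hat ± t_crit * SEE."""
    half_width = t_crit * see
    return y_hat - half_width, y_hat + half_width

rd_example = 100                       # hypothetical R&D level, not from the source
y_hat = predict_sales(rd_example)
band_95 = approx_band(y_hat, T_95_DF13)
band_99 = approx_band(y_hat, T_99_DF13)

print(f"fitted sales = {y_hat:.3f}")
print(f"95% band: {band_95[0]:.3f} to {band_95[1]:.3f}")
print(f"99% band: {band_99[0]:.3f} to {band_99[1]:.3f}")
```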

B) Given that,

Profits = $210.31 + $2.538 R&D

R2 = 99.3%

SEE= 201.30

F= 1999.90

The regression model with profits as the dependent variable (Y) and R&D expenditure as the independent variable (X) yields Profits = $210.31 + $2.538 R&D.

If R&D expenditure is 0, estimated net income is $210.31. The estimated coefficient for R&D expenditure is $2.538, so every $1 increase in R&D expenditure is associated with a $2.538 increase in net income.

Now, R2 = 99.3% is the coefficient of determination. It indicates that variation in R&D expenditure explains 99.3% of the variation in the dependent variable Y (profits).

Note that F = 1999.90 implies that the variation in R&D spending explains a statistically significant share of the total variation in firm net income. This suggests that R&D expenditures are a key determinant of net income in the computer software industry, as one might expect.

The standard error of the Y estimate, SEE = $201.30, is the average amount of error incurred in estimating the level of net income for any given level of R&D expenditure. If the errors are normally distributed about the regression equation, as would be true when large samples are analyzed, there is a 95% probability that observations of the dependent variable will lie within the range from Yi – (1.96 * SEE) to Yi + (1.96 * SEE), or within roughly two standard errors of the estimate. The probability is 99% that observations of the dependent variable will lie within the range from Yi – (2.576 * SEE) to Yi + (2.576 * SEE), or within roughly three standard errors of the estimate.

For this regression the t statistic applies, with df = 15 – 2 = 13. The t value is 2.160 at the 95% confidence level and 3.012 at the 99% confidence level. That means actual profits Yt can be expected to lie in the range from Yi – (2.160 * 201.30) to Yi + (2.160 * 201.30), or from Yi – 434.808 to Yi + 434.808, with 95% confidence. For the 99% confidence level the range is from Yi – (3.012 * 201.30) to Yi + (3.012 * 201.30), or from Yi – 606.3156 to Yi + 606.3156.
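The gap between the large-sample z multipliers and the small-sample t multipliers is easy to see side by side for the profits model. A minimal sketch, using the SEE and the table values quoted in the answer; the dictionary layout is an illustrative choice.

```python
SEE_PROFITS = 201.30

# Multipliers quoted in the answer: z for large samples, t for df = 13.
multipliers = {
    "95%": {"z": 1.96, "t": 2.160},
    "99%": {"z": 2.576, "t": 3.012},
}

# Half-width of the band around the fitted value: multiplier * SEE.
half_widths = {
    level: {kind: value * SEE_PROFITS for kind, value in pair.items()}
    for level, pair in multipliers.items()
}

for level, widths in half_widths.items():
    print(f"{level}: z band ± {widths['z']:.4f}, t band ± {widths['t']:.4f}")
```

With 13 degrees of freedom the t bands are wider than the z bands at both confidence levels, which is why the answer relies on the t values for this sample.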


C)

The regression analysis shows a strong relationship between sales revenue and R&D expenditure, and between net profit and R&D expenditure. The share of variation explained differs very little between the two models. Still, R2 shows that the relation between sales revenue and R&D expenditure (99.8%) is slightly stronger than the relation between net profit and R&D expenditure (99.3%).