
Calculation for Chi-Sqr, T-test

Page 1: Calculation for Chi-Sqr, T-test

8/6/2019 Calculation for Chi-Sqr, T-test

http://slidepdf.com/reader/full/calculation-for-chi-sqr-t-test 1/7

Chi-Square Test

The Chi-Square Test procedure tabulates a variable into categories and computes a chi-square statistic. This goodness-of-fit test compares the observed and expected frequencies in each category to test either that all categories contain the same proportion of values or that each category contains a user-specified proportion of values.

Examples. The chi-square test could be used to determine if a bag of jelly beans contains equal proportions of blue, brown, green, orange, red, and yellow candies. You could also test to see if a bag of jelly beans contains 5% blue, 30% brown, 10% green, 20% orange, 15% red, and 15% yellow candies.

Statistics. Mean, standard deviation, minimum, maximum, and quartiles. The number and the percentage of nonmissing and missing cases, the number of cases observed and expected for each category, residuals, and the chi-square statistic.
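As a rough illustration of the goodness-of-fit idea described above, the following sketch computes the statistic for the equal-proportions case in plain Python. The counts are hypothetical (they are not data from this document), the function name `chi_square_gof` is made up for the example, and rescaling the optional `weights` so that they sum to 1 is an assumption of this sketch, not a description of how any particular package behaves.

```python
def chi_square_gof(observed, weights=None):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test.

    observed: list of counts, one per category.
    weights:  optional expected proportions; rescaled to sum to 1
              (an assumption of this sketch). Defaults to equal proportions.
    """
    if weights is None:
        weights = [1.0] * len(observed)
    total = sum(observed)
    weight_sum = sum(weights)
    statistic = 0.0
    for obs, w in zip(observed, weights):
        expected = total * w / weight_sum      # expected count for this category
        statistic += (obs - expected) ** 2 / expected
    return statistic, len(observed) - 1


# Hypothetical bag of 180 jelly beans: blue, brown, green, orange, red, yellow.
observed = [25, 35, 28, 32, 30, 30]
statistic, df = chi_square_gof(observed)
print(f"chi-square = {statistic:.3f} with {df} degrees of freedom")
```

With 5 degrees of freedom, the 5% critical value in a chi-square table is 11.07, so a statistic well below that would not lead us to reject the hypothesis of equal proportions.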

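Calculation of the Chi-Square Value from a Cross-Tabulation:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

Where:

f_o is the observed frequency in each category

f_e is the expected frequency in each category

In a cross-tabulation, the expected frequency of a cell is (row total × column total) / grand total. For the complaints table below, summing over all 14 cells gives:

$$\chi^2 = \frac{(611 - 540.45)^2}{540.45} + \frac{(271 - 341.55)^2}{341.55} + \cdots + \frac{(738 - 694.33)^2}{694.33} = 36.72$$

Compare this value with the critical value from a chi-square distribution table (available in any statistics textbook) at the chosen level of significance with (7 − 1) × (2 − 1) = 6 degrees of freedom.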

 

                                 Complaints
Years                         Received    Solved     Total

April-06 to March-07               611       271       882
                                 69.3%     30.7%    100.0%

April-05 to March-06               550       400       950
                                 57.9%     42.1%    100.0%

April-04 to March-05               702       400     1,102
                                 63.7%     36.3%    100.0%

April 1996 to March 1997         1,775     1,153     2,928
                                 60.6%     39.4%    100.0%

April 1995 to March 1996         1,772     1,100     2,872
                                 61.7%     38.3%    100.0%

April 1994 to March 1995         1,355       880     2,235
                                 60.6%     39.4%    100.0%

April 1998 to March 1999         1,055       738     1,793
                                 58.8%     41.2%    100.0%

Total                            7,820     4,942    12,762
                                 61.3%     38.7%    100.0%

                        Value    df    Asymp. Sig. (2-sided)

Pearson Chi-Square     36.718     6    0.000
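As a cross-check, the Pearson chi-square value in the output above can be recomputed from the cell counts of the complaints table. The sketch below is plain Python; the function name `pearson_chi_square` is illustrative only and is not part of SPSS or any library.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    table: list of rows, each row a list of observed cell counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# (received, solved) counts for the seven periods in the table above.
complaints = [
    [611, 271],
    [550, 400],
    [702, 400],
    [1775, 1153],
    [1772, 1100],
    [1355, 880],
    [1055, 738],
]
statistic, df = pearson_chi_square(complaints)
print(f"Pearson chi-square = {statistic:.3f}, df = {df}")
```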

Paired-Samples T Test

The Paired-Samples T Test procedure compares the means of two variables for a single group. It computes the differences between values of the two variables for each case and tests whether the average differs from 0.

Example. In a study on high blood pressure, all patients are measured at the beginning of the study, given a treatment, and measured again. Thus, each subject has two measures, often called before and after measures. An alternative design for which this test is used is a matched-pairs or case-control study. Here, each record in the data file contains the response for the patient and also for his or her matched control subject. In a blood pressure study, patients and controls might be matched by age (a 75-year-old patient with a 75-year-old control group member).

Statistics. For each variable: mean, sample size, standard deviation, and standard error of the mean. For each pair of variables: correlation, average difference in means, t test, and confidence interval for mean difference (you can specify the confidence level). Standard deviation and standard error of the mean difference.

Procedures

To test a sample mean from one group of cases against that from another group of cases, use the Independent-Samples T Test. If you want to compare a sample mean against a constant value, use the One-Sample T Test. If the data in the test variable are not quantitative, but ordered, or are not normally distributed, use the Wilcoxon signed-rank test (select 2 Related Samples from the Nonparametric Tests submenu).

Data. For each paired test, specify two quantitative variables (interval- or ratio-level of measurement). For a matched-pairs or case-control study, the response for each test subject and its matched control subject must be in the same case in the data file.

Assumptions. Observations for each pair should be made under the same conditions. The mean differences should be normally distributed. Variances of each variable can be equal or unequal.

Independent-Samples Group T Test

The Independent-Samples T Test procedure compares means for two groups of cases. Ideally, for this test, the subjects should be randomly assigned to two groups, so that any difference in response is due to the treatment (or lack of treatment) and not to other factors. This is not the case if you compare average income for males and females. A person is not randomly assigned to be a male or female. In such situations, you should ensure that differences in other factors are not masking or enhancing a significant difference in means. Differences in average income may be influenced by factors such as education and not by sex alone.

Example. Patients with high blood pressure are randomly assigned to a placebo group and a treatment group. The placebo subjects receive an inactive pill and the treatment subjects receive a new drug that is expected to lower blood pressure. After treating the subjects for two months, the two-sample t test is used to compare the average blood pressures for the placebo group and the treatment group. Each patient is measured once and belongs to one group.

Statistics. For each variable: sample size, mean, standard deviation, and standard error of the mean. For the difference in means: mean, standard error, and confidence interval (you can specify the confidence level). Tests: Levene's test for equality of variances, and both pooled- and separate-variances t tests for equality of means.
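The pooled- and separate-variances t tests mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not SPSS's implementation: the function name `two_sample_t`, the dictionary keys, and the blood pressure readings are all hypothetical, and Levene's test is not included.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(group_a, group_b):
    """Return pooled-variance and separate-variance (Welch) t statistics with their df."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = mean(group_a), mean(group_b)
    v1, v2 = variance(group_a), variance(group_b)   # sample variances (n - 1 divisor)

    # Pooled-variance test: assumes both groups share one common variance.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Separate-variances test: no equal-variance assumption,
    # degrees of freedom from the Welch-Satterthwaite approximation.
    se_sq = v1 / n1 + v2 / n2
    t_welch = (m1 - m2) / sqrt(se_sq)
    df_welch = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

    return {"t_pooled": t_pooled, "df_pooled": df_pooled,
            "t_welch": t_welch, "df_welch": df_welch}


# Hypothetical systolic blood pressure readings after two months.
placebo = [150, 148, 155, 160, 152, 158]
treatment = [140, 138, 147, 150, 135, 142]
result = two_sample_t(placebo, treatment)
print(result)
```

When the two groups have the same size, as here, the two t statistics coincide and only the degrees of freedom differ.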

Correlation Analysis

• The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.

−1 ≤ r ≤ +1

• r = 0 indicates no linear relationship

• In Excel, use =CORREL(array1,array2), where array1 is the range for X and array2 is the range for Y.

Paired Sample T-Test

The paired sample t-test is a statistical technique used to compare two population means when the two samples are correlated. It is used in ‘before-after’ studies, when the samples are matched pairs, or in case-control studies. For example, if we give training to a company's employees and want to know whether the training had any impact on their efficiency, we could use the paired sample t-test. We collect data from each employee on a seven-point rating scale, before the training and after the training. By using the paired sample t-test, we can statistically conclude whether or not the training has improved the efficiency of the employees. In medicine, the paired sample t-test can be used to assess whether or not a particular medicine has an effect on an illness.

 Assumptions in Paired sample t-test:

1. The first assumption in the paired sample t-test is that only matched pairs can be used to perform the test.

2. In the paired sample t-test, normal distributions are assumed.

3. Variance in paired sample t-test: The test is carried out on the differences within each pair, so the variances of the two samples may be equal or unequal.

4. Independence of observations in paired sample t-test: The pairs of observations must be independent of one another; the two measurements within a pair are related by design.


Steps in the calculation of paired sample t-test:

1. Set up hypotheses: To calculate the paired sample t-test, we first set up two hypotheses. The first is the null hypothesis, which assumes that the means of the two paired samples are equal. The second is the alternative hypothesis, which assumes that the means of the two paired samples are not equal.

2. Select the level of significance: After stating the hypotheses, we choose the level of significance. In most cases the significance level for the paired sample t-test is 5%, but in medicine a stricter level such as 1% is often used.

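3. Calculate the test statistic: To calculate the test statistic, we use the following formula for the paired sample t-test:

$$t = \frac{\bar{d}}{\sqrt{s^2 / n}}$$

Where d̄ (d bar) is the mean of the differences between the paired observations, s² is the sample variance of those differences, n is the number of pairs, and t follows a t distribution with n − 1 degrees of freedom.

An alternate formula for the paired sample t-test is:

$$t = \frac{\sum d}{\sqrt{\dfrac{n \sum d^2 - \left(\sum d\right)^2}{n - 1}}}$$

where d is the difference within each pair.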

4. Testing of hypothesis or decision making: After calculating the test statistic, we compare the calculated value with the table value. If the absolute value of the calculated value is greater than the table value, we reject the null hypothesis. If it is less than the table value, we fail to reject the null hypothesis and say that there is no significant mean difference between the two paired samples.
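The four steps above can be sketched in plain Python as follows. The ratings are hypothetical (eight employees scored before and after training), the function name `paired_t` is invented for this example, and the sum-based expression is simply an algebraically equivalent way of writing the same statistic. The table value 2.365 is the two-tailed 5% critical value of the t distribution with 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired sample t-test."""
    diffs = [a - b for a, b in zip(after, before)]   # difference within each pair
    n = len(diffs)
    d_bar = mean(diffs)                              # mean difference
    s_sq = variance(diffs)                           # sample variance of the differences
    return d_bar / sqrt(s_sq / n), n - 1


# Step 1: H0 says the mean rating is the same before and after training.
# Hypothetical ratings on a seven-point scale.
before = [3, 4, 2, 5, 4, 3, 4, 3]
after = [5, 5, 4, 6, 4, 5, 6, 4]

# Step 2: 5% significance level, two-tailed, df = 7 -> table value 2.365.
table_value = 2.365

# Step 3: compute the test statistic.
t_stat, df = paired_t(before, after)

# Equivalent sum-based form of the same statistic.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
sum_d = sum(diffs)
sum_d_sq = sum(d * d for d in diffs)
t_alt = sum_d / sqrt((n * sum_d_sq - sum_d ** 2) / (n - 1))

# Step 4: decision.
reject_null = abs(t_stat) > table_value
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```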

Paired sample t-test in SPSS:

Most statistical software can perform the paired sample t-test. In SPSS, the paired sample t-test is available under the “Analyze” menu, in the “Compare Means” submenu. Clicking “Paired-Samples T Test” opens the dialog window for the test.


Now, from the left side, we select the first paired variable and drag it into the paired variables box as variable 1, and then select the second paired variable and drag it into the second variable place. From the “Options” button, we can set the confidence interval, and then we click the “OK” button. After clicking OK, the result window shows the output for the paired sample t-test. The first two tables show the descriptive statistics and the correlation between the paired variables. The next table shows the value of the paired sample t-test together with its probability value.

This table shows the paired sample t-test value along with the p value and other statistics. By using the p value, we can make a decision about the pair of sample means. For instance, in this example the p value for the pair BI and BI2 is .896, which is greater than the 5% significance level. The difference between the paired sample means is therefore not statistically significant; that is, we cannot reject the hypothesis that the means of the two paired samples are equal.

The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient, or "Pearson's correlation." It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton.[4]

The population correlation coefficient ρ_{X,Y} between two random variables X and Y with expected values μ_X and μ_Y and standard deviations σ_X and σ_Y is defined as:

$$\rho_{X,Y} = \operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$

where E is the expected value operator, cov means covariance, and corr is a widely used alternative notation for Pearson's correlation.

The Pearson correlation is defined only if both of the standard deviations are finite and both of them are nonzero. It is a corollary of the Cauchy–Schwarz inequality that the correlation cannot exceed 1 in absolute value. The correlation coefficient is symmetric: corr(X, Y) = corr(Y, X).

The Pearson correlation is +1 in the case of a perfect positive (increasing) linear relationship (correlation), −1 in the case of a perfect decreasing (negative) linear relationship (anticorrelation) [5], and some value between −1 and 1 in all other cases, indicating the degree of linear dependence between the variables. As it approaches zero there is less of a relationship (closer to uncorrelated). The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.

If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true because the correlation coefficient detects only linear dependencies between two variables. For example, suppose the random variable X is symmetrically distributed about zero, and Y = X². Then Y is completely determined by X, so that X and Y are perfectly dependent, but their correlation is zero; they are uncorrelated. However, in the special case when X and Y are jointly normal, uncorrelatedness is equivalent to independence.
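The point that zero correlation does not imply independence can be checked numerically. The sketch below is plain Python with a hand-written `pearson_r` helper (an illustrative name, not a library function); it divides the sum of cross-products of deviations by the square root of the product of the sums of squared deviations, which is the covariance divided by the product of the standard deviations. The data are a small hypothetical sample that is symmetric about zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        # The coefficient is undefined when either standard deviation is zero.
        raise ValueError("correlation is undefined for a constant variable")
    return cross / sqrt(ss_x * ss_y)


xs = [-3, -2, -1, 0, 1, 2, 3]          # symmetric about zero
ys_quadratic = [x ** 2 for x in xs]    # fully determined by x, but not linearly
ys_linear = [2 * x + 1 for x in xs]    # perfect increasing linear relationship

r_quadratic = pearson_r(xs, ys_quadratic)
r_linear = pearson_r(xs, ys_linear)
print(f"r for y = x^2:    {r_quadratic:.3f}")
print(f"r for y = 2x + 1: {r_linear:.3f}")
```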

If we have a series of n measurements of X and Y written as x_i and y_i where i = 1, 2, ..., n, then the sample correlation coefficient can be used to estimate the population Pearson correlation ρ between X and Y. The sample correlation coefficient is written

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_x s_y}$$

where x̄ and ȳ are the sample means of X and Y, and s_x and s_y are the sample standard deviations of X and Y.

This can also be written as: