MAP812_The Chi-Square Statistic_Samantha Ng

  • 7/30/2019 MAP812_The Chi-Square Statistic_Samantha Ng

    The Chi-Square Statistic

    χ²

    Done by: Ng Shi Ying Samantha

    Characteristics of the chi-square test

    The chi-square statistic is a non-parametric test used to evaluate hypotheses about the proportions or relationships that exist within populations

    Characteristics of variables
    - Categorical (nominal) data (or at most, ordinal data)
    - Observations are independent

    Two types of chi-square tests
    - Chi-square test for Goodness-of-fit
    - Chi-square test for Independence

    Chi-Square Test for Goodness-of-Fit

    Used to determine how well the obtained sample proportions fit the population proportions specified by the null hypothesis
    - Observed frequencies (fo) vs. expected frequencies (fe)

    Assumptions
    - One categorical variable, with two or more categories
    - A hypothesized proportion (equal or unequal)
    - No more than 20% of expected frequencies have counts less than 5
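    The expected-frequency assumption can be checked mechanically before running the test. The following is a minimal sketch only: the function name, its parameters and the example proportions are illustrative assumptions, not part of the original slides.

```python
def expected_counts_ok(expected, min_count=5, max_share=0.20):
    """Return True if no more than `max_share` of the expected
    frequencies fall below `min_count` (the 20% / 5 rule of thumb)."""
    small = sum(1 for fe in expected if fe < min_count)
    return small / len(expected) <= max_share


# Hypothetical example: n = 40 over four categories with unequal
# hypothesized proportions.
n = 40
proportions = [0.4, 0.3, 0.2, 0.1]
expected = [p * n for p in proportions]  # roughly 16, 12, 8 and 4
print(expected_counts_ok(expected))
```

    In this hypothetical case one of the four expected counts (25%) falls below 5, so the check fails and the categories would need to be combined or the sample enlarged.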

    Chi-Square Test for Independence

    Used to determine if there is a relationship between two variables in a population

    Assumptions
    - Two variables that are ordinal or nominal
    - There are two or more categories in each variable

    Hypothesis Testing

    Steps for Hypothesis Testing

    1. State hypotheses and select alpha level

    2. Calculate degrees of freedom and locate critical region

    3. Calculate chi-square statistic

    4. State decision and conclusion

    Report results for chi-square

    Stating Hypotheses

    Test for Goodness-of-Fit

    i. No preference among categories

    Two-tailed test
    H0: p1 = p2 (population proportions are equal among the categories)
    H1: p1 ≠ p2 (population proportions are not equal among the categories)

    One-tailed test
    H0: p1 = p2 (population proportions are equal among the categories)
    H1: p1 > p2 (population proportion is greater in one of the categories)

    ii. No difference from known population

    Two-tailed test
    H0: p1 = p2 (sample and known population proportions are equal)
    H1: p1 ≠ p2 (sample and known population proportions are not equal)

    One-tailed test
    H0: p1 = p2 (sample and known population proportions are equal)
    H1: p1 > p2 (sample proportion is greater than known population proportion)

    Stating Hypotheses

    Test for Independence

    i. Variables are independent

    Two-tailed test
    H0: p1 = p2 (population proportions are equal among the categories)
    H1: p1 ≠ p2 (population proportions are not equal among the categories)

    One-tailed test
    H0: p1 = p2 (population proportions are equal among the categories)
    H1: p1 > p2 (population proportion is greater in one of the categories)

    df and Critical Region

    Test for Goodness-of-fit

    df = C − 1, where C is the number of categories

    Test for Independence

    df = (C − 1) × (R − 1), where C is the number of columns and R is the number of rows
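    The two degrees-of-freedom rules translate directly into code. This is a small sketch; the function names are illustrative choices, not taken from the slides.

```python
def df_goodness_of_fit(n_categories):
    """df = C - 1, where C is the number of categories."""
    return n_categories - 1


def df_independence(n_rows, n_columns):
    """df = (C - 1) x (R - 1) for a table with R rows and C columns."""
    return (n_columns - 1) * (n_rows - 1)


print(df_goodness_of_fit(4))   # four categories -> 3
print(df_independence(2, 2))   # 2 x 2 table -> 1
print(df_independence(3, 4))   # 3 rows, 4 columns -> 6
```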

    df and Critical Region

    Critical region: Refer to table of critical values for chi-square

    e.g. If df = 4, α = .05:
    χ²critical = 9.49
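    When no printed table is at hand, the critical value can be approximated numerically. The sketch below uses only the Python standard library and works for whole-number df; the function names `chi2_sf` and `chi2_critical` are assumptions made for this illustration, and a statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable
    with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: complementary error function plus a finite sum.
    k = (df - 1) // 2
    terms = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, k + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


def chi2_critical(alpha, df, tol=1e-8):
    """Find the value whose upper-tail probability equals alpha,
    using bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # widen until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_critical(0.05, 4), 2))  # 9.49, matching the table value
```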

    Calculating the chi-square statistic

    χ²statistic = Σ (fo − fe)² / fe

    Test for Goodness-of-fit:
    fo (observed frequency in a category) = po × no
    fe (expected frequency in a category) = pe × ne

    Test for Independence:
    fc (observed frequency in a column) = pc × nc
    fr (observed frequency in a row) = pr × nr
    fo (observed frequency in a category) = po × no
    fe (expected frequency in a category) = (fc × fr)/n
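    The statistic sums (fo − fe)² / fe over every category (goodness-of-fit) or every cell (independence); only the way the expected frequencies are obtained differs. The sketch below illustrates both cases. All function names and all counts are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories or cells."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def expected_goodness_of_fit(proportions, n):
    """fe = pe x n for each category."""
    return [p * n for p in proportions]


def expected_independence(table):
    """fe = (column total x row total) / n for each cell of a
    table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[fc * fr / n for fc in col_totals] for fr in row_totals]


# Goodness-of-fit with hypothetical counts and equal proportions.
observed = [30, 20, 10]
expected = expected_goodness_of_fit([1 / 3, 1 / 3, 1 / 3], n=60)
gof_statistic = chi_square_statistic(observed, expected)
print(gof_statistic)

# Independence with a hypothetical 2 x 2 table.
table = [[10, 20], [30, 40]]
fe = expected_independence(table)
flat_observed = [fo for row in table for fo in row]
flat_expected = [e for row in fe for e in row]
ind_statistic = chi_square_statistic(flat_observed, flat_expected)
print(fe)
print(ind_statistic)
```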

    State decision and conclusion

    If χ²statistic > χ²critical, reject H0 and conclude that there are significant differences in the proportions

    Reporting results for chi-square

    E.g. The participants showed significant preferences among the four orientations for hanging the painting, χ²(3, n = 50) = 8.08, p < .05 (Gravetter & Wallnau, 2011)
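    The decision rule and the reporting format can be sketched in a few lines. The slides give only the summary result, so the four observed counts below are assumed values chosen to reproduce a statistic of 8.08 with n = 50; the critical value 7.81 is the standard table entry for df = 3, α = .05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Assumed counts (not given in the slides), one per orientation.
observed = [18, 17, 7, 8]
n = sum(observed)
expected = [n / len(observed)] * len(observed)  # no preference: 12.5 each
df = len(observed) - 1
critical = 7.81  # table value for df = 3, alpha = .05

statistic = chi_square_statistic(observed, expected)
decision = "reject H0" if statistic > critical else "fail to reject H0"
p_text = "p < .05" if statistic > critical else "p > .05"
report = f"chi-square({df}, n = {n}) = {statistic:.2f}, {p_text}"

print(decision)
print(report)
```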

    Effect Size for Test of Independence

    2 × 2 matrix:
    φ = √(χ²/n)

    Larger than 2 × 2 matrix:
    V = √[χ²/(n × df*)], where df* is the smaller of (R − 1) and (C − 1)
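    Both effect sizes follow directly from the chi-square statistic and the sample size. In this sketch the function names and the input values are hypothetical; `df_star` is taken as the smaller of (rows − 1) and (columns − 1), the usual definition for Cramér's V.

```python
import math


def phi_coefficient(chi_square, n):
    """Effect size for a 2 x 2 table: phi = sqrt(chi-square / n)."""
    return math.sqrt(chi_square / n)


def cramers_v(chi_square, n, n_rows, n_columns):
    """Effect size for larger tables:
    V = sqrt(chi-square / (n x df*)), df* = min(R - 1, C - 1)."""
    df_star = min(n_rows - 1, n_columns - 1)
    return math.sqrt(chi_square / (n * df_star))


# Hypothetical values, for illustration only.
print(round(phi_coefficient(8.0, 200), 2))
print(round(cramers_v(12.0, 150, 3, 4), 2))
```

    For a 2 × 2 table df* equals 1, so V reduces to φ.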

    Examples

    Example 1

    Samantha wants to know if girls prefer flowers or chocolates as a Valentine's Day gift. She hypothesizes that girls will prefer flowers as chocolates are fattening. She surveyed 500 girls to find out if her hypothesis is true, and obtained the following results:

    Example 1 (cont'd)

    How many variables are being tested? One
    Is it a nominal variable? Yes
    Are you testing how well the obtained sample proportions fit the population proportions specified by the null hypothesis? Yes
    Do more than 20% of expected frequencies have counts less than 5? No

    Use GOODNESS-OF-FIT TEST

    Examples

    Example 2

    Samantha wants to know if there are gender differences in preferred colour (Pink vs. Blue). She hypothesizes that Males will prefer Blue while Females will prefer Pink. She surveyed 20 males and 20 females and obtained the following results:

    Example 2 (cont'd)

    How many variables are being tested? Two
    Are they nominal variables? Yes
    Are you testing if there is a relationship between two variables in a population? Yes
    Are there two or more categories in each variable? Yes

    Use INDEPENDENCE TEST

    Using SPSS

    Test for Goodness-of-Fit

    Example 1

    Samantha wants to know if girls prefer flowers or chocolates as a Valentine's Day gift. She hypothesizes that girls will prefer flowers as chocolates are fattening. She surveyed 500 girls to find out if her hypothesis is true, and obtained the following results:

    Steps for using SPSS:

    1. Create variables and code for gift type.

    2. Enter the data.

    3. Weight the cases

    (a) Click Data > Weight Cases...
    (b) Select the "Weight cases by" box and transfer the "frequency" variable into the "Frequency Variable:" box. Click OK.

    4. Start analysis

    (a) Click Analyze > Nonparametric Tests > Legacy Dialogs > Chi-square
    (b) Transfer the "gift" variable into the "Test Variable List:" box. Keep the "All categories equal" option selected in the "Expected Values" area, as equal proportions are assumed for each category. Click OK.

    5. SPSS Output for Chi-Square Goodness-of-Fit Test

    (a) The Gift table provides the observed frequencies (Observed N) for each gift as well as the expected frequencies (Expected N), which are the frequencies expected if the null hypothesis is true. The difference between the observed and expected frequencies is provided in the Residual column.

    (b) The Test Statistics table provides the results of the Chi-Square Goodness-of-Fit test. We can see from this table that the test statistic is statistically significant: χ²(1) = 46.2, p < .0001. We can, therefore, reject the null hypothesis and conclude that there are statistically significant differences in the preference of the type of Valentine's Day gift, with fewer girls preferring chocolates (N = 174) compared to flowers (N = 326).

    Gift

                 Observed N   Expected N   Residual
    Flowers      326          250.0         76.0
    Chocolate    174          250.0        -76.0
    Total        500

    Test Statistics

                   Gift
    Chi-Square     46.208a
    df             1
    Asymp. Sig.    .000

    a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 250.0.
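    The figures in the SPSS output can be reproduced by hand from the two observed counts. The sketch below is an illustrative cross-check, not part of the SPSS procedure; the variable names are assumptions, and the p-value shortcut via the complementary error function applies only when df = 1.

```python
import math

# Observed N from the Gift table.
observed = {"Flowers": 326, "Chocolate": 174}
n = sum(observed.values())
expected = n / len(observed)  # "All categories equal": 250.0 each

residuals = {gift: fo - expected for gift, fo in observed.items()}
statistic = sum((fo - expected) ** 2 / expected for fo in observed.values())
df = len(observed) - 1

# For df = 1 the upper-tail probability has a closed form.
p_value = math.erfc(math.sqrt(statistic / 2))

print(residuals)
print(round(statistic, 3), df)
print(p_value < 0.001)
```

    The residuals of 76.0 and -76.0 and the statistic of 46.208 agree with the tables, and the p-value is far below the .001 that SPSS displays as ".000".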

    Test for Independence

    Example 2

    Samantha wants to know if there are gender differences in preferred colour (Pink vs. Blue). She hypothesizes that Males will prefer Blue while Females will prefer Pink. She surveyed 20 males and 20 females and obtained the following results:

    Steps for using SPSS:

    1. Create variables and code for Gender and Colour.

    2. Enter the data.

    3. Start analysis

    (a) Click Analyze > Descriptive Statistics > Crosstabs...
    (b) Transfer one of the variables into the "Row(s):" box and the other variable into the "Column(s):" box. Click on "Display clustered bar charts".

    (c) Click on the Statistics... button. Select the "Chi-square" and "Phi and Cramer's V" options. Click Continue.
    (d) Click the Cells... button. Select "Observed" from the "Counts" area and "Row", "Column" and "Total" from the "Percentages" area. Click Continue.
    (e) Click OK to generate the output.

    4. SPSS Output for Chi-Square Independence Test:

    (a) This table shows us that more males prefer blue while more females prefer pink:

    Gender * Colour Crosstabulation

                                   Colour
                                   1         2         Total
    Gender 1   Count               2         18        20
               % within Gender     10.0%     90.0%     100.0%
               % within Colour     12.5%     75.0%     50.0%
               % of Total          5.0%      45.0%     50.0%
    Gender 2   Count               14        6         20
               % within Gender     70.0%     30.0%     100.0%
               % within Colour     87.5%     25.0%     50.0%
               % of Total          35.0%     15.0%     50.0%
    Total      Count               16        24        40
               % within Gender     40.0%     60.0%     100.0%
               % within Colour     100.0%    100.0%    100.0%
               % of Total          40.0%     60.0%     100.0%
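    The four cell counts in the crosstabulation are enough to recompute the expected frequencies, the chi-square statistic and the phi effect size by hand. The sketch below is an illustrative cross-check of the SPSS output; the variable names are assumptions, and the rows and columns keep the numeric codes used in the table.

```python
import math

# Observed counts from the crosstabulation (rows: Gender 1, Gender 2;
# columns: Colour 1, Colour 2).
table = [[2, 18],
         [14, 6]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# fe = (row total x column total) / n for each cell.
expected = [[fr * fc / n for fc in col_totals] for fr in row_totals]
statistic = sum((fo - fe) ** 2 / fe
                for obs_row, exp_row in zip(table, expected)
                for fo, fe in zip(obs_row, exp_row))
df = (len(table) - 1) * (len(table[0]) - 1)
phi = math.sqrt(statistic / n)

print(expected)
print(statistic, df)
print(round(phi, 3))
```

    Every expected frequency is at least 8, so the expected-count assumption holds, and the statistic works out to 15.0 with df = 1.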

    (b) This table provides the results of the Chi-Square Independence test. We can see from this table that the test statistic is statistically significant: χ²(1) = 15.0, p