AP Statistics 8.6 Carrying Out a Chi-Square Test for Homogeneity or Independence Study Notes
LEARNING OBJECTIVE
- The chi-square distribution may be used to model variation.
Key Concepts:
- Chi-Square Test Statistic for Homogeneity or Independence
- Chi-Square Test: Determining and Interpreting the P-Value
- Justifying a Claim Based on a Chi-Square Test for Homogeneity or Independence
Chi-Square Test Statistic for Homogeneity or Independence
The chi-square statistic measures the overall difference between observed counts (\(O_{ij}\)) and expected counts (\(E_{ij}\)) in a two-way table. It quantifies how much the observed data deviate from the counts expected if the null hypothesis were true.
Formula:
\(\displaystyle \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}\)
- \(O_{ij}\) = observed count in row \(i\), column \(j\)
- \(E_{ij}\) = expected count in row \(i\), column \(j\)
- \(r\) = number of rows, \(c\) = number of columns
Notes:
- Expected counts: \(\displaystyle E_{ij} = \frac{(\text{row total}_i)(\text{column total}_j)}{\text{grand total}}\)
- Degrees of freedom: \(df = (r-1)(c-1)\)
- The chi-square statistic is never negative (it equals 0 only when every observed count matches its expected count), and larger values indicate greater deviation from what independence or homogeneity would predict.
Example
Observed counts of snack preference by gender among 100 students:
Gender | Chips | Candy | Row Total |
---|---|---|---|
Male | 30 | 20 | 50 |
Female | 10 | 40 | 50 |
Column Total | 40 | 60 | 100 |
Calculate the chi-square statistic for testing independence of snack preference and gender.
▶️ Answer / Explanation
Step 1 — Calculate expected counts:
- Male & Chips: \(E = \frac{50 \times 40}{100} = 20\)
- Male & Candy: \(E = \frac{50 \times 60}{100} = 30\)
- Female & Chips: \(E = \frac{50 \times 40}{100} = 20\)
- Female & Candy: \(E = \frac{50 \times 60}{100} = 30\)
Step 2 — Compute chi-square contributions for each cell:
- Male & Chips: \(\frac{(30-20)^2}{20} = \frac{100}{20} = 5\)
- Male & Candy: \(\frac{(20-30)^2}{30} = \frac{100}{30} \approx 3.33\)
- Female & Chips: \(\frac{(10-20)^2}{20} = \frac{100}{20} = 5\)
- Female & Candy: \(\frac{(40-30)^2}{30} = \frac{100}{30} \approx 3.33\)
Step 3 — Sum contributions:
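\(\chi^2 = 5 + \frac{100}{30} + 5 + \frac{100}{30} = \frac{50}{3} \approx 16.67\)
Conclusion: The chi-square statistic is approximately 16.67 (summing the already-rounded contributions gives 16.66, which is slightly low). This value will be compared to a chi-square distribution with \(df = (2-1)(2-1) = 1\) to determine the p-value.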
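The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library; the function name `chi_square_statistic` and the variable `snack_table` are made up for this example and are not part of any AP-required tool.

```python
def chi_square_statistic(observed):
    """Return (statistic, df, expected) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count for each cell: (row total)(column total) / (grand total)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell of the table
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (rows - 1)(columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Snack preference table: rows = Male, Female; columns = Chips, Candy
snack_table = [[30, 20], [10, 40]]
statistic, df, expected = chi_square_statistic(snack_table)
print(expected)             # [[20.0, 30.0], [20.0, 30.0]]
print(round(statistic, 2))  # 16.67
print(df)                   # 1
```

Because the code keeps the unrounded contributions, it reports 50/3 ≈ 16.67 rather than the sum of the rounded values.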
Chi-Square Test: Determining and Interpreting the P-Value
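The p-value is the probability of obtaining a chi-square statistic at least as large as the observed value, assuming the null hypothesis is true. Because only large values of \(\chi^2\) count as evidence against \(H_0\), the p-value is always the area in the right tail of the chi-square distribution.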
Steps:
- Calculate the chi-square statistic: \(\displaystyle \chi^2 = \sum \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}\)
- Determine degrees of freedom: \(df = (\text{rows}-1)(\text{columns}-1)\)
- Find the p-value: Using a chi-square distribution table or software, find the probability of observing \(\chi^2\) at least as large as the calculated value.
- Interpret the p-value:
  - Small p-value (\(< \alpha\)) → reject \(H_0\); evidence of association or difference
  - Large p-value (\(\geq \alpha\)) → fail to reject \(H_0\); insufficient evidence of association or difference
Notes:
- The p-value assumes the null hypothesis is true.
- For chi-square tests, larger values of \(\chi^2\) correspond to smaller p-values.
Example
From a two-way table of snack preference by gender among 100 students, the chi-square statistic was calculated as \(\chi^2 \approx 16.67\) with \(df = 1\). The significance level is \(\alpha = 0.05\).
Determine and interpret the p-value.
▶️ Answer / Explanation
Step 1 — Use chi-square distribution:
With \(\chi^2 \approx 16.67\) and \(df = 1\), the p-value ≈ 0.00004 (from table or software)
Step 2 — Compare to significance level:
p-value < 0.05 → reject \(H_0\)
Step 3 — Interpretation:
There is strong evidence that snack preference and gender are not independent. The data suggest that the distribution of snack preference differs by gender in the population from which the students were sampled.
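The p-value in this example can be checked without a table. The sketch below relies on a fact that holds only for \(df = 1\): a chi-square variable with one degree of freedom is the square of a standard normal variable, so its right-tail area can be written with the complementary error function. The function name `chi_square_p_value_df1` is hypothetical; for other degrees of freedom a table or statistical software (for example `scipy.stats.chi2.sf`) would be needed instead.

```python
import math


def chi_square_p_value_df1(statistic):
    """Right-tail probability P(chi-square >= statistic), valid for df = 1 only."""
    # With df = 1, chi-square = Z^2 for a standard normal Z, so
    # P(chi2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))


alpha = 0.05
p_value = chi_square_p_value_df1(50 / 3)  # unrounded statistic from the snack example

print(f"{p_value:.5f}")  # p-value to five decimal places
print(p_value < alpha)   # True, so H0 is rejected at the 0.05 level
```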
Justifying a Claim Based on a Chi-Square Test for Homogeneity or Independence
The purpose of this step is to use the results of a chi-square test to draw conclusions about the population(s) from which the data were collected.
Steps for Justifying a Claim:
1. Compare the p-value to the significance level (\(\alpha\)):
   - If p-value < \(\alpha\) → reject \(H_0\)
   - If p-value ≥ \(\alpha\) → fail to reject \(H_0\)
2. Make a conclusion in context:
   - Reject \(H_0\): There is sufficient evidence that the variables are not independent (association exists) or that distributions differ between populations.
   - Fail to reject \(H_0\): There is insufficient evidence of association or difference; we cannot conclude that the null hypothesis is true.
3. Relate to the research question:
   - For independence: draw conclusions about the population from which the data were sampled.
   - For homogeneity: draw conclusions about the populations being compared.
Notes:
- Always state the conclusion in terms of the context of the variables and the population(s).
- A chi-square test does not “prove” anything; it only provides statistical evidence supporting or failing to support a claim.
Example
A survey of 100 students records snack preference (Chips, Candy) and gender (Male, Female). A chi-square test for independence gives \(\chi^2 \approx 16.67\), p-value ≈ 0.00004, with \(\alpha = 0.05\).
Justify a claim about the population based on the test results.
▶️ Answer / Explanation
Step 1 — Compare p-value to \(\alpha\):
p-value ≈ 0.00004 < 0.05 → reject \(H_0\)
Step 2 — State conclusion in context:
There is strong statistical evidence that snack preference is associated with gender in the population of students from which the sample was drawn. The data suggest that the distribution of snack preference differs between males and females.
Step 3 — Relate to research question:
The chi-square test supports the claim that gender and snack preference are not independent among students. This conclusion is based on the sample and applies to the population from which the sample was drawn.
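The decision rule used in this example can be written as a small helper. This is a hedged sketch only: the function `justify_claim` and its `context` argument are invented for illustration, and the wording of a real conclusion should always be adapted to the variables and population in the question.

```python
def justify_claim(p_value, alpha, context):
    """Return a conclusion sentence based on comparing the p-value to alpha."""
    if p_value < alpha:
        # Small p-value: the data give evidence against H0
        return f"Reject H0: there is sufficient evidence that {context}."
    # Otherwise we fail to reject H0, which is not the same as proving H0 true
    return f"Fail to reject H0: there is insufficient evidence that {context}."


conclusion = justify_claim(
    p_value=0.00004,
    alpha=0.05,
    context="snack preference and gender are associated in the student population",
)
print(conclusion)
```

Note that the "fail to reject" branch never claims that the variables are independent; it only reports that the evidence is insufficient.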