AP Statistics 8.5 Setting Up a Chi-Square Test for Homogeneity or Independence Study Notes
LEARNING OBJECTIVE
- The chi-square distribution may be used to model variation.
Key Concepts:
- Null and Alternative Hypotheses for Chi-Square Tests (Homogeneity or Independence)
- Testing Method for Comparing Distributions in Two-Way Tables
- Verifying Conditions for Chi-Square Tests (Independence or Homogeneity)
Null and Alternative Hypotheses for Chi-Square Tests (Homogeneity or Independence)
Chi-square tests for two-way tables are used to assess either:
- Homogeneity: Whether the distribution of a categorical variable is the same across several populations.
- Independence: Whether two categorical variables are associated in a single population.
Hypotheses:
Null Hypothesis (\(H_0\)): The categorical variables are independent, or the distributions are the same across groups.
Example: \(H_0: \text{Snack preference is independent of gender}\)
Alternative Hypothesis (\(H_a\)): The categorical variables are not independent, or at least one distribution differs.
Example: \(H_a: \text{Snack preference depends on gender}\)
Notes:
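- The alternative hypothesis in a chi-square test is non-directional: the test detects any deviation from independence or from equality of distributions. The p-value is still taken from the right tail of the chi-square distribution, because large values of \(\chi^2\) are evidence against \(H_0\).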
- Hypotheses must refer to population proportions, not sample counts.
Example
A school surveys 100 students about snack preference (Chips, Candy) and gender (Male, Female). The observed counts are collected in a two-way table.
Identify the null and alternative hypotheses for a chi-square test for independence.
Answer / Explanation
Step 1 — Null Hypothesis (\(H_0\)):
Snack preference is independent of gender. The distribution of snack preference is the same for males and females.
Step 2 — Alternative Hypothesis (\(H_a\)):
Snack preference depends on gender. At least one proportion differs between males and females.
Step 3 — Notes:
- These hypotheses refer to the population as a whole, not the sample.
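- Because a single sample of students is classified on two variables (snack preference and gender), this is a chi-square test for independence. If separate random samples had been taken from each group, the same table would call for a test for homogeneity.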
Testing Method for Comparing Distributions in Two-Way Tables
Purpose: To determine whether categorical variables are associated or whether distributions are the same across populations.
Chi-Square Test for Homogeneity: Used when comparing the distributions of a categorical variable across two or more populations.
Example: Do snack preferences (Chips, Candy) differ between students from two schools?
Chi-Square Test for Independence: Used to determine whether two categorical variables are associated in a single population.
Example: Is snack preference (Chips, Candy) independent of gender (Male, Female)?
Requirements for both tests:
- Random sample(s)
- Expected counts ≥ 5 in each cell
- Independent observations (10% condition if sampling without replacement)
Notes:
- Both tests use the chi-square statistic: \(\displaystyle \chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}\)
- Degrees of freedom: \(df = (\text{number of rows} - 1) \times (\text{number of columns} - 1)\)
- The choice between homogeneity and independence depends on the study design.
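The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names are made up for this sketch, and the counts are the observed and expected values from the snack-preference table in the conditions example later in these notes.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of a two-way table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1); the table must not include total rows/columns."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Rows: Male, Female. Columns: Chips, Candy.
observed = [[30, 20], [10, 40]]
expected = [[20, 30], [20, 30]]

chi2 = chi_square_statistic(observed, expected)
df = degrees_of_freedom(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")  # chi-square = 16.67, df = 1
```

Each of the four cells has \(|O - E| = 10\), so the statistic is \(100/20 + 100/30 + 100/20 + 100/30 \approx 16.67\) with \(df = (2-1)(2-1) = 1\).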
Example
A school surveys students from two different classes about their favorite snack (Chips, Candy). The counts are recorded in a two-way table.
What is the appropriate statistical test to compare distributions of snack preference between the two classes?
Answer / Explanation
Step 1 — Identify the goal:
We want to compare distributions of snack preference across two populations (two classes).
Step 2 — Choose the test:
The appropriate test is the Chi-Square Test for Homogeneity.
Step 3 — Conditions:
- Random samples from each class
- Expected counts for each cell ≥ 5
- Observations are independent
Provided these conditions are verified from the study design and the two-way table, the chi-square test for homogeneity can be applied to compare the distributions.
Verifying Conditions for Chi-Square Tests (Independence or Homogeneity)
Before performing a chi-square test, we must ensure the conditions for valid statistical inference are met.
Conditions:
Randomness: Data should come from a random sample or a randomized experiment.
- For homogeneity: each population must be sampled randomly.
- For independence: the sample from the population should be random.
Independence: Observations must be independent.
- Check the 10% condition if sampling without replacement: \(n \le 0.1N\).
Expected Counts: Each cell in the two-way table must have an expected count ≥ 5.
Formula: \(\displaystyle E_{ij} = \frac{(\text{row total}_i)(\text{column total}_j)}{\text{grand total}}\)
Notes:
- If all conditions are satisfied, the chi-square distribution can be used to calculate probabilities for the test statistic.
- Failure to meet these conditions may invalidate the results of the chi-square test.
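As a rough sketch of how the expected-count and 10% checks might be automated, the Python below computes \(E_{ij}\) from the row and column totals and tests each condition. The function names, the small table, and the population size of 400 are hypothetical values chosen for illustration; randomness cannot be checked from the counts and must be judged from the study design.

```python
def expected_counts(observed):
    """E_ij = (row total i) * (column total j) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def large_counts_ok(observed, minimum=5):
    """True only if every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)


def ten_percent_ok(sample_size, population_size):
    """10% condition for sampling without replacement: n <= 0.1 * N."""
    return sample_size <= 0.1 * population_size


# Hypothetical 2x2 table of observed counts (25 responses in total)
hypothetical_table = [[12, 8], [2, 3]]

print(expected_counts(hypothetical_table))
print(large_counts_ok(hypothetical_table))  # False
print(ten_percent_ok(25, 400))              # True
```

Here the second row has only 5 observations, so its expected counts (about 2.8 and 2.2) fall below 5 and the chi-square test would not be appropriate for this table, even though the 10% condition holds.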
Example
A survey records 100 students’ snack preferences (Chips, Candy) and gender (Male, Female) in a two-way table. Before testing independence, check the conditions:
| | Chips | Candy | Row Total |
|---|---|---|---|
| Male | 30 | 20 | 50 |
| Female | 10 | 40 | 50 |
| Column Total | 40 | 60 | 100 |
Answer / Explanation
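Step 1 — Randomness: The problem does not say how the students were chosen, so we assume the sample of students is random; in practice this must be confirmed from the study design.
Step 2 — Independence: Each student’s response is assumed not to influence any other student’s response. Because students are sampled without replacement, the 10% condition also requires that the students surveyed make up no more than 10% of the population.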
Step 3 — Expected counts:
- Male & Chips: \(E = \frac{50 \times 40}{100} = 20\)
- Male & Candy: \(E = \frac{50 \times 60}{100} = 30\)
- Female & Chips: \(E = \frac{50 \times 40}{100} = 20\)
- Female & Candy: \(E = \frac{50 \times 60}{100} = 30\)
All expected counts ≥ 5, so this condition is satisfied.
Conclusion: Given a random sample and a population large enough for the 10% condition, all conditions for performing a chi-square test for independence are satisfied.