AP Statistics 8.2 Setting Up a Chi-Square Goodness-of-Fit Test Study Notes

LEARNING OBJECTIVE

  • The chi-square distribution may be used to model variation.

Key Concepts:

  • Chi-Square Distributions
  • Null and Alternative Hypotheses in a Chi-Square Goodness-of-Fit Test
  • Testing Method for a Distribution of Proportions (Categorical Data)
  • Conditions for Chi-Square Goodness-of-Fit Test

Chi-Square Distributions

 A chi-square (\(\chi^2\)) distribution is a family of probability distributions used to model the distribution of the chi-square test statistic, which measures how far observed counts deviate from expected counts relative to expected counts.

Expected counts: Counts that would occur if the null hypothesis were true.

Formula: \(\displaystyle E_i = N \times p_i\), where \(N\) is the sample size and \(p_i\) is the hypothesized proportion.

Chi-square statistic: Measures the total relative deviation between observed and expected counts.

Formula: \(\displaystyle \chi^2 = \sum_{i=1}^k \dfrac{(O_i - E_i)^2}{E_i}\)

Distribution shape:

  • Chi-square values are never negative (\(\chi^2 \ge 0\)); a value of 0 occurs only when every observed count equals its expected count.
  • Chi-square distributions are skewed right, especially for small degrees of freedom (df).
  • Skewness decreases as degrees of freedom increase; the distribution becomes more symmetric.
  • Degrees of freedom for goodness-of-fit: \(df = k - 1\), where \(k\) is the number of categories.

Notes:

  • Chi-square distributions are used for tests involving categorical data, such as goodness-of-fit, homogeneity, and independence.
  • As the degrees of freedom increase, the mean of the chi-square distribution increases (\(\text{mean} = df\)), and the distribution becomes more bell-shaped.
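
The shape claims above can be checked by simulation. The sketch below is illustrative only: it assumes four equally likely categories and samples of size 80, and the function names, number of repetitions, and seed are arbitrary choices rather than part of the AP course material. It draws many samples under \(H_0\), computes the chi-square statistic for each, and compares the mean of the statistics with \(df = 3\).

```python
import random

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_statistics(proportions, n, reps, seed=0):
    """Draw `reps` samples of size n under H0; return their chi-square statistics."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    categories = range(len(proportions))
    expected = [n * p for p in proportions]
    stats = []
    for _ in range(reps):
        draws = rng.choices(categories, weights=proportions, k=n)
        observed = [draws.count(c) for c in categories]
        stats.append(chi_square_statistic(observed, expected))
    return stats

# Assumed setup: k = 4 equally likely categories, so df = 3.
stats = simulate_statistics([0.25] * 4, n=80, reps=5000)
mean_stat = sum(stats) / len(stats)
median_stat = sorted(stats)[len(stats) // 2]
print(f"mean = {mean_stat:.2f}, median = {median_stat:.2f}")
```

The simulated mean should land close to 3, and the median should fall below the mean, which is what a right-skewed distribution looks like.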

Example 

A survey collects preferences for 3 different types of ice cream among 60 students. If the null hypothesis states equal preference, the expected counts are 20 per type. Observed counts are 18, 22, and 20.

Describe the chi-square distribution that would be used for testing goodness-of-fit.

▶️ Answer / Explanation

Step 1 — Calculate expected counts:

\(E_i = 60 \times \frac{1}{3} = 20\) for each flavor.

Step 2 — Identify degrees of freedom:

Number of categories \(k = 3\), so \(df = k - 1 = 2\).

Step 3 — Describe distribution:

  • Chi-square values are never negative (\(\chi^2 \ge 0\)).
  • The distribution is skewed right because df = 2 is small.
  • The mean of this chi-square distribution is \(df = 2\).
  • The distribution is used to find the probability of obtaining a chi-square statistic at least as large as the one computed from the data, assuming \(H_0\) is true.
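
As a rough illustration of that last point, the sketch below computes the statistic for the ice cream counts and the right-tail probability beyond it. It relies on a shortcut that holds only for \(df = 2\): a chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2, so \(P(\chi^2 \ge x) = e^{-x/2}\). The function names are hypothetical; for other degrees of freedom a table or calculator is needed.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def right_tail_df2(x):
    """P(chi-square >= x) for df = 2 ONLY (exponential with mean 2)."""
    if x < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-x / 2)

observed = [18, 22, 20]          # ice cream preferences from the example
expected = [60 / 3] * 3          # equal preference among 60 students
stat = chi_square_statistic(observed, expected)
p_value = right_tail_df2(stat)
print(f"chi-square = {stat:.2f}, right-tail probability = {p_value:.4f}")
```

A large right-tail probability means the observed counts are well within what random sampling would produce under equal preference.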

Example 

A candy company claims that their four-color candy packs contain equal numbers of red, green, blue, and yellow candies. A sample of 80 candies is collected.

 Calculate the expected count for each color under the null hypothesis.

▶️ Answer / Explanation

Total sample size: \(N = 80\)

Expected proportion for each color: \(p_i = 0.25\)

Expected count: \(E_i = N \times p_i = 80 \times 0.25 = 20\)

So, we expect 20 candies of each color if the null hypothesis is true.
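
The same calculation can be sketched in a few lines of Python. The helper name `expected_counts` is an illustrative choice; it also checks that the hypothesized proportions sum to 1, which any null distribution must satisfy.

```python
import math

def expected_counts(n, proportions):
    """Return E_i = n * p_i for each hypothesized proportion p_i."""
    if not math.isclose(sum(proportions), 1.0):
        raise ValueError("hypothesized proportions must sum to 1")
    return [n * p for p in proportions]

colors = ["Red", "Green", "Blue", "Yellow"]
counts = expected_counts(80, [0.25] * 4)
for color, e in zip(colors, counts):
    print(f"{color}: {e:g}")
```

The same function handles unequal hypothesized proportions, since each expected count is simply the sample size times that category's proportion.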

Example

Observed counts of candy colors: Red = 18, Green = 22, Blue = 20, Yellow = 20. Expected counts: 20 each.

 Calculate the chi-square statistic.

▶️ Answer / Explanation

\(\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(18-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(20-20)^2}{20}\)

\(\chi^2 = \frac{4}{20} + \frac{4}{20} + 0 + 0 = 0.2 + 0.2 = 0.4\)

The chi-square statistic is 0.4, measuring how far the observed counts deviate from expected counts relative to expected counts. With four categories, it is compared to a chi-square distribution with \(df = 4 - 1 = 3\).
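
To see where the total comes from, the sketch below recomputes the statistic from the counts and lists each color's contribution \((O_i - E_i)^2 / E_i\). The variable names are illustrative.

```python
observed = {"Red": 18, "Green": 22, "Blue": 20, "Yellow": 20}
expected = {color: 20 for color in observed}  # 80 candies * 0.25 each

# Contribution of each category to the chi-square statistic.
contributions = {
    color: (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
}
stat = sum(contributions.values())
df = len(observed) - 1  # degrees of freedom: k - 1

for color, value in contributions.items():
    print(f"{color}: {value:.2f}")
print(f"chi-square = {stat:.2f} with df = {df}")
```

Listing the contributions separately shows which categories drive the statistic; here only Red and Green differ from their expected counts.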

Null and Alternative Hypotheses in a Chi-Square Goodness-of-Fit Test

Purpose: The hypotheses state what the researcher expects about the distribution of categorical data and what they want to test.

Null Hypothesis (\(H_0\)): The population proportions are equal to the claimed or expected proportions.

Example: \(H_0: p_\text{Red} = 0.25, p_\text{Green} = 0.25, p_\text{Blue} = 0.25, p_\text{Yellow} = 0.25\)

Alternative Hypothesis (\(H_a\)): At least one of the population proportions differs from the expected proportion.

Example: \(H_a: \text{At least one } p_i \neq \text{expected value}\)

Notes:

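  • The alternative hypothesis is non-directional: the test detects deviations in any direction from the expected distribution. Because the statistic squares each deviation, the p-value is always taken from the right tail of the chi-square distribution.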
  • The hypotheses must be stated in terms of population proportions, not sample proportions.

Example 

A candy company claims that their four-color candy packs contain equal numbers of red, green, blue, and yellow candies. A random sample of 80 candies is collected.

State the null and alternative hypotheses for a chi-square goodness-of-fit test.

▶️ Answer / Explanation

Step 1 — Null Hypothesis:

\(H_0:\) The population proportions are equal: \(p_\text{Red} = p_\text{Green} = p_\text{Blue} = p_\text{Yellow} = 0.25\)

Step 2 — Alternative Hypothesis:

\(H_a:\) At least one population proportion differs from the hypothesized value of 0.25.

Step 3 — Notes: The chi-square test will evaluate whether the observed deviations from 0.25 for any color are too large to attribute to random chance.

Testing Method for a Distribution of Proportions (Categorical Data)

Purpose: To determine whether the observed counts in categorical data are consistent with a claimed distribution of population proportions.

Appropriate Test:

Chi-Square Goodness-of-Fit Test: Used when you have one categorical variable with \(k\) categories and you want to compare the observed counts to expected counts based on a hypothesized distribution.

Requirements:

  • Observations are from a random sample or randomized experiment.
  • Expected counts for each category are at least 5.
  • Observations are independent (10% condition if sampling without replacement).

Hypotheses:

  • Null hypothesis (\(H_0\)): The population proportions match the hypothesized distribution.
  • Alternative hypothesis (\(H_a\)): At least one population proportion differs from its hypothesized value.

Notes:

  • The alternative is always non-directional: the test detects any deviation from the expected distribution, and the p-value comes from the right tail of the chi-square distribution.
  • Test statistic: \(\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}\), compared to a chi-square distribution with \(df = k - 1\).

Example 

A candy company claims that their four-color candy packs contain equal numbers of red, green, blue, and yellow candies. A random sample of 80 candies is collected.

 What is the appropriate statistical test to determine if the observed counts differ from the claimed distribution?

▶️ Answer / Explanation

Step 1 — Identify the scenario:

One categorical variable (candy color) with four categories; comparing observed counts to expected counts.

Step 2 — Choose the test:

The appropriate test is the Chi-Square Goodness-of-Fit Test.

Step 3 — Conditions:

  • Random sample of candies ✅
  • Expected counts: 80 × 0.25 = 20 ≥ 5 ✅
  • Observations are independent ✅

All conditions are satisfied; the chi-square goodness-of-fit test can be applied.

Conditions for Chi-Square Goodness-of-Fit Test

Before performing a chi-square goodness-of-fit test, the following conditions must be verified to ensure valid inferences:

1. Random: The data must come from a random sample or randomized experiment.

Random selection guards against selection bias, so conclusions can be generalized to the population.

2. Expected Counts Large Enough: Each expected count \(E_i\) should be at least 5.

This ensures that the chi-square approximation is valid.

3. Independence: Observations must be independent.

If sampling without replacement, the 10% condition must be met: \(n \le 0.1 N\), where \(n\) is the sample size and \(N\) here denotes the population size (not the sample size, as in \(E_i = N \times p_i\)).

Notes:

  • Failure to meet these conditions may invalidate the results of the test.
  • Randomness, independence, and sufficiently large expected counts together ensure that the chi-square statistic follows the chi-square distribution approximately under \(H_0\).

Example 

A candy company claims that its four-color candy packs contain equal numbers of red, green, blue, and yellow candies. A random sample of 80 candies produces:

  • Observed: Red = 18, Green = 22, Blue = 20, Yellow = 20
  • Expected: 20 each

 Verify that the conditions are satisfied for a chi-square goodness-of-fit test.

▶️ Answer / Explanation

Step 1 — Random: The candies are randomly selected.

Step 2 — Expected counts: Each expected count \(E_i = 20\) ≥ 5.

Step 3 — Independence: The candies are sampled without replacement, but it is reasonable to assume that 80 candies are less than 10% of all candies the company produces, so the 10% condition is satisfied.

Conclusion: All conditions are met; it is appropriate to perform the chi-square goodness-of-fit test.
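
The numeric parts of this check can be sketched as a small helper. Everything here is illustrative: the function name `check_conditions` is hypothetical, the population size of 10,000 candies is an assumed figure that the example does not give, and whether the sample is random must be judged from the study design rather than computed.

```python
def check_conditions(n, proportions, population_size=None, is_random=True):
    """Report the three goodness-of-fit conditions as True/False (None if unknown)."""
    expected = [n * p for p in proportions]
    return {
        # Randomness cannot be computed; it is supplied by the user.
        "random": is_random,
        # Every expected count must be at least 5.
        "large_counts": all(e >= 5 for e in expected),
        # 10% condition: sample size at most 10% of the population size.
        "ten_percent": None if population_size is None
                       else n <= 0.1 * population_size,
    }

# ASSUMPTION: a population of 10,000 candies, chosen only for illustration.
result = check_conditions(80, [0.25] * 4, population_size=10_000)
for name, ok in result.items():
    print(f"{name}: {ok}")
```

With a much smaller sample, for example 12 candies across four colors, each expected count would be 3 and the large-counts check would fail.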
