IB Mathematics AI SL Formulation of null and alternative hypotheses MAI Study Notes- New Syllabus
LEARNING OBJECTIVE
- Formulation of null and alternative hypotheses
Key Concepts:
- Formulation of Hypotheses
- Significance Levels and p-values
- Expected vs. Observed Frequencies
- χ² Test for Independence
- χ² Goodness of Fit Test
- t-Tests and Population Means
- Comparison of One-tailed vs. Two-tailed Tests
FORMULATION AND TEST OF HYPOTHESES
Formulation of Hypotheses
Hypothesis testing allows us to assess claims about a population using sample data.
Null hypothesis: $H_0$
Alternative hypothesis: $H_1$ or $H_a$
Types of Hypotheses
One-tailed test:
$H_1: \mu >\mu_0$ (right-tailed)
$H_1: \mu < \mu_0$ (left-tailed)
Two-tailed test:
$H_1: \mu \ne \mu_0$
Example
A researcher claims the average weight of a population is 75 kg. A student believes this claim is incorrect and takes a random sample of 40 people. Test the student's belief at the 5% level of significance.
Answer/Explanation
The calculation uses a sample mean of $\bar{x} = 72.5$ kg, a standard deviation of $\sigma = 6$ kg, and $n = 40$.
Null hypothesis $H_0$: $\mu = 75$
Alternative hypothesis $H_1$: $\mu \ne 75$
This is a two-tailed test, since we are checking whether the mean is not equal to 75.
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{72.5 - 75}{6 / \sqrt{40}} = \frac{-2.5}{0.9487} \approx -2.635$
For a two-tailed test at $\alpha = 0.05$, the critical values are $\pm 1.96$.
Since $-2.635 < -1.96$, the test statistic falls in the rejection region, so we reject $H_0$.
At the 5% level of significance, there is sufficient evidence to conclude that the population mean is not equal to 75 kg.
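The arithmetic in this example can be checked with a short Python sketch. It uses only the numbers that appear in the calculation (sample mean 72.5, standard deviation 6, n = 40, critical value 1.96); the function name `z_statistic` is an illustrative choice, not part of any GDC or syllabus notation.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Return z = (sample_mean - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

z = z_statistic(72.5, 75, 6, 40)

# Two-tailed critical value at alpha = 0.05, as quoted in the example.
critical = 1.96

# Reject H0 when the statistic lies beyond either critical value.
reject_h0 = abs(z) > critical

print(round(z, 3), reject_h0)
```

The comparison uses `abs(z)` because a two-tailed test rejects in both tails.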
SIGNIFICANCE LEVELS AND P-VALUES
Significance Levels and p-values
Significance Level (α):
Common values: $0.05 (5\%), 0.01 (1\%), 0.10 (10\%)$
Defines the cutoff for rejecting $H_0$.
p-value:
The probability of observing a result as extreme or more extreme than the actual sample result, assuming H₀ is true.
Interpretation:
If $p\text{-value} < \alpha$: Reject $H_0$ $\Rightarrow$ evidence supports $H_1$.
If $p\text{-value} \geq \alpha$: Fail to reject $H_0$ $\Rightarrow$ insufficient evidence to support $H_1$.
Calculator Use (TI or GDC):
Use $\text{normalcdf}$ or $\text{tcdf}$ functions to calculate p-values based on the test statistic.
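As a rough Python analogue of `normalcdf`, the standard library's `statistics.NormalDist` gives normal tail probabilities. The sketch below turns a z statistic into a p-value and applies the decision rule from the Interpretation list. The names `p_value_from_z`, `decide`, and the `tail` parameter are assumptions made for this illustration; the standard library has no t-distribution, so this covers the normal case only.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """p-value for a z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("tail must be 'left', 'right', or 'two'")

def decide(p_value, alpha=0.05):
    """Reject H0 only when the p-value is strictly below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# z = -2.635 is the statistic from the 75 kg example earlier in these notes.
p = p_value_from_z(-2.635, tail="two")
print(round(p, 4), decide(p))
```

The two-tailed p-value doubles the area in one tail, which is why the same statistic gives a larger p-value in a two-tailed test than in a one-tailed test.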
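Example
To compare the mean weights of two populations A and B, we take a sample from each. The GDC gives sample means $\bar{x}_1 = 70.1$ and $\bar{x}_2 = 63.1$. With $\alpha = 0.05$, Ann claims that $\mu_1 > \mu_2$, and Bill claims that the population means are different.
Answer/Explanation
(a) Ann’s claim: we perform a one-tailed t-test. GDC gives p-value = 0.041. Since $0.041 < 0.05$, we reject $H_0$.
(b) Bill’s claim: we perform a two-tailed t-test. GDC gives p-value = 0.082. Since $0.082 \geq 0.05$, we fail to reject $H_0$.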
EXPECTED VS. OBSERVED FREQUENCIES
Expected vs. Observed Frequencies
Observed Frequency (O): data from the sample
Expected Frequency (E): predicted assuming $H_0$ is true
$
E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}
$
Example: Expected vs. Observed Frequencies
In a survey of 80 people, we inquired about their preferred sport.
Answer/Explanation
The observed frequencies are the survey counts; the expected frequencies are computed from the row and column totals. For the first entry, with column total 30 and row total 36:
$E = \frac{(\text{column total}) \times (\text{row total})}{\text{grand total}} = \frac{30 \times 36}{80} = 13.5$
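The expected-frequency formula is easy to sketch in Python. The first call uses the totals quoted in the example (row total 36, column total 30, grand total 80). The 2x2 table is hypothetical and is there only to show how every cell of a table could be filled in; the function names are illustrative.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected count for one cell, assuming H0 (independence) is true."""
    return row_total * column_total / grand_total

def expected_table(observed):
    """Expected counts for every cell of a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[expected_frequency(r, c, grand_total) for c in column_totals]
            for r in row_totals]

# Totals quoted in the example above.
first_cell = expected_frequency(36, 30, 80)

# Hypothetical 2x2 observed table (not the survey data).
hypothetical = [[12, 8], [18, 22]]
table = expected_table(hypothetical)

print(first_cell, table)
```

Note that the expected counts keep the same row and column totals as the observed table.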
CHI-SQUARE TEST FOR INDEPENDENCE
Chi-Square Test for Independence
Test whether two categorical variables are independent.
Test Statistic:
$
\chi^2 = \sum \frac{(O - E)^2}{E}
$
Degrees of Freedom:
$
\text{df} = (r - 1)(c - 1)
$
Conditions:
- Expected frequencies $\geq 5$
- Random sample
Example
In a survey of 80 people, we inquired about their preferred sport. Test whether favorite sport is independent of gender. Use the significance level $\alpha = 0.05$.
Answer/Explanation
$H_0$: gender and favorite sport are independent
$H_1$: gender and favorite sport are not independent
GDC gives $\chi^2 \text{ statistic} = 7.00$.
Since $p\text{-value} < 0.05$, we reject $H_0$: there is evidence that favorite sport and gender are not independent.
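A minimal Python sketch of the test for independence follows. The 2x3 table is hypothetical (it is not the survey data, which is not reproduced in these notes), and the function name is illustrative. The p-value line relies on a special case: for df = 2 only, the upper-tail probability of the chi-squared distribution is exactly $e^{-\chi^2/2}$.

```python
import math

def chi_square_independence(observed):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / grand_total  # expected count
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical 2x3 table (NOT the survey data from the example).
observed = [[20, 15, 15],
            [10, 20, 20]]
chi2, df = chi_square_independence(observed)

# Closed form valid for df = 2 only.
p_value = math.exp(-chi2 / 2)

print(round(chi2, 3), df, round(p_value, 3))
```

For this hypothetical table the p-value is above 0.05, so $H_0$ would not be rejected. For other degrees of freedom, a GDC or a statistics library is the usual route to the p-value.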
Chi-Square Goodness of Fit Test
Test if a sample distribution matches a theoretical distribution (e.g., uniform).
Test Statistic:
$
\chi^2 = \sum \frac{(O - E)^2}{E}
$
Degrees of Freedom:
$
\text{df} = k - 1
$
Where $k$ = number of categories
Chi-Squared Goodness of Fit Test – Steps
Step 1: Set Up Hypotheses
- H₀: The observed data fits the expected distribution.
- H₁: The observed data does not fit the expected distribution.
- Clearly define the variable you are testing.
Step 2: Determine Expected Frequencies
- Use total sample size and the proposed model to calculate expected values.
- For a uniform distribution: divide total frequency by the number of categories.
Step 3: Calculate Degrees of Freedom
- \( \nu = \text{number of categories} - 1 \)
- This is needed to reference chi-squared distribution tables or compute the p-value.
Step 4: Use Technology (GDC)
- Input Observed and Expected values as lists.
- Use the chi-squared test function to find:
- Chi-squared statistic: \( \chi^2_{\text{calc}} \)
- p-value
Step 5: Make a Decision
- Compare with a significance level (e.g., \( \alpha = 0.05 \))
- Option A: If \( \chi^2_{\text{calc}} > \chi^2_{\text{critical}} \), reject H₀
- Option B: If p-value < \( \alpha \), reject H₀
Step 6: Conclusion
- Reject H₀: Evidence suggests the data does not follow the given distribution.
- Fail to reject H₀: No strong evidence against the given distribution.
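The six steps above can be sketched in Python using only the standard library. The observed counts are hypothetical and the model is uniform. The helper `chi2_upper_tail` is written for this illustration using the standard finite-sum formulas for integer degrees of freedom; where SciPy is available, `scipy.stats.chi2.sf` would normally be used instead.

```python
import math
from statistics import NormalDist

def chi2_upper_tail(x, df):
    """P(X > x) for a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r < df/2} (x/2)^r / r!
        total, term = 0.0, 1.0
        for r in range(df // 2):
            total += term
            term *= (x / 2) / (r + 1)
        return math.exp(-x / 2) * total
    # Odd df: normal tail term plus a finite sum in sqrt(x).
    chi = math.sqrt(x)
    std_normal = NormalDist()
    total, term = 0.0, chi
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return 2 * (1 - std_normal.cdf(chi)) + 2 * std_normal.pdf(chi) * total

def goodness_of_fit(observed, expected):
    """Return (chi2, df, p_value) for observed vs. expected counts."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                      # Step 3
    return chi2, df, chi2_upper_tail(chi2, df)  # Step 4

# Hypothetical counts for 4 categories, tested against a uniform model.
observed = [8, 12, 9, 11]
n = sum(observed)
expected = [n / len(observed)] * len(observed)  # Step 2: total / categories

chi2, df, p_value = goodness_of_fit(observed, expected)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"  # Step 5
print(round(chi2, 3), df, round(p_value, 3), decision)
```

With these hypothetical counts the p-value is well above 0.05, so there is no strong evidence against the uniform model.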
Example Philipp claims that the supporters of Football teams A, B, C and D are as follows:
In a sample of 40 people, we found:
We test Philipp’s claim at $\alpha = 0.05$.
Answer/Explanation
Goodness of fit chi-squared test with $H_0$: the data follow the given distribution.
Use GDC: Statistics → TEST → CHI → GOF
GDC gives $\chi^2 \text{ statistic} = 0.210$.
Since $p\text{-value} > 0.05$, we do not have enough evidence to reject $H_0$. The sample is consistent with Philipp’s claim about the distribution of supporters.
t-Tests and Comparison of Means
When population standard deviation is unknown → use t-distribution
One-sample t-test:
$
t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}
$
Where:
$\bar{x}$ = sample mean
$\mu$ = hypothesized population mean (the value stated in $H_0$)
$s$ = sample standard deviation
$n$ = sample size
$\text{df} = n - 1$
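A short Python sketch of the one-sample t statistic, using a hypothetical sample of five weights and a hypothesized mean of 75. The function name `one_sample_t` is illustrative. The p-value is left to a GDC (`tcdf`), since Python's standard library has no t-distribution.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for H0: mu = mu0, using the sample standard deviation."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical sample of five weights (kg), tested against mu0 = 75.
sample = [72, 75, 71, 74, 73]
t, df = one_sample_t(sample, 75)
print(round(t, 3), df)
```

The statistic is negative here because the sample mean lies below the hypothesized mean.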
Two-sample t-test (unpaired):
$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
$
Where:
$\bar{x}_1, \bar{x}_2$ = sample means
$s_1, s_2$ = sample standard deviations
$n_1, n_2$ = sample sizes
Degrees of freedom: a conservative approximation is
$
\text{df} = \min(n_1 - 1, n_2 - 1)
$
Assumptions:
Independent samples
Normal distribution (or large sample size)
Equal variances not required (Welch’s test used)
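The unpaired two-sample statistic can be sketched in Python from summary statistics. All numbers below are hypothetical. Besides the $\min(n_1 - 1, n_2 - 1)$ rule given in these notes, the sketch also computes the Welch-Satterthwaite degrees of freedom, which is an addition not stated above; it is the value statistical software commonly reports for the unequal-variance test.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df_conservative, df_welch) for an unpaired two-sample t-test."""
    v1 = s1 ** 2 / n1  # squared standard error contributed by sample 1
    v2 = s2 ** 2 / n2  # squared standard error contributed by sample 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation (not part of the notes above).
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

# Hypothetical summary statistics: mean, standard deviation, size per sample.
t, df_min, df_ws = welch_t(52.0, 5.0, 25, 48.0, 4.0, 16)
print(round(t, 3), df_min, round(df_ws, 2))
```

The conservative rule never gives more degrees of freedom than the Welch-Satterthwaite value, so it yields a p-value that is at least as large.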
ONE-TAILED VS. TWO-TAILED TESTS
One-tailed vs. Two-tailed Tests
One-tailed Test:
Tests for effect in a single direction:
$
H_1: \mu > \mu_0 \quad \text{or} \quad H_1: \mu < \mu_0
$
Two-tailed Test:
Tests for difference in either direction:
$
H_1: \mu \ne \mu_0
$
Critical Region:
One-tailed: entire $\alpha$ in one tail
Two-tailed: split $\alpha$ into both tails
(e.g., 0.025 in each if $\alpha = 0.05$)
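To see how splitting $\alpha$ changes the cutoff, the sketch below computes critical values for a z-test with `statistics.NormalDist.inv_cdf`. The function name `z_critical` and its `tails` parameter are illustrative. It applies to the normal distribution only; a t-test would use critical values from the t-distribution with the appropriate degrees of freedom.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Positive critical z value for a one-tailed (1) or two-tailed (2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # One-tailed: all of alpha in one tail. Two-tailed: alpha / 2 in each tail.
    return NormalDist().inv_cdf(1 - alpha / tails)

one_tailed = z_critical(0.05, 1)
two_tailed = z_critical(0.05, 2)
print(round(one_tailed, 3), round(two_tailed, 3))
```

The two-tailed cutoff is further from zero than the one-tailed cutoff at the same $\alpha$, so a statistic can be significant in a one-tailed test but not in a two-tailed one.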
Example
To compare the mean weights between two populations A and B we obtain two samples. (GDC gives that the two sample means are \(\bar{x}_1 = 70.1\) and \(\bar{x}_2 = 63.1\).) We will test two different claims for the population means \(\mu_1, \mu_2\) with \(\alpha = 0.05\):
(a) Ann claims that \(\mu_1 > \mu_2\)
(b) Bill claims that the population means are different
Answer/Explanation
(a) We perform a one-tailed t-test:
$ \begin{aligned} H_0 &: \mu_1 = \mu_2 \\ H_1 &: \mu_1 > \mu_2 \end{aligned} $
GDC gives p-value = 0.041
Since p-value < 0.05, we reject \(H_0\). That is, the data support Ann’s claim that \(\mu_1 > \mu_2\).
(b) We perform a two-tailed t-test:
$ \begin{aligned} H_0 &: \mu_1 = \mu_2 \\ H_1 &: \mu_1 \ne \mu_2 \end{aligned} $
GDC gives p-value = 0.082
Since p-value > 0.05, we do not have enough evidence to reject \(H_0\). Bill’s claim is not supported at the 5% level.
Note that the two-tailed p-value is twice the one-tailed p-value ($0.082 = 2 \times 0.041$), so the same data can lead to different decisions depending on the alternative hypothesis.