IB Mathematics AHL 4.13 Non-linear regression: AI HL Paper 3 - Exam Style Questions - New Syllabus

Question

An investigation is carried out to contrast the final results of two different educational institutions, School $A$ and School $B$.

Every student at both institutions sits the same concluding examination at age $18$. A researcher named Aayush is evaluating these outcomes.

Aayush selects a representative group of $6$ students from each institution to participate in detailed interviews regarding their performance.

(a) (i) Identify one benefit of utilizing a larger sample size for this study.

(ii) Identify one drawback of increasing the number of students in the sample.

The percentages achieved by the $6$ sampled students from School $A$ are presented in Table $1$ below.

Table $1$ (School $A$ final examination percentages): $63.5, 52.5, 50.7, 42.8, 44.7, 56.1$

The average score for the School $A$ group is $51.7$ when rounded to $3$ significant figures.

(b) Calculate the value of the unbiased estimate of the population standard deviation, $s_{n-1}$, for the School $A$ sample.

The corresponding $s_{n-1}$ value for the School $B$ group is recorded as $7.66$.

Aayush suggests that: “The variation in marks at School $A$ is definitely lower than the variation at School $B$.”

(c) Give a reason why Aayush’s statement may not be valid.

The exam authorities state that scores in both institutions follow an approximately normal distribution. Assume this information is valid.

Aayush intends to perform a pooled $t$-test to determine if there is a significant difference between the average marks of the two schools.

(d) (i) Specify the requirement regarding population variances that must be met to perform a pooled $t$-test.

(ii) Determine if it is appropriate for Aayush to proceed with a pooled $t$-test here. Support your choice.

Before the study, Aayush hypothesized that School $B$ would have a higher average score than School $A$. His data shows that the average score for the School $B$ sample is exactly $60$.

(e) (i) Formulate the null and alternative hypotheses for this pooled $t$-test.

(ii) Compute the $p$-value for the test.

(iii) Using a $5\%$ significance level, state the conclusion of the test in the context of the study. Provide a justification.

At age $11$, all students had previously taken the same entrance test.

Aayush wants to see if there is a relationship between the initial entrance test score and the final exam score. He obtains the age $11$ results for the $12$ students in his samples.

The Pearson’s product-moment correlation coefficient for these two sets of scores is $r = 0.876$. The critical value for $r$ at a $5\%$ significance level is $0.576$.

(f) (i) Conduct a hypothesis test at the $5\%$ level of significance. State your hypotheses and whether you reject the null hypothesis.

(ii) If the assumptions for a Pearson’s test are violated, name a more suitable correlation test to use.

The exam board uses a linear model to forecast a student’s final mark ($\hat{y}$) from their entrance mark ($x$): $\hat{y} = 0.37x + 37.6$

(g) Explain the contextual meaning of the gradient $0.37$ in this specific model.

Aayush calculates the “school value added” for each student using the formula $y - \hat{y}$ (rounded to $1$ decimal place). These values are shown in Table $2$.

(h) (i) Verify that the value represented by $q$ in Table $2$ is $0.6$.

(ii) Use a pooled $t$-test at the $5\%$ level to check if the average “school value added” is greater in School $A$ than in School $B$. State your hypotheses and justify your final decision.

(i) Based on the various tests performed, explain how both School $A$ and School $B$ could each argue that they are the superior institution.

Most-appropriate topic codes (IB Mathematics Applications and Interpretation HL):

• AHL 4.18: Hypothesis testing for population mean ($t$-tests) — parts (d), (e), (h)(ii)
• AHL 4.14: Unbiased estimates of population parameters ($s_{n-1}$) — parts (b), (c)
• AHL 4.13: Least squares regression analysis and predictions — part (g)
• SL 4.10: Spearman’s rank correlation coefficient — part (f)(ii)
• SL 4.4: Linear correlation and regression — part (f)(i)
• SL 4.1: Concepts of sampling and reliability — part (a)

Answer/Explanation

(a)(i)
Answer: The sample will be more reliable / more representative of the population / the sample mean is likely to be closer to the population mean.

(a)(ii)
Answer: More time consuming / expensive / more open to human error.

(b)
Method: Using GDC with data: $63.5, 52.5, 50.7, 42.8, 44.7, 56.1$
$\boxed{7.60}$ ($7.59537…$)
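
For readers without a GDC, the estimate can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`school_a`, `s_a`) are illustrative, and the data are the six School $A$ marks listed in the method line above.

```python
import math
import statistics

# School A sample marks (%), as listed in the method for part (b)
school_a = [63.5, 52.5, 50.7, 42.8, 44.7, 56.1]

n = len(school_a)
mean_a = sum(school_a) / n

# Unbiased estimate of the population variance: divide by n - 1, not n
variance_unbiased = sum((x - mean_a) ** 2 for x in school_a) / (n - 1)
s_a = math.sqrt(variance_unbiased)

# statistics.stdev uses the same n - 1 divisor, so the two should agree
assert math.isclose(s_a, statistics.stdev(school_a))

print(f"mean = {mean_a:.1f}, s_(n-1) = {s_a:.2f}")
```

The printed values should agree with the stated mean of $51.7$ and the boxed answer of $7.60$ to $3$ significant figures.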

(c)
Answer: The population standard deviations may differ from the sample estimates / these $s_{n-1}$ values are only estimates and are too close together to support the claim / standard deviation is not the only measure of spread.

(d)(i)
Answer: The population variances are the same/equal.

(d)(ii)
Method: Compare sample standard deviations or variances
Sample $A$: $s_{n-1} = 7.60$, Sample $B$: $s_{n-1} = 7.66$
These are similar, so the assumption of equal population variances is plausible.
Yes, Aayush should use a pooled $t$-test.

(e)(i)
Answer: $H_0: \mu_A = \mu_B$, $H_1: \mu_A < \mu_B$

(e)(ii)
Method: Using GDC with pooled $t$-test
Sample $A$: $n=6, \bar{x}=51.7, s=7.60$
Sample $B$: $n=6, \bar{x}=60, s=7.66$
$\boxed{0.0445}$ ($0.0444586…$)
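
The GDC hides the arithmetic behind the pooled $t$-test, so the following Python sketch spells it out using only the summary statistics quoted in the method above. The helper names (`t_pdf`, `t_cdf`, `pooled_t`) are illustrative, and the $t$-distribution tail is approximated by Simpson's rule rather than by a built-in function, so treat it as a check under those assumptions rather than a definitive implementation.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two unbiased variance estimates
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Summary statistics as quoted in the method above (School A first)
t_stat, df = pooled_t(51.7, 7.60, 6, 60, 7.66, 6)

# H1: mu_A < mu_B, so the p-value is the lower tail P(T <= t_stat)
p_value = t_cdf(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}")
```

With $n_A + n_B - 2 = 10$ degrees of freedom, the printed $p$-value should be close to the $0.0445$ reported by the GDC.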

(e)(iii)
Method: Compare $p$-value with significance level $\alpha = 0.05$
$0.0445 < 0.05$ $\Rightarrow$ Reject $H_0$
There is significant evidence that the population mean of School $B$ is higher than the population mean of School $A$ / Aayush’s belief is supported.

(f)(i)
Method: Correlation hypothesis test
$H_0: \rho = 0, H_1: \rho \neq 0$
Test statistic: $r = 0.876$
Critical value: $0.576$ (at $5\%$ significance level for $n=12$)
$0.876 > 0.576$ $\Rightarrow$ Reject $H_0$
There is significant evidence of correlation between entrance test and final exam results.

(f)(ii)
$\boxed{\text{Spearman’s rank correlation coefficient}}$
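
The quoted critical value can be cross-checked against the $t$-distribution, because under $H_0$ the statistic $t = r\sqrt{n-2}/\sqrt{1-r^2}$ has $n-2$ degrees of freedom. The sketch below assumes the two-tailed $5\%$ critical value $t = 2.228$ for $10$ degrees of freedom, taken from standard tables; that figure and the variable names are not part of the original question.

```python
import math

r = 0.876   # sample correlation coefficient from the question
n = 12      # number of students (6 from each school)
df = n - 2

# Under H0: rho = 0, this statistic follows a t-distribution with n - 2 df
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumption: two-tailed 5% critical value of t with 10 df, from standard tables
t_crit = 2.228

# Equivalent critical value on the r scale
r_crit = t_crit / math.sqrt(t_crit ** 2 + df)

reject_h0 = abs(r) > r_crit
print(f"t = {t_stat:.2f}, critical r = {r_crit:.3f}, reject H0: {reject_h0}")
```

The critical $r$ recovered this way rounds to the $0.576$ given in the question, so both routes lead to rejecting $H_0$.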

(g)
For each additional percentage point on the entrance test, the predicted final exam result increases by $0.37$ percentage points.

(h)(i)
Method: For candidate $10003$: $x = 38.7$
Predicted: $\hat{y} = 0.37 \times 38.7 + 37.6 = 14.319 + 37.6 = 51.919$
School value added: $y - \hat{y} = 52.5 - 51.919 = 0.581$
Rounded to $1$ decimal place: $0.6$
$\boxed{0.6}$
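
The same residual calculation can be written as a short Python sketch. The function names `predicted_final` and `value_added` are illustrative; the coefficients come from the exam board's model and the inputs are the candidate $10003$ values used above.

```python
def predicted_final(entrance_mark):
    """Exam board's linear model: y_hat = 0.37x + 37.6."""
    return 0.37 * entrance_mark + 37.6

def value_added(entrance_mark, final_mark):
    """School value added: actual final mark minus predicted final mark."""
    return final_mark - predicted_final(entrance_mark)

# Candidate 10003: entrance mark 38.7, final mark 52.5 (values from the method above)
q = value_added(38.7, 52.5)
print(round(q, 1))
```

A positive value means the student scored above the model's prediction, and a negative value means they scored below it.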

(h)(ii)
Method: Pooled $t$-test on school value added data
$H_0: \mu_A = \mu_B$ (mean school value added equal)
$H_1: \mu_A > \mu_B$ (mean school value added higher in School $A$)
Using GDC with pooled $t$-test:
$p$-value $= 0.0213$ ($0.0212715…$)
$0.0213 < 0.05$ $\Rightarrow$ Reject $H_0$
There is significant evidence at the $5\%$ level that the mean school value added is higher in School $A$.

(i)
Answer: School $B$ can argue that it outperforms School $A$, supported by the test in part (e) (higher mean final mark). School $A$ can argue that it outperforms School $B$, supported by the test in part (h) (greater school value added).
