IBDP MAI: Topic 4: Statistics and probability - AHL 4.18 - Type I and II errors including calculations of their probabilities. Exam Style Questions Paper 3

Question

Aimmika is the manager of a grocery store in Nong Khai. She is carrying out a statistical analysis on the number of bags of rice that are sold in the store each day. She collects the following sample data by recording how many bags of rice the store sells each day over a period of 90 days.

She believes that her data follows a Poisson distribution.

Aimmika knows from her historic sales records that the store sells an average of 4.2 bags of rice each day. The following table shows the expected frequency of bags of rice sold each day during the 90-day period, assuming a Poisson distribution with mean 4.2.

Aimmika decides to carry out a $\chi^2$ goodness of fit test at the $5 \%$ significance level to see whether the data follows a Poisson distribution with mean 4.2.

Aimmika claims that advertising in a local newspaper for 300 Thai Baht (THB) per day will increase the number of bags of rice sold. However, Nichakarn, the owner of the store, claims that the advertising will not increase the store’s overall profit.

Nichakarn agrees to advertise in the newspaper for the next 60 days. During that time, Aimmika records that the store sells 282 bags of rice with a profit of $495 \mathrm{THB}$ on each bag sold.

Aimmika wants to carry out an appropriate hypothesis test to determine whether the number of bags of rice sold during the 60 days increased when compared with the historic sales records.
a.i. Find the mean and variance for the sample data given in the table.
a.ii. Hence state why Aimmika believes her data follows a Poisson distribution.
b. State one assumption that Aimmika needs to make about the sales of bags of rice to support her belief that it follows a Poisson distribution.
c. Find the value of $a$, of $b$, and of $c$. Give your answers to 3 decimal places.
d.i. Write down the number of degrees of freedom for her test.
d.ii. Perform the $\chi^2$ goodness of fit test and state, with reason, a conclusion.
e.i. By finding a critical value, perform this test at a $5 \%$ significance level.
e.ii. Hence state the probability of a Type I error for this test.
f. By considering the claims of both Aimmika and Nichakarn, explain whether the advertising was beneficial to the store.

▶️Answer/Explanation

a.i. mean $=4.23(4.23333 \ldots)$
A1
variance $=4.27(4.26777 \ldots)$
A1
[2 marks]
a.ii. mean is close to the variance
A1
[1 mark]
b. One of the following:
the number of bags sold each day is independent of any other day
the sale of one bag is independent of any other bag sold
the sales of bags of rice (each day) occur at a constant mean rate
A1
Note: Award A1 for a correct answer in context. Any statement referring to independence must refer to either the independence of each bag sold or the independence of the number of bags sold each day. If the third option is seen, the statement must refer to a “constant mean” or “constant average”. Do not accept “the number of bags sold each day is constant”.
[1 mark]
c. attempt to find Poisson probabilities and multiply by 90
(M1)
$a=7.018$
A1
$b=17.498$
A1
EITHER
$90 \times \mathrm{P}(X \geq 8)=90 \times(1-\mathrm{P}(X \leq 7)) \quad$ (M1)
$c=5.755 \quad$ A1

OR
$90-7.018-11.903-16.665-17.498-14.698-10.289-6.173$
(M1)
$c=5.756$
A1
Note: Do not penalize the omission of clear $a, b$ and $c$ labelling as this will be penalized later if correct values are interchanged.
d.i. $7 \quad \boldsymbol{A 1}$
[1 mark]
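The expected frequencies in part (c) can be checked directly from the Poisson probability formula. The sketch below uses only the Python standard library; the helper name `poisson_pmf` and the variable names are illustrative and not part of the mark scheme. It assumes, consistent with the values quoted above, that $a$ is the expected frequency for $\leq 1$ bags, $b$ for exactly 4 bags and $c$ for $\geq 8$ bags.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Return P(X = k) for X ~ Po(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


MEAN = 4.2  # historic daily mean from the question
DAYS = 90   # length of the sample period

# Expected frequency = number of days x Poisson probability of the category.
a = DAYS * (poisson_pmf(0, MEAN) + poisson_pmf(1, MEAN))      # X <= 1
b = DAYS * poisson_pmf(4, MEAN)                               # X = 4
c = DAYS * (1 - sum(poisson_pmf(k, MEAN) for k in range(8)))  # X >= 8

print(round(a, 3), round(b, 3), round(c, 3))
```

Grouping the tails as $\leq 1$ and $\geq 8$ leaves 8 categories, which is where the 7 degrees of freedom in part (d)(i) come from.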
d.ii. $\mathrm{H}_0$ : The number of bags of rice sold each day follows a Poisson distribution with mean 4.2.
A1
$\mathrm{H}_1$ : The number of bags of rice sold each day does not follow a Poisson distribution with mean 4.2.
A1
Note: Award A1A1 for both hypotheses correctly stated and in correct order. Award A1A0 if reference to the data and/or “mean 4.2” is not included in the hypotheses, but otherwise correct.
evidence of attempting to group data to obtain the observed frequencies for $\leq 1$ and $\geq 8$
(M1)
$p$-value $=0.728(0.728100 \ldots) \quad$ A2
$0.728(0.728100 \ldots)>0.05 \quad$ R1
the result is not significant so there is no reason to reject $\mathrm{H}_0$ (the number of bags sold each day follows a Poisson distribution)
A1
Note: Do not award R0A1. The conclusion MUST follow through from their hypotheses. If no hypotheses are stated, the final A1 can still be awarded for a correct conclusion as long as it is in context (e.g. therefore the data follows a Poisson distribution).
[7 marks]

e.i. METHOD 1
evidence of multiplying $4.2 \times 60$ (seen anywhere)
M1
$$
\begin{aligned}
& \mathrm{H}_0: \mu=252 \\
& \mathrm{H}_1: \mu>252
\end{aligned}
$$
A1
Note: Accept $\mathrm{H}_0: \mu=4.2$ and $\mathrm{H}_1: \mu>4.2$ for the $\boldsymbol{A 1}$.
evidence of finding probabilities around critical region
(M1)

Note: Award (M1) for any of these values seen:
$$
\begin{aligned}
& \mathrm{P}(X \geq 277)=0.0630518 \ldots \text { OR } \mathrm{P}(X \leq 276)=0.936948 \ldots \\
& \mathrm{P}(X \geq 278)=0.0558415 \ldots \text { OR } \mathrm{P}(X \leq 277)=0.944158 \ldots \\
& \mathrm{P}(X \geq 279)=0.0493055 \ldots \text { OR } \mathrm{P}(X \leq 278)=0.950694 \ldots
\end{aligned}
$$
critical value $=279$
A1
$282 \geq 279$,
R1
the null hypothesis is rejected
A1
(the advertising increased the number of bags sold during the 60 days)

Note: Do not award R0A1. Accept statements referring to the advertising being effective for $\boldsymbol{A 1}$ as long as the $\boldsymbol{R}$ mark is satisfied. For the $\boldsymbol{R 1 A 1}$, follow through within the part from their critical value.

METHOD 2
evidence of dividing 282 by 60 (or 4.7 seen anywhere)
M1
$\mathrm{H}_0: \mu=4.2$
$\mathrm{H}_1: \mu>4.2$
A1
attempt to find critical value using central limit theorem
(M1)
(e.g. sample standard deviation $=\sqrt{\frac{4.2}{60}}, X \sim N\left(4.2, \sqrt{\frac{4.2}{60}}\right)$, etc.)

Note: Award (M1) for a $p$-value of $0.0293907 \ldots$ seen.
critical value $=4.63518 \ldots$
A1

$4.7>4.63518 \ldots$
R1
the null hypothesis is rejected
A1
(the advertising increased the number of bags sold during the 60 days)

Note: Do not award R0A1. Accept statements referring to the advertising being effective for $\boldsymbol{A 1}$ as long as the $\mathbf{R}$ mark is satisfied. For the $\mathbf{R 1 A 1}$, follow through within the part from their critical value.
[6 marks]
e.ii. $(\mathrm{P}(X \geq 279 \mid \mu=252)=)\ 0.0493 \quad(0.0493055 \ldots)$
A1
Note: If a candidate uses METHOD 2 in part (e)(i), allow an FT answer of 0.05 for this part but only if the candidate has attempted to find a $p$-value.
[1 mark]
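The critical values in part (e)(i) and the Type I error probability in part (e)(ii) can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function name `poisson_upper_tail` and the variable names are illustrative assumptions, not part of the mark scheme. METHOD 1 searches for the smallest count whose upper-tail probability under $\mathrm{H}_0$ does not exceed $5 \%$; METHOD 2 uses the normal approximation for the daily mean.

```python
import math
from statistics import NormalDist


def poisson_upper_tail(k: int, mean: float) -> float:
    """Return P(X >= k) for X ~ Po(mean); terms are built in log space to avoid overflow."""
    lower = sum(
        math.exp(-mean + j * math.log(mean) - math.lgamma(j + 1))
        for j in range(k)
    )
    return 1.0 - lower


MEAN_60_DAYS = 4.2 * 60  # 252 bags expected over 60 days under H0
ALPHA = 0.05

# METHOD 1: smallest k with P(X >= k | mu = 252) <= 0.05.
critical_value = math.ceil(MEAN_60_DAYS)
while poisson_upper_tail(critical_value, MEAN_60_DAYS) > ALPHA:
    critical_value += 1

# The Type I error probability is the actual size of this critical region.
type_one_error = poisson_upper_tail(critical_value, MEAN_60_DAYS)

# METHOD 2: normal approximation for the mean number of bags per day.
standard_error = math.sqrt(4.2 / 60)
critical_mean = NormalDist(4.2, standard_error).inv_cdf(1 - ALPHA)

print(critical_value, round(type_one_error, 4), round(critical_mean, 3))
```

Because the Poisson distribution is discrete, the size of the critical region in METHOD 1 is slightly below the nominal $5 \%$, which is why the Type I error probability is not exactly 0.05.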
f. attempt to compare profit difference with cost of advertising
(M1)
Note: Award (M1) for evidence of candidate mathematically comparing a profit difference with the cost of the advertising.

EITHER
(comparing profit from 30 extra bags of rice with cost of advertising)
$14850<18000$
A1
OR
(comparing total profit with and without advertising)
$121590<124740$
A1

OR
(comparing increase of average daily profit with daily advertising cost)
$247.50<300$
A1

THEN
EITHER
Even though the number of bags of rice increased, the advertising is not worth it as the overall profit did not increase.
R1
OR

The advertising is worth it even though the increased profit is less than the cost, since the number of customers increased (possibly buying other products and/or returning in the future after advertising stops)
R1

Note: Follow through within the part for correct reasoning consistent with their comparison.
[3 marks]
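The three comparisons in part (f) are plain arithmetic on the figures given in the question. A short sketch, with illustrative variable names, assuming the 252 bags expected without advertising come from $4.2 \times 60$:

```python
PROFIT_PER_BAG = 495   # THB profit on each bag sold
AD_COST_PER_DAY = 300  # THB advertising cost per day
DAYS = 60
BAGS_SOLD = 282                   # bags sold with advertising
EXPECTED_BAGS = round(4.2 * DAYS)  # 252 bags expected without advertising

# Comparison 1: profit from the extra bags against the advertising cost.
extra_profit = (BAGS_SOLD - EXPECTED_BAGS) * PROFIT_PER_BAG
ad_cost = AD_COST_PER_DAY * DAYS

# Comparison 2: total profit with and without advertising.
net_with_ads = BAGS_SOLD * PROFIT_PER_BAG - ad_cost
net_without_ads = EXPECTED_BAGS * PROFIT_PER_BAG

# Comparison 3: increase in average daily profit against the daily cost.
extra_profit_per_day = extra_profit / DAYS

print(extra_profit, ad_cost)          # 14850 18000
print(net_with_ads, net_without_ads)  # 121590 124740
print(extra_profit_per_day)           # 247.5
```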

 

Question

A systems analyst defines the following variables in a model:
– $t$ is the number of days since the first computer was infected by the virus.
– $Q(t)$ is the total number of computers that have been infected up to and including day $t$.

The following data were collected:

A model for the early stage of the spread of the computer virus suggests that
$$
Q^{\prime}(t)=\beta N Q(t)
$$
where $N$ is the total number of computers in a city and $\beta$ is a measure of how easily the virus is spreading between computers. Both $N$ and $\beta$ are assumed to be constant.

The data above are taken from city $\mathrm{X}$ which is estimated to have 2.6 million computers.
The analyst looks at data for another city, Y. These data indicate a value of $\beta=9.64 \times 10^{-8}$.

An estimate for $Q^{\prime}(t), t \geq 5$, can be found by using the formula:
$$
Q^{\prime}(t) \approx \frac{Q(t+5)-Q(t-5)}{10}
$$

The following table shows estimates of $Q^{\prime}(t)$ for city $\mathrm{X}$ at different values of $t$.

An improved model for $Q(t)$, which is valid for large values of $t$, is the logistic differential equation
$$
Q^{\prime}(t)=k Q(t)\left(1-\frac{Q(t)}{L}\right)
$$
where $k$ and $L$ are constants.
Based on this differential equation, the graph of $\frac{Q^{\prime}(t)}{Q(t)}$ against $Q(t)$ is predicted to be a straight line.
a.i. Find the equation of the regression line of $Q(t)$ on $t$.
a.ii. Write down the value of $r$, Pearson’s product-moment correlation coefficient.
a.iii. Explain why it would not be appropriate to conduct a hypothesis test on the value of $r$ found in (a)(ii).
b.i. Find the general solution of the differential equation $Q^{\prime}(t)=\beta N Q(t)$.
b.ii. Using the data in the table write down the equation for an appropriate non-linear regression model.
b.iii. Write down the value of $R^2$ for this model.
b.iv. Hence comment on the suitability of the model from (b)(ii) in comparison with the linear model found in part (a).
b.v. By considering large values of $t$ write down one criticism of the model found in (b)(ii).
c. Use your answer from part (b)(ii) to estimate the time taken for the number of infected computers to double.
d. Find in which city, $\mathrm{X}$ or $\mathrm{Y}$, the computer virus is spreading more easily. Justify your answer using your results from part (b).
e. Determine the value of $a$ and of $b$. Give your answers correct to one decimal place.
f.i. Use linear regression to estimate the value of $k$ and of $L$.
f.ii. The solution to the differential equation is given by
$$
Q(t)=\frac{L}{1+C \mathrm{e}^{-k t}}
$$
where $C$ is a constant.
Using your answer to part (f)(i), estimate the percentage of computers in city $X$ that are expected to have been infected by the virus over a long period of time.

▶️Answer/Explanation

a.i. $Q(t)=3090 t-54000(3094.27 \ldots t-54042.3 \ldots)$
A1A1
Note: Award at most A1A0 if answer is not an equation. Award A1A0 for an answer including either $x$ or $y$.
[2 marks]
a.ii. $0.755(0.754741 \ldots)$
A1
[1 mark]
a.iii. $t$ is not a random variable OR it is not a (bivariate) normal distribution
OR data is not a sample from a population
OR data appears nonlinear
OR $r$ only measures linear correlation
R1
Note: Do not accept “$r$ is not large enough”.
[1 mark]
b.i. attempt to separate variables
(M1)
$$
\begin{aligned}
& \int \frac{1}{Q} \mathrm{~d} Q=\int \beta N \mathrm{~d} t \\
& \ln |Q|=\beta N t+c
\end{aligned}
$$

A1A1A1

Note: Award $\boldsymbol{A 1}$ for LHS, $\boldsymbol{A 1}$ for $\beta N t$, and $\boldsymbol{A 1}$ for $+c$.
Award full marks for $Q=\mathrm{e}^{\beta N t+c}$ OR $Q=A \mathrm{e}^{\beta N t}$.
Award M1A1A1A0 for $Q=\mathrm{e}^{\beta N t}$
[4 marks]

b.ii. attempt at exponential regression
(M1)
$$
Q=1.15 \mathrm{e}^{0.292 t}\left(Q=1.14864 \ldots \mathrm{e}^{0.292055 \ldots t}\right)
$$
A1
OR
attempt at exponential regression
(M1)
$$
Q=1.15 \times 1.34^t\left(1.14864 \ldots \times 1.33917 \ldots{ }^t\right)
$$
A1
Note: Condone answers involving $y$ or $x$. Condone absence of “$Q=$”. Award M1A0 for an incorrect answer in correct format.
[2 marks]
b.iii. $0.999(0.999431 \ldots)$
A1
[1 mark]
b.iv. comparing something to do with $R^2$ and something to do with $r \quad \boldsymbol{M 1}$
Note: Examples of where the $\boldsymbol{M 1}$ should be awarded:
$$
\begin{aligned}
& R^2>r \\
& R>r \\
& 0.999>0.755 \\
& 0.999>0.755^2 \quad(=0.563)
\end{aligned}
$$

The “correlation coefficient” in the exponential model is larger.
Model B has a larger $R^2$
Examples of where the $\boldsymbol{M 1}$ should not be awarded:
The exponential model shows better correlation (since not clear how it is being measured)
Model 2 has a better fit
Model 2 is more correlated
an unambiguous comparison between $R^2$ and $r^2$ or $R$ and $r$ leading to the conclusion that the model in part (b) is more suitable / better

Note: Condone candidates claiming that $R$ is the “correlation coefficient” for the non-linear model.
[2 marks]

b.v. it suggests that there will be more infected computers than the entire population
R1
Note: Accept any response that recognizes unlimited growth.
[1 mark]
c. $1.15 \mathrm{e}^{0.292 t}=2.3$ OR $1.15 \times 1.34^t=2.3$ OR $t=\frac{\ln 2}{0.292}$ OR using the model to find two specific times with values of $Q(t)$ which double
M1
$t=2.37$ (days)
A1
Note: Do not $\boldsymbol{F T}$ from a model which is not exponential. Award M0A0 for an answer of 2.13 which comes from using $(10,20)$ from the data or any other answer which finds a doubling time from figures given in the table.
[2 marks]
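For an exponential model $Q=A \mathrm{e}^{r t}$ the doubling time does not depend on $A$, since $A \mathrm{e}^{r(t+T)}=2 A \mathrm{e}^{r t}$ gives $T=\frac{\ln 2}{r}$. A quick check of part (c), using the truncated regression coefficients quoted in part (b)(ii) (the variable names are illustrative):

```python
import math

GROWTH_RATE = 0.292055  # coefficient of t in Q = 1.15 e^(0.292 t), truncated
BASE = 1.33917          # base in the equivalent form Q = 1.15 x 1.34^t, truncated

# Doubling time from the natural-exponential form.
doubling_time = math.log(2) / GROWTH_RATE

# Same quantity from the base form, since ln(base) equals the growth rate.
doubling_time_from_base = math.log(2) / math.log(BASE)

print(round(doubling_time, 2), round(doubling_time_from_base, 2))
```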
d. an attempt to calculate $\beta$ for city $\mathrm{X}$
(M1)
$$
\begin{aligned}
\beta & =\frac{0.292055 \ldots}{2.6 \times 10^6} \text { OR } \beta=\frac{\ln 1.33917 \ldots}{2.6 \times 10^6} \\
& =1.12328 \ldots \times 10^{-7}
\end{aligned}
$$
A1
this is larger than $9.64 \times 10^{-8}$ so the virus spreads more easily in city $X$
R1
Note: It is possible to award M1A0R1.
Condone “so the virus spreads faster in city X” for the final $\boldsymbol{R 1}$.
[3 marks]
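Comparing the general solution $Q=A \mathrm{e}^{\beta N t}$ with the fitted model shows that the coefficient of $t$ in the regression equals $\beta N$, so $\beta$ for city X is that coefficient divided by $N$. A minimal sketch of the comparison in part (d), with illustrative variable names and the truncated coefficient from part (b)(ii):

```python
GROWTH_RATE = 0.292055  # beta * N for city X, from the exponential regression
N_CITY_X = 2.6e6        # estimated number of computers in city X
BETA_CITY_Y = 9.64e-8   # value given for city Y

beta_city_x = GROWTH_RATE / N_CITY_X
spreads_more_easily_in_x = beta_city_x > BETA_CITY_Y

print(f"{beta_city_x:.5e}", spreads_more_easily_in_x)
```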
e. $a=38.3, b=3086.1$
A1A1
Note: Award A1A0 if values are correct but not to $1 \mathrm{dp}$.
[2 marks]
f.i. $\frac{Q^{\prime}}{Q}=0.42228-2.5561 \times 10^{-6} Q$
(A1)(A1)
Note: Award A1 for each coefficient seen – not necessarily in the equation. Do not penalize seeing in the context of $y$ and $x$.

identifying that the constant is $k$ OR that the gradient is $-\frac{k}{L}$
(M1)
therefore $k=0.422(0.422228 \ldots)$
A1
$$
\begin{aligned}
& \frac{k}{L}=2.5561 \times 10^{-6} \\
& L=165000(165205)
\end{aligned}
$$
A1
Note: Accept a value of $L$ of 164843 from use of 3 sf value of $k$, or any other value from plausible pre-rounding. Allow follow-through within the question part, from the equation of their line to the final two A1 marks.
[5 marks]
f.ii. recognizing that their $L$ is the eventual number of infected
(M1)
$$
\frac{165205 \ldots}{2600000}=6.35 \% \quad(6.35403 \ldots \%)
$$
A1

Note: Accept any final answer consistent with their answer to part (f)(i) unless their $L$ is less than 120146 in which case award at most M1A0.
[2 marks]
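Dividing the logistic equation by $Q$ gives $\frac{Q^{\prime}}{Q}=k-\frac{k}{L} Q$, so the intercept of the regression line is $k$ and the gradient is $-\frac{k}{L}$. The sketch below recovers $k$, $L$ and the long-term percentage from the rounded coefficients quoted in part (f)(i); the variable names are illustrative, and the regression itself is not repeated because the data table is not reproduced here.

```python
INTERCEPT = 0.42228    # constant term of the regression line of Q'/Q on Q
GRADIENT = -2.5561e-6  # gradient of the same line
N_CITY_X = 2.6e6       # estimated number of computers in city X

k = INTERCEPT
L = -k / GRADIENT  # gradient = -k / L, so L = -k / gradient

# As t grows, Q(t) = L / (1 + C e^(-kt)) tends to L.
percent_infected = 100 * L / N_CITY_X

print(round(k, 3), round(L), round(percent_infected, 2))
```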

Question

A random variable $X$ has a distribution with mean $\mu$ and variance 4. A random sample of size 100 is to be taken from the distribution of $X$.

Josie takes a different random sample of size 100 to test the null hypothesis that $\mu=60$ against the alternative hypothesis that $\mu>60$ at the $5 \%$ level.
a. State the central limit theorem as applied to a random sample of size $n$, taken from a distribution with mean $\mu$ and variance $\sigma^2$.
b. Jack takes a random sample of size 100 and calculates that $\bar{x}=60.2$. Find an approximate $90 \%$ confidence interval for $\mu$.
c.i. Find the critical region for Josie’s test, giving your answer correct to two decimal places.
c.ii. Write down the probability that Josie makes a Type I error.
c.iii. Given that the probability that Josie makes a Type II error is 0.25, find the value of $\mu$, giving your answer correct to three significant figures.

▶️Answer/Explanation

a. for $n$ (sufficiently) large the sample mean $\bar{X}$ approximately
A1
$\sim \mathrm{N}\left(\mu, \frac{\sigma^2}{n}\right)$
A1

Note: Award the first $\boldsymbol{A 1}$ for $n$ large and reference to the sample mean $(\bar{X})$, the second $\boldsymbol{A 1}$ is for normal and the two parameters.
Note: Award the second $\mathbf{A 1}$ only if the first $\mathbf{A 1}$ is awarded.
Note: Allow ‘$n$ tends to infinity’ or ‘$n \geq 30$’ in place of ‘large’.
[2 marks]
b. $[59.9,60.5]$
A1A1
Note: Accept answers which round to the correct 3sf answers.
[2 marks]
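The interval in part (b) follows from $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$ with $z$ the $95$th percentile of the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (the variable names are illustrative):

```python
import math
from statistics import NormalDist

SAMPLE_MEAN = 60.2
VARIANCE = 4
N = 100
CONFIDENCE = 0.90

standard_error = math.sqrt(VARIANCE / N)

# A two-sided 90 % interval leaves 5 % in each tail.
z = NormalDist().inv_cdf((1 + CONFIDENCE) / 2)

lower = SAMPLE_MEAN - z * standard_error
upper = SAMPLE_MEAN + z * standard_error

print(round(lower, 1), round(upper, 1))
```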
c.i. under $H_0, \bar{X} \sim \mathrm{N}\left(60, \frac{4}{100}\right)$
(A1)
required to find $k$ such that $P(\bar{X}>k)=0.05$
(M1)
use of any valid method, eg GDC Inv(Normal) or $k=60+z \frac{\sigma}{\sqrt{n}}$
(M1)
hence critical region is $\bar{x}>60.33$
A1
[4 marks]
c.ii. 0.05
A1
[1 mark]
c.iii. $\mathrm{P}(\text{Type II error})=\mathrm{P}\left(H_0 \text{ is accepted} \mid H_0 \text{ is false}\right)$
(R1)
Note: Accept Type II error means $H_0$ is accepted given $H_0$ is false.
$$
\begin{aligned}
& \Rightarrow \mathrm{P}(\bar{X}<60.33)=0.25 \text { when } \bar{X} \sim \mathrm{N}\left(\mu, \frac{4}{100}\right) \quad \text { (M1) } \\
& \Rightarrow \mathrm{P}\left(\frac{\bar{X}-\mu}{\frac{2}{10}}<\frac{60.33-\mu}{\frac{2}{10}}\right)=0.25 \quad \text { (M1) } \\
& \Rightarrow \mathrm{P}\left(Z<\frac{60.33-\mu}{\frac{2}{10}}\right)=0.25 \text { where } Z \sim \mathrm{N}\left(0,1^2\right) \\
& \frac{60.33-\mu}{\frac{2}{10}}=-0.6744 \ldots \quad \text { (A1) } \\
& \mu=60.33+\frac{2}{10} \times 0.6744 \ldots \\
& \mu=60.5 \quad \text { A1 }
\end{aligned}
$$
[5 marks]
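Parts (c)(i) and (c)(iii) can be checked with the same normal model for $\bar{X}$. The sketch below, with illustrative variable names, first finds the boundary of the critical region under $H_0$ and then solves $\mathrm{P}(\bar{X}<60.33)=0.25$ for the true mean. It follows the working above in using the boundary rounded to two decimal places.

```python
import math
from statistics import NormalDist

MU_0 = 60
standard_error = math.sqrt(4 / 100)  # sigma / sqrt(n) = 0.2
ALPHA = 0.05
BETA = 0.25  # given probability of a Type II error

# (c)(i): boundary k with P(Xbar > k | mu = 60) = 0.05.
critical_boundary = NormalDist(MU_0, standard_error).inv_cdf(1 - ALPHA)
k_rounded = round(critical_boundary, 2)

# (c)(iii): (k - mu) / se is the 25th percentile of Z, so mu = k - z * se.
z_beta = NormalDist().inv_cdf(BETA)  # negative, about -0.6745
mu_true = k_rounded - z_beta * standard_error

# Check: with this mean, the probability of not rejecting H0 is 0.25.
type_two_error = NormalDist(mu_true, standard_error).cdf(k_rounded)

print(k_rounded, round(mu_true, 1), round(type_two_error, 2))
```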

Question

A firm wishes to review its recruitment processes. This question considers the validity and reliability of the methods used.

Every year an accountancy firm recruits new employees for a trial period of one year from a large group of applicants.
At the start, all applicants are interviewed and given a rating. Those with a rating of either Excellent, Very good or Good are recruited for the trial period. At the end of this period, some of the new employees will stay with the firm.

It is decided to test how valid the interview rating is as a way of predicting which of the new employees will stay with the firm.

Data is collected and recorded in a contingency table.

The next year’s group of applicants are asked to complete a written assessment which is then analysed. From those recruited as new employees, a random sample of size 18 is selected.

The sample is stratified by department. Of the 91 new employees recruited that year, 55 were placed in the national department and 36 in the international department.

At the end of their first year, the level of performance of each of the 18 employees in the sample is assessed by their department manager. They are awarded a score between 1 (low performance) and 10 (high performance).

The marks in the written assessment and the scores given by the managers are shown in both the table and the scatter diagram.

The firm decides to find a Spearman’s rank correlation coefficient, $r_s$, for this data.

The same seven employees are given the written assessment a second time, at the end of the first year, to measure its reliability. Their marks are shown in the table below.

The written assessment is in five sections, numbered 1 to 5 . At the end of the year, the employees are also given a score for each of five professional attributes: V, W, X, Y and Z.

The firm decides to test the hypothesis that there is a correlation between the mark in a section and the score for an attribute.
They compare marks in each of the sections with scores for each of the attributes.
a. Use an appropriate test, at the $5 \%$ significance level, to determine whether a new employee staying with the firm is independent of their interview rating. State the null and alternative hypotheses, the $p$-value and the conclusion of the test.
b. Show that 11 employees are selected for the sample from the national department.
c.i. Without calculation, explain why it might not be appropriate to calculate a correlation coefficient for the whole sample of 18 employees.
c.ii. Find $r_s$ for the seven employees working in the international department.
c.iii. Hence comment on the validity of the written assessment as a measure of the level of performance of employees in this department. Justify your answer.
d.i. State the name of this type of test for reliability.
d.ii. For the data in this table, test the null hypothesis, $\mathrm{H}_0: \rho=0$, against the alternative hypothesis, $\mathrm{H}_1: \rho>0$, at the $5 \%$ significance level. You may assume that all the requirements for carrying out the test have been met.
d.iii. Hence comment on the reliability of the written assessment.
e.i. Write down the number of tests they carry out.
e.ii. The tests are performed at the $5 \%$ significance level.
Assuming that:
– there is no correlation between the marks in any of the sections and scores in any of the attributes,
– the outcome of each hypothesis test is independent of the outcome of the other hypothesis tests,
find the probability that at least one of the tests will be significant.
e.iii. The firm obtains a significant result when comparing section 2 of the written assessment and attribute X. Interpret this result.

▶️Answer/Explanation

a. Use of $\chi^2$ test for independence
(M1)
$\mathrm{H}_0$ : Staying (or leaving) the firm and interview rating are independent.
$\mathrm{H}_1$ : Staying (or leaving) the firm and interview rating are not independent
A1
Note: For $\mathrm{H}_1$ accept ‘…are dependent’ in place of ‘…not independent’.
$p$-value $=0.487(0.487221 \ldots)$
A2

Note: Award $\mathbf{A 1}$ for $\chi^2=1.438 \ldots$ if $p$-value is omitted or incorrect.
$0.487>0.05$
R1
(the result is not significant at the $5 \%$ level)
insufficient evidence to reject the $\mathrm{H}_0$ (or “accept $\mathrm{H}_0$”)
A1

Note: Do not award R0A1. The final R1A1 can follow through from their incorrect $p$-value.
[6 marks]
b. $\frac{55}{91} \times 18=10.9(10.8791 \ldots)$
M1A1
Note: Award $\mathbf{A 1}$ for anything that rounds to 10.9.
$\approx 11$
AG
[2 marks]
c.i. there seems to be a difference between the two departments
(A1)
the international department manager seems to be less generous than the national department manager
R1
Note: The $\boldsymbol{A 1}$ is for commenting there is a difference between the two departments and the $\boldsymbol{R} \mathbf{1}$ is for correctly commenting on the direction of the difference
[2 marks]

c.ii.

Note: Award (M1) for an attempt to rank the data, and (A1) for correct ranks for both variables. Accept either set of rankings in reverse.
$$
r_s=0.909(0.909241 \ldots)
$$
(M1)(A1)

Note: The (M1) is for calculating the PMCC for their ranks.
Note: If a final answer of 0.9107 is seen, from use of $1-\frac{6 \Sigma d^2}{n\left(n^2-1\right)}$, award (M1)(A1)A1.
Accept -0.909 if one set of ranks has been ordered in reverse.
[4 marks]
c.iii. EITHER
there is a (strong) association between the written assessment mark and the manager scores.
A1
OR
there is a (strong) agreement in the rank order of the written assessment marks and the rank order of the manager scores.
A1
OR
there is a (strong linear) correlation between the rank order of the written assessment marks and the rank order of the manager scores.
A1
Note: Follow through on a value for their value of $r_s$ in c(ii).

THEN
the written assessment is likely to be a valid measure (of the level of employee performance)
R1
[2 marks]
d.i. test-retest
A1
[1 mark]
d.ii. $p$-value $=0.00209(0.0020939 \ldots)$
A2
$0.00209<0.05$
R1
(the result is significant at the $5 \%$ level)
(there is sufficient evidence to) reject $\mathrm{H}_0$
A1
Note: Do not award R0A1. Accept “accept $\mathrm{H}_1$”. The final R1A1 can follow through from their incorrect $p$-value.

d.iii. the test seems reliable
A1
Note: Follow through from their answer in part (d)(ii). Do not award if there is no conclusion in d(ii).
[1 mark]
e.i. 25
A1
[1 mark]
e.ii. probability of significant result given no correlation is 0.05
(M1)
probability of at least one significant result in 25 tests is
$1-0.95^{25}$
(M1)(A1)
Note: Award (M1) for use of $1-\mathrm{P}(0)$ or the binomial distribution with any value of $p$.
$=0.723(0.722610 \ldots)$
A1
[4 marks]
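The calculation in part (e)(ii) is the complement of no test being significant. Under the two stated assumptions each of the $5 \times 5$ tests independently gives a false positive with probability 0.05. A short sketch with illustrative variable names:

```python
ALPHA = 0.05    # significance level of each individual test
SECTIONS = 5    # sections of the written assessment
ATTRIBUTES = 5  # professional attributes V, W, X, Y, Z

# Every section is compared with every attribute.
n_tests = SECTIONS * ATTRIBUTES

# P(at least one significant) = 1 - P(none significant).
p_at_least_one = 1 - (1 - ALPHA) ** n_tests

print(n_tests, round(p_at_least_one, 3))
```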
e.iii. (though the result is significant) it is very likely that one significant result would be achieved by chance, so it should be disregarded or further evidence sought
R1
[1 mark]

Question

Juliet is a sociologist who wants to investigate if income affects happiness amongst doctors. This question asks you to review Juliet’s methods and conclusions.

Juliet obtained a list of email addresses of doctors who work in her city. She contacted them and asked them to fill in an anonymous questionnaire. Participants were asked to state their annual income and to respond to a set of questions. The responses were used to determine a happiness score out of 100 . Of the 415 doctors on the list, 11 replied.

Juliet’s results are summarized in the following table.

For the remaining ten responses in the table, Juliet calculates the mean happiness score to be 52.5.

Juliet decides to carry out a hypothesis test on the correlation coefficient to investigate whether increased annual income is associated with greater happiness.

Juliet wants to create a model to predict how changing annual income might affect happiness scores. To do this, she assumes that annual income in dollars, $X$, is the independent variable and the happiness score, $Y$, is the dependent variable.

She first considers a linear model of the form
$$
Y=a X+b .
$$

Juliet then considers a quadratic model of the form
$$
Y=c X^2+d X+e .
$$

After presenting the results of her investigation, a colleague questions whether Juliet’s sample is representative of all doctors in the city.

A report states that the mean annual income of doctors in the city is $\$ 80000$. Juliet decides to carry out a test to determine whether her sample could realistically be taken from a population with a mean of $\$ 80000$.
a.i. Describe one way in which Juliet could improve the reliability of her investigation.
a.ii.Describe one criticism that can be made about the validity of Juliet’s investigation.
b. Juliet classifies response $\mathrm{K}$ as an outlier and removes it from the data. Suggest one possible justification for her decision to remove it.
c.i. Calculate the mean annual income for these remaining responses.
c.ii.Determine the value of $r$, Pearson’s product-moment correlation coefficient, for these remaining responses.
d.i. State why the hypothesis test should be one-tailed.
d.ii. State the null and alternative hypotheses for this test.
d.iii. The critical value for this test, at the $5 \%$ significance level, is 0.549. Juliet assumes that the population is bivariate normal.
Determine whether there is significant evidence of a positive correlation between annual income and happiness. Justify your answer.

e.i. Use Juliet’s data to find the value of $a$ and of $b$.
e.ii. Interpret, referring to income and happiness, what the value of $a$ represents.
e.iii. Find the value of $c$, of $d$ and of $e$.
e.iv. Find the coefficient of determination for each of the two models she considers.
e.v. Hence compare the two models.
e.vi. Juliet decides to use the coefficient of determination to choose between these two models.
Comment on the validity of her decision.
f.i. State the name of the test which Juliet should use.
f.ii. State the null and alternative hypotheses for this test.
f.iii. Perform the test, using a $5 \%$ significance level, and state your conclusion in context.

▶️Answer/Explanation

a.i. Any one from:
R1
increase sample size / increase response rate / repeat process
check whether sample is representative
test-retest participants or do a parallel test
use a stratified sample
use a random sample

Note: Do not condone:
Ask different types of doctor
Ask for proof of income
Ask for proof of being a doctor
Remove anonymity
Remove response K.
[1 mark]
a.ii. Any one from:
R1
non-random sampling means a subset of population might be responding
self-reported happiness is not the same as happiness
happiness is not a constant / cannot be quantified / is difficult to measure
income might include external sources
Juliet is only sampling doctors in her city
correlation does not imply causation
sample might be biased

Note: Do not condone the following common but vague responses unless they make a clear link to validity:
Sample size is too small
Result is not generalizable
There may be other variables Juliet is ignoring
Sample might not be representative
[1 mark]

b. because the income is very different / implausible / clearly contrived
R1
Note: Answers must explicitly reference “income” to get credit.
[1 mark]
c.i. (\$) 90200
(M1)A1
[2 marks]
c.ii. $r=0.558(0.557723 \ldots)$
A2
[2 marks]
d.i. EITHER
only looking for change in one direction
R1
OR
only looking for greater happiness with greater income
R1
OR
only looking for evidence of positive correlation
R1
[1 mark]
d.ii. $\mathrm{H}_0: \rho=0 ; \mathrm{H}_1: \rho>0$
A1A1
Note: Award $\mathbf{A 1}$ for $\rho$ seen (do not accept $r$ ), A1 for both correct hypotheses, using their $\rho$ or $r$. Accept an equivalent statement in words, however reference to “correlation for the population” or “association for the population” must be explicit for the first $\boldsymbol{A 1}$ to be awarded.
Watch out for a null hypothesis in words similar to “Annual income is not associated with greater happiness”. This is effectively saying $\rho \leq 0$ and should not be condoned.
[2 marks]

d.iii. METHOD 1 – using critical value of $r$
$0.558>0.549(0.557723 \ldots>0.549) \quad \boldsymbol{R 1}$
(therefore significant evidence of) a positive correlation
A1
Note: Do not award R0A1.

METHOD 2 – using $p$-value
$0.0469<0.05(0.0469463 \ldots<0.05)$
A1
Note: Follow through from their $r$-value from part (c)(ii).
(therefore significant evidence of) a positive correlation
A1
Note: Do not award A0A1.
[2 marks]
e.i. $a=0.000126(0.000125842 \ldots), \quad b=41.1(41.1490 \ldots)$
A1
[1 mark]
e.ii. EITHER
the amount the happiness score increases for every $\$ 1$ increase in (annual) income
A1
OR
rate of change of happiness with respect to (annual) income
A1
Note: Accept equivalent responses e.g. an increase of 1.26 in happiness for every $\$ 10000$ increase in salary.
[1 mark]

e.iii.
$$
\begin{aligned}
c & =-2.06 \times 10^{-9}\left(-2.06191 \ldots \times 10^{-9}\right) \\
d & =7.05 \times 10^{-4}\left(7.05272 \ldots \times 10^{-4}\right) \\
e & =12.6(12.5878 \ldots)
\end{aligned}
$$
A1
[1 mark]
e.iv. for quadratic model: $R^2=0.659(0.659145 \ldots)$
A1
for linear model: $R^2=0.311(0.311056 \ldots)$
A1
Note: Follow through from their $r$ value from part (c)(ii).
[2 marks]
e.v. EITHER
quadratic model is a better fit to the data / more accurate
A1
OR
quadratic model explains a higher proportion of the variance
A1
[1 mark]
e.vi. EITHER
not valid, $R^2$ not a useful measure to compare models with different numbers of parameters
A1
OR
not valid, quadratic model will always have a better fit than a linear model
A1
Note: Accept any other sensible critique of the validity of the method. Do not accept any answers which focus on the conclusion rather than the method of model selection.
[1 mark]
f.i. (single sample) $t$-test
A1
[1 mark]

f.ii. EITHER
$$
\mathrm{H}_0: \mu=80000 ; \mathrm{H}_1: \mu \neq 80000
$$
A1
OR
$\mathrm{H}_0$ : (sample is drawn from a population where) the population mean is $\$ 80000$
$\mathrm{H}_1$ : the population mean is not $\$ 80000$
A1
Note: Do not allow FT from an incorrect test in part (f)(i) other than a $z$-test.
[1 mark]
f.iii. $p=0.610(0.610322 \ldots)$
A1
Note: For a $z$-test follow through from part (f)(i), either 0.578 (from biased estimate of variance) or 0.598 (from unbiased estimate of variance).
$0.610>0.05$
R1
EITHER
no (significant) evidence that mean differs from $\$ 80000$
A1
OR
the sample could plausibly have been drawn from the quoted population
A1
Note: Allow R1FTA1FT from an incorrect $p$-value, but the final A1 must still be in the context of the original research question.
[3 marks]

 

Question

A shop sells carrots and broccoli. The weights of carrots can be modelled by a normal distribution with variance 25 grams ${ }^2$ and the weights of broccoli can be modelled by a normal distribution with variance 80 grams $^2$. The shopkeeper claims that the mean weight of carrots is 130 grams and the mean weight of broccoli is 400 grams.

Dong Wook decides to investigate the shopkeeper’s claim that the mean weight of carrots is 130 grams. He plans to take a random sample of $n$ carrots in order to calculate a $98 \%$ confidence interval for the population mean weight.

Anjali thinks the mean weight, $\mu$ grams, of the broccoli is less than 400 grams. She decides to perform a hypothesis test, using a random sample of size 8. Her hypotheses are
$$
H_0: \mu=400 ; H_1: \mu<400 \text {. }
$$

She decides to reject $H_0$ if the sample mean is less than 395 grams.
a. Assuming that the shopkeeper’s claim is correct, find the probability that the weight of six randomly chosen carrots is more than two times the weight of one randomly chosen broccoli.
b. Find the least value of $n$ required to ensure that the width of the confidence interval is less than 2 grams.
c. Find the significance level for this test.
d. Given that the weights of the broccoli actually follow a normal distribution with mean 392 grams and variance 80 grams$^2$, find the probability of Anjali making a Type II error. [3]

▶️Answer/Explanation

a.

Let $X=\sum_{i=1}^6 C_i-2 B \quad$ M1
$$
\begin{aligned}
& \mathrm{E}(X)=6 \times 130-2 \times 400=-20 \quad \text { (M1)(A1) } \\
& \operatorname{Var}(X)=6 \times 25+4 \times 80=470 \quad \text { (M1)(A1) } \\
& \mathrm{P}(X>0)=0.178 \quad \text { A1 }
\end{aligned}
$$
Note: Condone the notation $6 C-2 B$ only if the (M1) is awarded for the variance.
[6 marks]
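The calculation in part (a) can be checked numerically. The following is a minimal sketch, assuming all seven weights are independent; the variable names are illustrative and not part of the markscheme.

```python
from math import sqrt
from statistics import NormalDist

# X = C1 + ... + C6 - 2B, with every weight assumed independent
mean_x = 6 * 130 - 2 * 400        # E(X) = -20
var_x = 6 * 25 + (2 ** 2) * 80    # Var(X) = 470; the coefficient of B is squared

x_dist = NormalDist(mu=mean_x, sigma=sqrt(var_x))
p_positive = 1 - x_dist.cdf(0)    # P(X > 0)
print(round(p_positive, 3))
```

The six carrots contribute $6 \times 25$ to the variance rather than $36 \times 25$ because they are six separate carrots, not one carrot weighed six times.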
b. $z=2.326 \ldots$
(A1)
$$
\begin{aligned}
& \frac{2 z \sigma}{\sqrt{n}}<2 \quad \text { M1 } \\
& \sqrt{n}>11.6 \ldots \\
& n>135.2 \ldots \\
& n=136 \quad \text { A1 }
\end{aligned}
$$

Note: Condone the use of equal signs.
[3 marks]
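As a rough check of part (b), the inequality $\frac{2 z \sigma}{\sqrt{n}}<2$ can be solved for $n$ directly. This sketch assumes $\sigma=5$ (from the carrot variance of 25) and uses hypothetical variable names.

```python
from math import floor
from statistics import NormalDist

sigma = 5.0          # square root of the carrot variance, 25 grams^2
confidence = 0.98
max_width = 2.0

# two-sided critical value for a 98% interval
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# width = 2 z sigma / sqrt(n) < max_width  <=>  n > (2 z sigma / max_width)^2
bound = (2 * z * sigma / max_width) ** 2
least_n = floor(bound) + 1   # smallest integer strictly above the bound
print(round(z, 3), round(bound, 1), least_n)
```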
c. variance $=\frac{80}{8}=10$
under $H_0, \bar{B} \sim \mathrm{N}(400,10)$
significance level $=\mathrm{P}(\bar{B}<395)$
(M1)
$=0.0569$ or $5.69 \%$
A1
Note: Accept any answer that rounds to 0.057 or $5.7 \%$.
[3 marks]
d.
$$
\begin{aligned}
\text { Type II error probability } & =\mathrm{P}\left(\text { Accept } H_0 \mid H_1 \text { true }\right) \quad \text { (M1) } \\
& =\mathrm{P}(\bar{B}>395 \mid \bar{B} \sim \mathrm{N}(392,10)) \quad \text { (A1) } \\
& =0.171 \quad \text { A1 }
\end{aligned}
$$
Note: Accept any answer that rounds to 0.17 .
[3 marks]
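Parts (c) and (d) use the same rejection rule with two different true means. A minimal sketch of both probabilities follows; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, variance, cutoff = 8, 80, 395
se = sqrt(variance / n)    # standard deviation of the sample mean, sqrt(10)

# Type I error: H0 (mu = 400) is true but the sample mean falls below 395
alpha = NormalDist(400, se).cdf(cutoff)

# Type II error: mu is really 392 but the sample mean is 395 or more
beta = 1 - NormalDist(392, se).cdf(cutoff)

print(round(alpha, 4), round(beta, 3))
```

The significance level is fixed by the rejection rule alone, whereas the Type II error probability can only be found once a specific alternative mean is given.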

 

Question

Peter, the Principal of a college, believes that there is an association between the score in a Mathematics test, $X$, and the time taken to run $500 \mathrm{~m}, Y$ seconds, of his students. The following paired data are collected.

It can be assumed that $(X, Y)$ follow a bivariate normal distribution with product moment correlation coefficient $\rho$.
a.i. State suitable hypotheses $H_0$ and $H_1$ to test Peter’s claim, using a two-tailed test.
a.ii. Carry out a suitable test at the $5 \%$ significance level. With reference to the $p$-value, state your conclusion in the context of Peter’s claim.
b. Peter uses the regression line of $y$ on $x$ as $y=0.248 x+83.0$ and calculates that a student with a Mathematics test score of 73 will have a running time of 101 seconds. Comment on the validity of his calculation.

▶️Answer/Explanation

a.i. $H_0: \rho=0 \quad H_1: \rho \neq 0 \quad \boldsymbol{A 1}$
Note: It must be $\rho$.
[1 mark]
a.ii. $p=0.649$
A2
Note: Accept anything that rounds to 0.65
$0.649>0.05$
R1
hence, we accept $H_0$ and conclude that Peter’s claim is wrong
A1
Note: The $\boldsymbol{A}$ mark depends on the $\boldsymbol{R}$ mark and the answer must be given in context. Follow through the $p$-value in part (b).
[4 marks]
b. a statement along the lines of ‘(we have accepted that) the two variables are independent’ or ‘the two variables are weakly correlated’
R1
a statement along the lines of ‘the use of the regression line is invalid’ or ‘it would give an inaccurate result’
R1
Note: Award the second $\boldsymbol{R} \mathbf{1}$ only if the first $\boldsymbol{R} \mathbf{1}$ is awarded.
Note: FT the conclusion in (a)(ii). If a candidate concludes that the claim is correct, mark as follows: (as we have accepted $\mathrm{H}_1$) the two variables are dependent and 73 lies in the range of $x$ values $\boldsymbol{R 1}$, hence the use of the regression line is valid $\boldsymbol{R 1}$.
[2 marks]

Question

In a large population of hens, the weight of a hen is normally distributed with mean $\mu \mathrm{kg}$ and standard deviation $\sigma \mathrm{kg}$. A random sample of 100 hens is taken from the population.

The mean weight for the sample is denoted by $\bar{X}$.

The sample values are summarized by $\sum x=199.8$ and $\sum x^2=407.8$ where $x \mathrm{~kg}$ is the weight of a hen.

It is found that $\sigma=0.27$. It is decided to test, at the $1 \%$ level of significance, the null hypothesis $\mu=1.95$ against the alternative hypothesis $\mu>1.95$
a. State the distribution of $\bar{X}$ giving its mean and variance.
b. Find an unbiased estimate for $\mu$.
c. Find an unbiased estimate for $\sigma^2$.
d. Find a $90 \%$ confidence interval for $\mu$.
e.i. Find the $p$-value for the test.
e.ii. Write down the conclusion reached.

▶️Answer/Explanation

a.

$$
\bar{X} \sim N\left(\mu, \frac{\sigma^2}{100}\right) \quad \boldsymbol{A 1}
$$

Note: Accept $n$ in place of 100 .
[1 mark]
b. $\hat{\mu}=\frac{\sum x}{n}=\frac{199.8}{100}=1.998$
A1
Note: Accept 2.00, 2.0 and 2.
[1 mark]
c.
$$
\begin{aligned}
& s_{n-1}^2=\frac{n}{n-1}\left(\frac{\sum x^2}{n}-\bar{x}^2\right)=\frac{100}{99}\left(\frac{407.8}{100}-1.998^2\right) \\
& =0.086864
\end{aligned}
$$
(M1)
unbiased estimate for $\sigma^2$ is 0.0869
A1
Note: Accept any answer which rounds to 0.087 .
[2 marks]
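The unbiased estimates in parts (b) and (c) follow from the summary statistics alone. A short sketch, with illustrative variable names:

```python
n = 100
sum_x, sum_x2 = 199.8, 407.8

mean_hat = sum_x / n                                   # unbiased estimate of mu
# scale the biased (divide-by-n) variance by n / (n - 1)
var_hat = n / (n - 1) * (sum_x2 / n - mean_hat ** 2)   # unbiased estimate of sigma^2

print(round(mean_hat, 3), round(var_hat, 4))
```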
d. $90 \%$ confidence interval is $1.998 \pm 1.660 \sqrt{\frac{0.0869}{100}} \quad$ (M1) $=(1.95,2.05) \quad$ A1A1
Note: $\boldsymbol{F T}$ their $\sigma$ from (c).
Note: Condone the use of the $z$-value 1.645 since $n$ is large.
Note: Accept any values that round to 1.95 and 2.05 .
[3 marks]
e.i. $p$-value is 0.0377
A2
Note: Award A1 for the 2-tail value 0.0754 .
Note: Award A2 for 0.0377 and $\mathbf{A 1}$ for any other value that rounds to 0.038 .
Note: $\boldsymbol{F T}$ their estimated mean from (b), note that 2 gives $p=0.032(0)$.
[2 marks]
e.ii. accept the null hypothesis
A1
Note: $\boldsymbol{F T}$ their $p$-value.
[1 mark]
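The $p$-value in part (e)(i) can be reproduced with a one-tailed $z$-test, since the question states $\sigma=0.27$. This is a sketch only; the helper name `upper_tail_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p(sample_mean, mu0=1.95, sigma=0.27, n=100):
    """One-tailed p-value P(sample mean >= observed) under H0: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

p_value = upper_tail_p(1.998)         # using the estimate from part (b)
p_rounded_mean = upper_tail_p(2.0)    # follow-through case mentioned in the note
reject_at_1_percent = p_value < 0.01

print(round(p_value, 4), round(p_rounded_mean, 3), reject_at_1_percent)
```

Because the $p$-value exceeds 0.01, the null hypothesis is not rejected at the $1 \%$ level, which matches part (e)(ii).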

Question

The random variables $U, V$ follow a bivariate normal distribution with product moment correlation coefficient $\rho$.

A random sample of 12 observations on $U, V$ is obtained to determine whether there is a correlation between $U$ and $V$. The sample product moment correlation coefficient is denoted by $r$. A test to determine whether or not $U, V$ are independent is carried out at the $1 \%$ level of significance.
a. State suitable hypotheses to investigate whether or not $U, V$ are independent.
b. Find the least value of $|r|$ for which the test concludes that $\rho \neq 0$.

▶️Answer/Explanation

a. $\mathrm{H}_0: \rho=0 ; \mathrm{H}_1: \rho \neq 0 \quad$ A1A1
[2 marks]
b. $\nu=10$
(A1)
$t_{0.005}=3.16927 \ldots \quad$ (M1)(A1)
we reject $\mathrm{H}_0: \rho=0$ if $|t|>3.16927 \ldots \quad$ (R1)
attempting to solve $|r| \sqrt{\frac{10}{1-r^2}}>3.16927 \ldots$ for $|r| \quad$ M1
Note: Allow $=$ instead of $>$.
(least value of $|r|$ is) 0.708 (3 sf) $\quad \boldsymbol{A 1}$
Note: Award A1M1A0R1M1A0 to candidates who use a one-tailed test. Award A0M1A0R1M1A0 to candidates who use an incorrect number of degrees of freedom or both a one-tailed test and incorrect degrees of freedom.
Note: Possible errors are
10 DF 1-tail, $t=2.763 \ldots$, least value $=0.658$
11 DF 2-tail, $t=3.105 \ldots$, least value $=0.684$
11 DF 1-tail, $t=2.718 \ldots$, least value $=0.634$.
[6 marks]
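Rearranging $|r| \sqrt{\frac{\nu}{1-r^2}}=t$ gives $|r|=\frac{t}{\sqrt{\nu+t^2}}$. The sketch below applies this with the critical value quoted in the markscheme; the function name is hypothetical, and the critical value is taken from the text rather than computed, because the standard library has no $t$-distribution.

```python
from math import sqrt

def least_abs_r(t_crit, dof):
    """Solve |r| * sqrt(dof / (1 - r^2)) = t_crit for |r|."""
    return t_crit / sqrt(dof + t_crit ** 2)

r_min = least_abs_r(3.16927, 10)   # two-tailed 1% test, 10 degrees of freedom

# substitute back into the original expression as a consistency check
t_back = r_min * sqrt(10 / (1 - r_min ** 2))
print(round(r_min, 3), round(t_back, 5))
```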

Question

Mr Sailor owns a fish farm and he claims that the weights of the fish in one of his lakes have a mean of 550 grams and standard deviation of 8 grams.

Assume that the weights of the fish are normally distributed and that Mr Sailor’s claim is true.

Kathy is suspicious of Mr Sailor’s claim about the mean and standard deviation of the weights of the fish. She collects a random sample of fish from this lake whose weights are shown in the following table.

Using these data, test at the $5 \%$ significance level the null hypothesis $H_0: \mu=550$ against the alternative hypothesis $H_1: \mu<550$, where $\mu$ grams is the population mean weight.

Kathy decides to use the same fish sample to test at the $5 \%$ significance level whether or not there is a positive association between the weights and the lengths of the fish in the lake. The following table shows the lengths of the fish in the sample. The lengths of the fish can be assumed to be normally distributed.

a.i. Find the probability that a fish from this lake will have a weight of more than 560 grams.
a.ii. The maximum weight a hand net can hold is $6 \mathrm{~kg}$. Find the probability that a catch of 11 fish can be carried in the hand net.
b.i. State the distribution of your test statistic, including the parameter.
b.ii. Find the $p$-value for the test.
b.iii. State the conclusion of the test, justifying your answer.
c.i. State suitable hypotheses for the test.
c.ii. Find the product-moment correlation coefficient $r$.
c.iii. State the $p$-value and interpret it in this context.
d. Use an appropriate regression line to estimate the weight of a fish with length $360 \mathrm{~mm}$.

▶️Answer/Explanation

a.i. Note: Accept all answers that round to the correct 2 sf answer in (a), (b) and (c) but not in (d).
$$
\begin{aligned}
& X \sim N\left(550,8^2\right) \quad \text { (M1) } \\
& \mathrm{P}(X>560)=0.10564 \ldots=0.106 \quad \text { A1 }
\end{aligned}
$$
[2 marks]
a.ii.
$$
\begin{aligned}
& X_i \sim N\left(550,8^2\right), i=1, \ldots, 11 \\
& \text { let } Y=\sum_{i=1}^{11} X_i \\
& \mathrm{E}(\mathrm{Y})=11 \times 550(6050) \quad \text { A1 } \\
& \operatorname{Var}(\mathrm{Y})=11 \times 8^2 \quad(704) \quad \text { (M1)A1 } \\
& \mathrm{P}(Y \leqslant 6000)=0.02975 \ldots=0.0298
\end{aligned}
$$
A1
[4 marks]
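Both probabilities in part (a) can be verified with the normal distribution. A minimal sketch, assuming the 11 fish weights are independent; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

fish = NormalDist(mu=550, sigma=8)
p_heavy = 1 - fish.cdf(560)             # (a)(i): P(X > 560)

n_fish = 11
# the sum of 11 independent weights has mean 11 * 550 and variance 11 * 8^2
total = NormalDist(mu=n_fish * 550, sigma=sqrt(n_fish) * 8)
p_net_holds = total.cdf(6000)           # (a)(ii): P(Y <= 6000 grams)

print(round(p_heavy, 3), round(p_net_holds, 4))
```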
b.i.
$t$ distribution with 7 degrees of freedom
A1A1
[2 marks]
b.ii.
$$
p=0.25779 \ldots=0.258
$$
A2
[2 marks]

b.iii.
$p>0.05$
R1
therefore we conclude that there is no evidence to reject $H_0$
A1
Note: $F T$ their $p$-value.
Note: Only award $\mathbf{A 1}$ if $\boldsymbol{R 1}$ awarded.
[2 marks]
c.i.
$$
H_0: \rho=0, H_1: \rho>0
$$
A1
Note: Do not accept $r$ in place of $\rho$.
[1 mark]
c.ii.
$$
r=0.782 \quad \text { A2 }
$$
[2 marks]
c.iii.
$$
\begin{aligned}
& 0.01095 \ldots=0.0110 \\
& \text { since } 0.0110<0.05
\end{aligned}
$$
A1
R1
there is positive association between weight and length
A1
Note: $F T$ their p-value.
Note: Only award $A 1$ if $R \mathbf{1}$ awarded.
Note: Conclusion must be in context.
[3 marks]

d.
regression line of $y$ (weight) on $x$ (length) is (M1)
$$
\begin{aligned}
& y=0.8267 \ldots x+255.96 \ldots \\
& x=360 \text { gives } y=554
\end{aligned}
$$
A1
Note: Award M1A0A0 for the wrong regression line, that is $y=0.7393 \ldots x-51.62 \ldots$
[3 marks]

Question

This question explores methods to analyse the scores in an exam.

A random sample of 149 scores for a university exam are given in the table.

The university wants to know if the scores follow a normal distribution, with the mean and variance found in part (a).

The expected frequencies are given in the table.

The university assigns a pass grade to students whose scores are in the top $80 \%$.

The university also wants to know if the exam is gender neutral. They obtain random samples of scores for male and female students. The mean, sample variance and sample size are shown in the table.

The university awards a distinction to students who achieve high scores in the exam. Typically, $15 \%$ of students achieve a distinction. A new exam is trialed with a random selection of students on the course. 5 out of 20 students achieve a distinction.

A different exam is trialed with 16 students. Let $p$ be the percentage of students achieving a distinction. It is desired to test the hypotheses
$$
H_0: p=0.15 \text { against } H_1: p>0.15
$$

It is decided to reject the null hypothesis if the number of students achieving a distinction is greater than 3.
a.i. Find an unbiased estimate for the population mean.
a.ii. Find an unbiased estimate for the population variance.
b. Show that the expected frequency for $20<x \leq 40$ is 31.5 correct to 1 decimal place.
c. Perform a suitable test, at the $5 \%$ significance level, to determine if the scores follow a normal distribution, with the mean and variance found in part (a). You should clearly state your hypotheses, the degrees of freedom, the $p$-value and your conclusion.
d. Use the normal distribution model to find the score required to pass.
e. Perform a suitable test, at the $5 \%$ significance level, to determine if there is a difference between the mean scores of males and females. You should clearly state your hypotheses, the $p$-value and your conclusion.
f. Perform a suitable test, at the $5 \%$ significance level, to determine if it is easier to achieve a distinction on the new exam. You should clearly state your hypotheses, the critical region and your conclusion.
g.i. Find the probability of making a Type I error.
g.ii. Given that $p=0.2$, find the probability of making a Type II error.

▶️Answer/Explanation

a.i. 52.8
A1
[1 mark]
a.ii. $s_{n-1}^2=23.7^2=562 \quad$ M1A1
[2 marks]
b. $P(20<x \leqslant 40)=0.211$ M1A1
$0.211 \times 149 \quad$ M1
$=31.5 \quad A G$
[3 marks]
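The expected frequency in part (b) can be reproduced as follows. This sketch uses the rounded estimates 52.8 and 23.7 from part (a), so the result agrees with the markscheme only to the stated accuracy.

```python
from statistics import NormalDist

# rounded estimates from part (a); the unrounded values are not given in the text
scores = NormalDist(mu=52.8, sigma=23.7)

p_band = scores.cdf(40) - scores.cdf(20)   # P(20 < x <= 40)
expected = 149 * p_band                    # expected frequency out of 149 scores

print(round(p_band, 3), round(expected, 1))
```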
c. use of a $\chi^2$ goodness of fit test $\mathbf{M 1}$
$H_0: x \sim N(52.8,562)$ and $H_1: x \nsim N(52.8,562) \quad$ A1A1
$v=5-1-2=2 \quad \boldsymbol{A 1}$
$p$-value $=0.569 \quad$ A2
Since $0.569>0.05 \quad \boldsymbol{R 1}$
Insufficient evidence to reject $H_0$. The scores follow a normal distribution.
A1
[8 marks]
d. $\Phi^{-1}(0.2)=32.8 \quad$ M1A1
[2 marks]
e. use of a $t$-test $\boldsymbol{M 1}$
$H_0: \mu_m=\mu_f$ and $H_1: \mu_m \neq \mu_f \quad \boldsymbol{A 1}$
p-value $=0.180 \quad$ A2
Since $0.180>0.05$
R1
Insufficient evidence to reject $H_0$. There is no difference between males and females.
A1
[6 marks]

f. use of test for proportion using Binomial distribution M1
$$
\begin{aligned}
& H_0: p=0.15 \text { and } H_1: p>0.15 \quad \text { A1 } \\
& P(X \geqslant 6)=0.0673 \text { and } P(X \geqslant 7)=0.0219 \quad \text { M1 }
\end{aligned}
$$

So the critical region is $X \geqslant 7 \quad$ A1
Since $5<7 \quad \boldsymbol{R 1}$
Insufficient evidence to reject $H_0$. It is not easier to achieve a distinction on the new exam.
A1
[6 marks]
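The critical region in part (f) is the smallest upper tail of $B(20,0.15)$ whose probability does not exceed $5 \%$. A minimal sketch of that search, with a hypothetical helper `binom_upper_tail`:

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, p0, alpha = 20, 0.15, 0.05

# smallest k with P(X >= k) <= alpha defines the critical region X >= k
critical_k = next(k for k in range(n + 1) if binom_upper_tail(k, n, p0) <= alpha)

observed = 5
reject_h0 = observed >= critical_k
print(critical_k, reject_h0)
```

Because the binomial distribution is discrete, the actual size of the test is $P(X \geqslant 7)$, which is smaller than the nominal $5 \%$.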
g.i. using $H_0, X \sim B(16,0.15) \quad M 1$
$$
P(X>3)=0.210 \quad \text { M1A1 }
$$
[3 marks]
g.ii. using $H_1, X \sim B(16,0.2) \quad M 1$
$$
P(X \leqslant 3)=0.598 \quad \text { M1A1 }
$$
[3 marks]
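The two error probabilities in part (g) come from the same rejection rule, "reject if more than 3 distinctions", evaluated under different values of $p$. A short sketch with a hypothetical helper `binom_cdf`:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, threshold = 16, 3    # reject H0 when X > 3

alpha = 1 - binom_cdf(threshold, n, 0.15)   # Type I: reject although p = 0.15
beta = binom_cdf(threshold, n, 0.20)        # Type II: fail to reject although p = 0.2

print(round(alpha, 3), round(beta, 3))
```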

 

Question

A farmer sells bags of potatoes which he states have a mean weight of $7 \mathrm{~kg}$. An inspector, however, claims that the mean weight is less than $7 \mathrm{~kg}$. In order to test this claim, the inspector takes a random sample of 12 of these bags and determines the weight, $x \mathrm{~kg}$, of each bag. He finds that
$$
\sum x=83.64 ; \sum x^2=583.05 \text {. }
$$

You may assume that the weights of the bags of potatoes can be modelled by the normal distribution $\mathrm{N}\left(\mu, \sigma^2\right)$.
a. State suitable hypotheses to test the inspector’s claim.
b. Find unbiased estimates of $\mu$ and $\sigma^2$.
c.i. Carry out an appropriate test and state the $p$-value obtained.
c.ii. Using a 10\% significance level and justifying your answer, state your conclusion in context.

▶️Answer/Explanation

a. $H_0: \mu=7, H_1: \mu<7 \quad$ A1
[1 mark]
b. $\bar{x}=\frac{83.64}{12}=6.97 \quad \boldsymbol{A 1}$
$s_{n-1}^2=\frac{583.05}{11}-\frac{83.64^2}{132}=0.0072 \quad$ (M1)A1
[3 marks]
c.i. $t=\frac{6.97-7}{\sqrt{\frac{0.0072}{12}}}=-1.22(474 \ldots) \quad($ M1)(A1)
degrees of freedom $=11$
(A1)
$p$-value $=0.123 \quad$ A1
Note: Accept any answer that rounds correctly to 0.12 .
[4 marks]
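The estimates in part (b) and the test statistic in part (c)(i) can be computed from the two sums. This is a sketch with illustrative names; the $p$-value itself needs a $t$-distribution CDF (a calculator or a statistics library), which is outside the standard library and is not attempted here.

```python
from math import sqrt

n = 12
sum_x, sum_x2 = 83.64, 583.05

x_bar = sum_x / n                            # unbiased estimate of mu
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)     # unbiased estimate of sigma^2

t_stat = (x_bar - 7) / sqrt(s2 / n)          # test statistic under H0: mu = 7
dof = n - 1

print(round(x_bar, 2), round(s2, 4), round(t_stat, 3), dof)
```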
c.ii. because $p>0.1 \quad \boldsymbol{R} 1$
the inspector’s claim is not supported (at the 10\% level)
(or equivalent in context) A1
Note: Only award the $\boldsymbol{A 1}$ if the $\boldsymbol{R} 1$ has been awarded
[2 marks]

Question

John rings a church bell 120 times. The time interval, $T_i$, between two successive rings is a random variable with mean of 2 seconds and variance of $\frac{1}{9}$ seconds $^2$.

Each time interval, $T_i$, is independent of the other time intervals. Let $X=\sum_{i=1}^{119} T_i$ be the total time between the first ring and the last ring.

The church vicar subsequently becomes suspicious that John has stopped coming to ring the bell and that he is letting his friend Ray do it. When Ray rings the bell the time interval, $T_i$, has a mean of 2 seconds and variance of $\frac{1}{25}$ seconds $^2$.

The church vicar makes the following hypotheses:
$H_0$ : Ray is ringing the bell; $H_1$ : John is ringing the bell.
He records four values of $X$. He decides on the following decision rule:
If $236 \leqslant X \leqslant 240$ for all four values of $X$ he accepts $H_0$, otherwise he accepts $H_1$.
a. Find
(i) $\mathrm{E}(X)$;
(ii) $\operatorname{Var}(X)$.
b. Explain why a normal distribution can be used to give an approximate model for $X$.
c. Use this model to find the values of $A$ and $B$ such that $\mathrm{P}(A<X<B)=0.9$, where $A$ and $B$ are symmetrical about the mean of $X$.
d. Calculate the probability that he makes a Type II error.

▶️Answer/Explanation

a.

(i) mean $=119 \times 2=238$
A1
(ii) $\quad$ variance $=119 \times \frac{1}{9}=\frac{119}{9}(=13.2)$
(M1)A1

Note: If 120 is used instead of 119 award A0(M1)A0 for part (a) and apply follow through for parts (b)-(d). (b) is unaffected and in (c) the interval becomes $(234,246)$. In (d) the first two $\boldsymbol{A 1}$ marks are for $0.3633 \ldots$ and $0.0174 \ldots$ so the final answer will round to 0.017 .
[3 marks]
b. justified by the Central Limit Theorem $\quad \boldsymbol{R 1}$
since $n$ is large
A1
Note: Accept $n>30$.
[2 marks]
c. $X \sim N\left(238, \frac{119}{9}\right)$
$Z=\frac{X-238}{\frac{\sqrt{119}}{3}} \sim N(0,1) \quad$ (M1)(A1)
$\mathrm{P}(Z<q)=0.95 \Rightarrow q=1.644 \ldots \quad$ (A1) so $\mathrm{P}(-1.644 \ldots<Z<1.644 \ldots)=0.9$
$\mathrm{P}\left(-1.644 \ldots<\frac{X-238}{\frac{\sqrt{119}}{3}}<1.644 \ldots\right)=0.9 \quad$ (M1)
interval is $232<X<244(3 \mathrm{sf})(A=232, B=244)$
A1A1
Notes: Accept the use of inverse normal applied to the distribution of $X$.
Alternative is to use the GDC to find a pretend $Z$ confidence interval for a mean and then convert by multiplying by 119 .
Either $A$ or $B$ correct implies the five implied marks.
Accept any numbers that round to these 3 sf numbers.
[7 marks]
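No worked answer to part (d) appears in the text above, so the following is only a sketch of how the decision rule could be evaluated, not the markscheme's solution. It assumes the four recorded values of $X$ are independent and that a Type II error means accepting $H_0$ (all four values in $[236,240]$) while John is in fact ringing. The function name is hypothetical. Running it with 120 intervals reproduces the follow-through values $0.3633 \ldots$ and $0.0174 \ldots$ quoted in the note to part (a).

```python
from math import sqrt
from statistics import NormalDist

def type_ii_probability(intervals, var_john=1 / 9, lo=236, hi=240, readings=4):
    """Return (P(one X in [lo, hi]), P(all readings in [lo, hi])) when John rings."""
    x = NormalDist(mu=2 * intervals, sigma=sqrt(intervals * var_john))
    p_single = x.cdf(hi) - x.cdf(lo)
    return p_single, p_single ** readings

p_single_119, beta_119 = type_ii_probability(119)   # correct number of intervals
p_single_120, beta_120 = type_ii_probability(120)   # follow-through case in the note

print(round(beta_119, 4), round(p_single_120, 4), round(beta_120, 4))
```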

Question

A smartphone’s battery life is defined as the number of hours a fully charged battery can be used before the smartphone stops working. A company claims that the battery life of a model of smartphone is, on average, 9.5 hours. To test this claim, an experiment is conducted on a random sample of 20 smartphones of this model. For each smartphone, the battery life, $b$ hours, is measured and the sample mean, $\bar{b}$, calculated. It can be assumed the battery lives are normally distributed with standard deviation 0.4 hours.

It is then found that this model of smartphone has an average battery life of 9.8 hours.
a. State suitable hypotheses for a two-tailed test.
$[1]$
b. Find the critical region for testing $\bar{b}$ at the $5 \%$ significance level.
c. Find the probability of making a Type II error.
d. Another model of smartphone whose battery life may be assumed to be normally distributed with mean $\mu$ hours and standard deviation 1.2 hours is tested. A researcher measures the battery life of six of these smartphones and calculates a confidence interval of [10.2, 11.4] for $\mu$.
Calculate the confidence level of this interval.

▶️Answer/Explanation

a. Note: In question 3, accept answers that round correctly to 2 significant figures.
$\mathrm{H}_0: \mu=9.5 ; \mathrm{H}_1: \mu \neq 9.5 \quad$ A1
[1 mark]
b. the critical values are $9.5 \pm 1.95996 \ldots \times \frac{0.4}{\sqrt{20}} \quad$ (M1)(A1)
i.e. $9.3247 \ldots, 9.6753 \ldots$
the critical region is $\bar{b}<9.32, \bar{b}>9.68 \quad$ A1A1
Note: Award A1 for correct inequalities, A1 for correct values.
Note: Award $M 0$ if $t$-distribution used, note that $t(19)_{97.5}=2.093 \ldots$
[4 marks]
c.
$$
\begin{aligned}
& \bar{B} \sim \mathrm{N}\left(9.8,\left(\frac{0.4}{\sqrt{20}}\right)^2\right) \quad \text { (A1) } \\
& \mathrm{P}(9.3247 \ldots<\bar{B}<9.6753 \ldots) \quad \text { (M1) } \\
& =0.0816 \quad \text { A1 }
\end{aligned}
$$
Note: FT the critical values from (b). Note that critical values of 9.32 and 9.68 give 0.0899 .
[3 marks]
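Parts (b) and (c) can be checked together: the critical values come from the distribution under $H_0$, and the Type II error is the probability of landing between them when the true mean is 9.8. A minimal sketch with illustrative names:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 9.5, 0.4, 20, 0.05
se = sigma / sqrt(n)                       # standard deviation of the sample mean

z = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical z value
lower, upper = mu0 - z * se, mu0 + z * se  # reject H0 outside (lower, upper)

# Type II error: sample mean falls in the acceptance region although mu = 9.8
true_mean = NormalDist(9.8, se)
beta = true_mean.cdf(upper) - true_mean.cdf(lower)

print(round(lower, 4), round(upper, 4), round(beta, 4))
```

Using the unrounded critical values matters here: as the note observes, the rounded values 9.32 and 9.68 give a noticeably different answer.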
d.
METHOD 1
$$
\begin{aligned}
& X \sim \mathrm{N}\left(10.8, \frac{1.2^2}{6}\right) \quad \text { (M1)(A1) } \\
& \mathrm{P}(10.2<X<11.4)=0.7793 \ldots
\end{aligned}
$$
confidence level is $77.9 \%$
A1
Note: Accept $78 \%$.
METHOD 2
$11.4-10.2=2 z \times \frac{1.2}{\sqrt{6}} \quad$ (M1)
$z=1.224 \ldots \quad$ (A1)
$\mathrm{P}(-1.224 \ldots<Z<1.224 \ldots)=0.7793 \ldots$
confidence level is $77.9 \%$
A1
Note: Accept 78\%.

Question

Anne is a farmer who grows and sells pumpkins. Interested in the weights of pumpkins produced, she records the weights of eight pumpkins and obtains the following results in kilograms.
$\begin{array}{llllllll}7.7 & 7.5 & 8.4 & 8.8 & 7.3 & 9.0 & 7.8 & 7.6\end{array}$

Assume that these weights form a random sample from a $N\left(\mu, \sigma^2\right)$ distribution.

Anne claims that the mean pumpkin weight is 7.5 kilograms. In order to test this claim, she sets up the null hypothesis $\mathrm{H}_0: \mu=7.5$.
a. Determine unbiased estimates for $\mu$ and $\sigma^2$.
b.i. Use a two-tailed test to determine the $p$-value for the above results.
b.ii. Interpret your $p$-value at the $5 \%$ level of significance, justifying your conclusion.

▶️Answer/Explanation

a.

UE of $\mu$ is $8.01(=8.0125)$
A1
UE of $\sigma^2$ is 0.404
(M1)A1

Note: Accept answers that round correctly to 2 sf.

Note: Condone incorrect notation, ie, $\mu$ instead of UE of $\mu$ and $\sigma^2$ instead of UE of $\sigma^2$.

Note: $\quad$ M0 for squaring $0.594 \ldots$ giving 0.354, M1A0 for failing to square $0.635 \ldots$
[3 marks]
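The estimates in part (a), and the two mistakes described in the note, can be reproduced from the eight weights. This sketch uses Python's `statistics` module, where `variance` divides by $n-1$ and `pvariance` divides by $n$.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

weights = [7.7, 7.5, 8.4, 8.8, 7.3, 9.0, 7.8, 7.6]

mu_hat = mean(weights)           # unbiased estimate of mu
var_hat = variance(weights)      # unbiased estimate of sigma^2 (divides by n - 1)

biased_var = pvariance(weights)  # divides by n: the 0.354 error in the note
print(round(mu_hat, 4), round(var_hat, 3), round(biased_var, 3))
print(round(pstdev(weights), 3), round(stdev(weights), 3))
```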
b.i. attempting to use the $t$-test
(M1)
$p$-value is $0.0566 \quad$ A2
Note: Accept any answer that rounds correctly to 2 sf.
[3 marks]
b.ii. $0.0566>0.05 \quad \boldsymbol{R 1}$
we accept the null hypothesis (mean pumpkin weight is $7.5 \mathrm{~kg}$ )
A1
Note: Apply follow through on the candidate’s $p$-value.

Note: Do not award $\boldsymbol{A 1}$ if $\boldsymbol{R 1}$ is not awarded.
[2 marks]

Question

Two IB schools, A and B, follow the IB Diploma Programme but have different teaching methods. A research group tested whether the different teaching methods lead to a similar final result.

For the test, a group of eight students were randomly selected from each school. Both samples were given a standardized test at the start of the course and a prediction for total IB points was made based on that test; this was then compared to their points total at the end of the course.

Previous results indicate that both the predictions from the standardized tests and the final IB points can be modelled by a normal distribution.
It can be assumed that:
– the standardized test is a valid method for predicting the final IB points
– that variations from the prediction can be explained through the circumstances of the student or school.

The data for school $A$ is shown in the following table.

For each student, the change from the predicted points to the final points $(f-p)$ was calculated.

The data for school B is shown in the following table.

School A also gives each student a score for effort in each subject. This effort score is based on a scale of 1 to 5 where 5 is regarded as outstanding effort.

It is claimed that the effort put in by a student is an important factor in improving upon their predicted IB points.

A mathematics teacher in school A claims that the comparison between the two schools is not valid because the sample for school B contained mainly girls and that for school A, mainly boys. She believes that girls are likely to show a greater improvement from their predicted points to their final points.

She collects more data from other schools, asking them to class their results into four categories as shown in the following table.

a. Identify a test that might have been used to verify the null hypothesis that the predictions from the standardized test can be modelled by a normal distribution.
b. State why comparing only the final IB points of the students from the two schools would not be a valid test for the effectiveness of the two different teaching methods.
c.i. Find the mean change.
c.ii. Find the standard deviation of the changes.
d. Use a paired $t$-test to determine whether there is significant evidence that the students in school A have improved their IB points since the start of the course.
e.i. Use an appropriate test to determine whether there is evidence, at the $5 \%$ significance level, that the students in school B have improved more than those in school A.
e.ii. State why it was important to test that both sets of points were normally distributed.
f.i. Perform a test on the data from school A to show it is reasonable to assume a linear relationship between effort scores and improvements in IB points. You may assume effort scores follow a normal distribution.
f.ii. Hence, find the expected improvement between predicted and final points for an increase of one unit in effort grades, giving your answer to one decimal place.
g. Use an appropriate test to determine whether showing an improvement is independent of gender.
h. If you were to repeat the test performed in part (e) intending to compare the quality of the teaching between the two schools, suggest two ways in which you might choose your sample to improve the validity of the test.

▶️Answer/Explanation

a. $\chi^2$ (goodness of fit)
A1
[1 mark]
b. EITHER
because aim is to measure improvement
OR
because the students may be of different ability in the two schools
R1
[1 mark]
c.i. 0.1875 (accept $0.188,0.19) \quad \boldsymbol{A 1}$
[1 mark]
c.ii. 2.46
(M1)A1
Note: Award (M1)A0 for 2.63.
[2 marks]
d. $\mathrm{H}_0$ : there has been no improvement
$\mathrm{H}_1$ : there has been an improvement
A1
attempt at a one-tailed paired $t$-test
(M1)
$$
p \text {-value }=0.423
$$
A1
there is no significant evidence that the students have improved
R1
Note: If the hypotheses are not stated award a maximum of A0M1A1R0.
[4 marks]
e.i. $\mathrm{H}_0$ : there is no difference between the schools
$\mathrm{H}_1$ : school $\mathrm{B}$ did better than school $\mathrm{A}$
A1
one-tailed 2 sample $t$-test
(M1)
$p$-value $=0.0984$
A1
$0.0984>0.05$ (not significant at the $5 \%$ level) so do not reject the null hypothesis
R1A1

Note: The final $\boldsymbol{A 1}$ cannot be awarded following an incorrect reason. The final R1A1 can follow through from their incorrect $p$-value. Award a maximum of $\boldsymbol{A 1}$ (M1)A0R1A1 for $p$-value $=0.0993$.
[5 marks]

e.ii. sample too small for the central limit theorem to apply (and $t$-tests assume normal distribution)
R1
[1 mark]
f.i. $\mathrm{H}_0: \rho=0$
$$
\mathrm{H}_1: \rho>0
$$

Note: Allow hypotheses to be expressed in words.
$p$-value $=0.00157$
A1
$(0.00157<0.01)$ there is significant evidence of a (linear) correlation between effort and improvement (so it is reasonable to assume a linear relationship) $\quad \boldsymbol{R 1}$
[3 marks]
f.ii. (gradient of line of regression $=$ ) $6.6 \quad \boldsymbol{A 1}$
[1 mark]
g. $\mathrm{H}_0$ : improvement and gender are independent
$\mathrm{H}_1$ : improvement and gender are not independent
A1
choice of $\chi^2$ test for independence
(M1)
groups first two columns as expected values in first column less than $5 \quad$ M1
new observed table

$$
p \text {-value }=0.581
$$
no significant evidence that gender and improvement are dependent
R1
[6 marks]
h. For example:
larger samples / include data from whole school
take equal numbers of boys and girls in each sample
have a similar range of abilities in each sample
(if possible) have similar ranges of effort
R1R1
Note: Award $\boldsymbol{R 1}$ for each reasonable suggestion to improve the validity of the test.
[2 marks]
