IBDP MAI: Topic 4: Statistics and probability - AHL 4.14 - Linear transformation of a single random variable. Exam Style Questions Paper 3

Question

Mr Sailor owns a fish farm and he claims that the weights of the fish in one of his lakes have a mean of 550 grams and standard deviation of 8 grams.

Assume that the weights of the fish are normally distributed and that Mr Sailor’s claim is true.

Kathy is suspicious of Mr Sailor’s claim about the mean and standard deviation of the weights of the fish. She collects a random sample of fish from this lake whose weights are shown in the following table.

Using these data, test at the $5 \%$ significance level the null hypothesis $H_0: \mu=550$ against the alternative hypothesis $H_1: \mu<550$, where $\mu$ grams is the population mean weight.

Kathy decides to use the same fish sample to test at the $5 \%$ significance level whether or not there is a positive association between the weights and the lengths of the fish in the lake. The following table shows the lengths of the fish in the sample. The lengths of the fish can be assumed to be normally distributed.

a.i. Find the probability that a fish from this lake will have a weight of more than 560 grams.
a.ii. The maximum weight a hand net can hold is $6 \mathrm{~kg}$. Find the probability that a catch of 11 fish can be carried in the hand net.
b.i. State the distribution of your test statistic, including the parameter.
b.ii. Find the $p$-value for the test.
b.iii. State the conclusion of the test, justifying your answer.
c.i. State suitable hypotheses for the test.
c.ii. Find the product-moment correlation coefficient $r$.
c.iii. State the $p$-value and interpret it in this context.
d. Use an appropriate regression line to estimate the weight of a fish with length $360 \mathrm{~mm}$.

▶️Answer/Explanation

a.i. Note: Accept all answers that round to the correct 2 sf answer in (a), (b) and (c) but not in (d).
$$
\begin{aligned}
& X \sim N\left(550,8^2\right) \quad \text { (M1) } \\
& \mathrm{P}(X>560)=0.10564 \ldots=0.106
\end{aligned}
$$
A1
[2 marks]
a.ii.
$$
\begin{aligned}
& X_i \sim N\left(550,8^2\right), i=1, \ldots, 11 \\
& \text { let } Y=\sum_{i=1}^{11} X_i \\
& \mathrm{E}(Y)=11 \times 550 \quad(=6050) \quad \text { A1 } \\
& \operatorname{Var}(Y)=11 \times 8^2 \quad(=704) \quad \text { (M1)A1 } \\
& \mathrm{P}(Y \leqslant 6000)=0.02975 \ldots=0.0298
\end{aligned}
$$
A1
[4 marks]
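The values in (a) can be checked numerically. The following is a minimal sketch, assuming the 11 fish weights are independent and using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the mark scheme.

```python
from math import sqrt
from statistics import NormalDist

# Weight of one fish in grams, X ~ N(550, 8^2), taking Mr Sailor's claim as true.
fish = NormalDist(mu=550, sigma=8)
p_over_560 = 1 - fish.cdf(560)

# Total weight of 11 independent fish: the means add and the variances add.
total = NormalDist(mu=11 * 550, sigma=sqrt(11 * 8**2))
p_net_holds = total.cdf(6000)  # the net holds 6 kg = 6000 g

print(round(p_over_560, 3), round(p_net_holds, 4))
```

Note that the standard deviation of the total is $\sqrt{11} \times 8$, not $11 \times 8$, because the catch is a sum of 11 separate weights rather than one weight multiplied by 11.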
b.i.
$t$ distribution with 7 degrees of freedom
A1A1
[2 marks]
b.ii.
$$
p=0.25779 \ldots=0.258 \quad \text { A2 }
$$
[2 marks]

b.iii.
$$
p>0.05
$$
R1
therefore we conclude that there is no evidence to reject $H_0$
A1
Note: $\boldsymbol{F T}$ their $p$-value.
Note: Only award $\mathbf{A 1}$ if $\boldsymbol{R 1}$ awarded.
[2 marks]
c.i.
$$
H_0: \rho=0, H_1: \rho>0
$$
A1
Note: Do not accept $r$ in place of $\rho$.
[1 mark]
c.ii.
$$
r=0.782 \quad \text { A2 }
$$
[2 marks]
c.iii.
$p=0.01095 \ldots=0.0110 \quad$ A1
since $0.0110<0.05 \quad \boldsymbol{R 1}$
there is a positive association between weight and length
A1
Note: $\boldsymbol{F T}$ their p-value.
Note: Only award $\boldsymbol{A 1}$ if $\boldsymbol{R 1}$ awarded.
Note: Conclusion must be in context.
[3 marks]

d.
regression line of $y$ (weight) on $x$ (length) is $\quad$ (M1)
$$
y=0.8267 \ldots x+255.96 \ldots
$$
$x=360$ gives $y=554$
A1

Note: Award M1A0A0 for the wrong regression line, that is $y=0.7393 \ldots x-51.62 \ldots$.
[3 marks]

 

Question

Employees answer the telephone in a customer relations department. The time taken for an employee to deal with a customer is a random variable which can be modelled by a normal distribution with mean 150 seconds and standard deviation 45 seconds.
a. Find the probability that the time taken for a randomly chosen customer to be dealt with by an employee is greater than 180 seconds.
b. Find the probability that the time taken by an employee to deal with a queue of three customers is less than nine minutes.
c. At the start of the day, one employee, Amanda, has a queue of four customers. A second employee, Brian, has a queue of three customers. You may assume they work independently.
Find the probability that Amanda’s queue will be dealt with before Brian’s queue.

▶️Answer/Explanation

a.

Note: In question 2, accept answers that round correctly to 2 significant figures.
$$
\begin{aligned}
& X \sim N\left(150,45^2\right) \\
& \mathrm{P}(X>180)=0.252 \quad \text { (M1)A1 }
\end{aligned}
$$
[2 marks]
b.
required to find $\mathrm{P}\left(X_1+X_2+X_3<540\right)$
$$
\begin{aligned}
& \text { let } S=X_1+X_2+X_3 \\
& \mathrm{E}(S)=450 \quad \text { (A1) } \\
& \operatorname{Var}(S)=3 \operatorname{Var}(X) \quad \text { (M1) } \\
& =3 \times 45^2 \quad(\Rightarrow \sigma=45 \sqrt{3})(=6075) \\
& \mathrm{P}(S<540)=0.876 \quad \text { A1 }
\end{aligned}
$$

Note: In (b) and (c) condone incorrect notation, eg, $3 X$ for $X_1+X_2+X_3$.
[4 marks]
c.
$$
\begin{aligned}
& \text { let } Y=\left(X_1+X_2+X_3+X_4\right)-\left(X_5+X_6+X_7\right) \\
& \mathrm{E}(Y)=4 \mathrm{E}(X)-3 \mathrm{E}(X)=\mathrm{E}(X)=150 \\
& \operatorname{Var}(Y)=4 \operatorname{Var}(X)+3 \operatorname{Var}(X)(=7 \operatorname{Var}(X)) \\
& =14175
\end{aligned}
$$
required to find $\mathrm{P}(Y<0) \quad$ (M1)
$$
=0.104
$$
A1
[6 marks]
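The three answers above can be reproduced with a short calculation. This is a sketch only, assuming the service times for different customers are independent; the names `MU`, `SIGMA`, `p_a`, `p_b` and `p_c` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 150, 45  # seconds taken per customer

# (a) A single customer takes longer than 180 s.
p_a = 1 - NormalDist(MU, SIGMA).cdf(180)

# (b) Three independent customers: S ~ N(3*150, 3*45^2); nine minutes = 540 s.
p_b = NormalDist(3 * MU, SIGMA * sqrt(3)).cdf(540)

# (c) Y = (Amanda's four times) - (Brian's three times).
#     E(Y) = 4*150 - 3*150 and Var(Y) = (4 + 3) * 45^2, since variances add
#     even when the variables are subtracted.
p_c = NormalDist((4 - 3) * MU, SIGMA * sqrt(7)).cdf(0)

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
```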

Question

The weights, $X \mathrm{~kg}$, of the males of a species of bird may be assumed to be normally distributed with mean $4.8 \mathrm{~kg}$ and standard deviation $0.2 \mathrm{~kg}$.

The weights, $Y \mathrm{~kg}$, of female birds of the same species may be assumed to be normally distributed with mean $2.7 \mathrm{~kg}$ and standard deviation $0.15 \mathrm{~kg}$.
a. Find the probability that a randomly chosen male bird weighs between $4.75 \mathrm{~kg}$ and $4.85 \mathrm{~kg}$.
b. Find the probability that the weight of a randomly chosen male bird is more than twice the weight of a randomly chosen female bird.
c. Two randomly chosen male birds and three randomly chosen female birds are placed on a weighing machine that has a weight limit of $18 \mathrm{~kg}$.
Find the probability that the total weight of these five birds is greater than the weight limit.

▶️Answer/Explanation

a. Note: In question 1, accept answers that round correctly to 2 significant figures.
$$
\mathrm{P}(4.75<X<4.85)=0.197
$$
A1
[1 mark]
b.
consider the random variable $X-2 Y \quad$ (M1)
$$
\begin{aligned}
& \mathrm{E}(X-2 \mathrm{Y})=-0.6 \quad \text { (A1) } \\
& \operatorname{Var}(X-2 Y)=\operatorname{Var}(X)+4 \operatorname{Var}(Y) \quad \text { (M1) } \\
& =0.13 \quad \text { (A1) } \\
& X-2 Y \sim \mathrm{N}(-0.6,0.13) \\
& \mathrm{P}(X-2 Y>0) \quad \text { (M1) } \\
& =0.0480 \quad \text { A1 }
\end{aligned}
$$
[6 marks]
c.
let $W=X_1+X_2+Y_1+Y_2+Y_3$ be the total weight
$$
E(W)=17.7
$$
$$
\begin{aligned}
& \operatorname{Var}(W)=2 \operatorname{Var}(X)+3 \operatorname{Var}(Y)=0.1475 \quad \text { (M1)(A1) } \\
& \mathrm{W} \sim \mathrm{N}(17.7,0.1475) \\
& \mathrm{P}(\mathrm{W}>18)=0.217 \quad \text { A1 }
\end{aligned}
$$
[4 marks]
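As a check on (a) to (c), the linear combinations can be evaluated directly. This is a minimal sketch, assuming all bird weights are independent; `male`, `female`, `diff` and `total` are illustrative names.

```python
from math import sqrt
from statistics import NormalDist

male = NormalDist(4.8, 0.2)     # X, weight of a male bird in kg
female = NormalDist(2.7, 0.15)  # Y, weight of a female bird in kg

# (a) P(4.75 < X < 4.85)
p_a = male.cdf(4.85) - male.cdf(4.75)

# (b) X - 2Y uses ONE female weight doubled, so Var = Var(X) + 2^2 Var(Y).
diff = NormalDist(male.mean - 2 * female.mean,
                  sqrt(male.variance + 4 * female.variance))
p_b = 1 - diff.cdf(0)

# (c) W = X1 + X2 + Y1 + Y2 + Y3 is a sum of five separate birds,
#     so Var = 2 Var(X) + 3 Var(Y).
total = NormalDist(2 * male.mean + 3 * female.mean,
                   sqrt(2 * male.variance + 3 * female.variance))
p_c = 1 - total.cdf(18)

print(round(p_a, 3), round(p_b, 4), round(p_c, 3))
```

The contrast between (b) and (c) is the point of the question: a multiple of one variable scales the variance by the square of the multiplier, whereas a sum of independent copies scales it by the number of copies.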

Question

This question explores methods to analyse the scores in an exam.
A random sample of 149 scores for a university exam are given in the table.

The university wants to know if the scores follow a normal distribution, with the mean and variance found in part (a).

The expected frequencies are given in the table.

The university assigns a pass grade to students whose scores are in the top $80 \%$.

The university also wants to know if the exam is gender neutral. They obtain random samples of scores for male and female students. The mean, sample variance and sample size are shown in the table.

The university awards a distinction to students who achieve high scores in the exam. Typically, 15\% of students achieve a distinction. A new exam is trialed with a random selection of students on the course. 5 out of 20 students achieve a distinction.

A different exam is trialed with 16 students. Let $p$ be the proportion of students achieving a distinction. It is desired to test the hypotheses
$$
H_0: p=0.15 \text { against } H_1: p>0.15
$$

It is decided to reject the null hypothesis if the number of students achieving a distinction is greater than 3.
a.i. Find an unbiased estimate for the population mean.
a.ii. Find an unbiased estimate for the population variance.
b. Show that the expected frequency for $20<x \leq 40$ is 31.5 correct to 1 decimal place.
c. Perform a suitable test, at the $5 \%$ significance level, to determine if the scores follow a normal distribution, with the mean and variance found in part (a). You should clearly state your hypotheses, the degrees of freedom, the $p$-value and your conclusion.
d. Use the normal distribution model to find the score required to pass.
e. Perform a suitable test, at the $5 \%$ significance level, to determine if there is a difference between the mean scores of males and females. You should clearly state your hypotheses, the $p$-value and your conclusion.
f. Perform a suitable test, at the $5 \%$ significance level, to determine if it is easier to achieve a distinction on the new exam. You should clearly state your hypotheses, the critical region and your conclusion.
g.i. Find the probability of making a Type I error.
g.ii. Given that $p=0.2$, find the probability of making a Type II error.

▶️Answer/Explanation

a.i. 52.8
A1
[1 mark]
a.ii. $s_{n-1}^2=23.7^2=562 \quad$ M1A1
[2 marks]
b. $P(20<x \leqslant 40)=0.211$ M1A1
$0.211 \times 149 \quad$ M1
$=31.5 \quad A G$
[3 marks]
c. use of a $\chi^2$ goodness of fit test $M 1$
$H_0: x \sim N(52.8,562)$ and $H_1$: $x$ does not follow $N(52.8,562) \quad$ A1A1
$\nu=5-1-2=2 \quad \boldsymbol{A 1}$
$p$-value $=0.569 \quad$ A2
Since $0.569>0.05$
R1
Insufficient evidence to reject $H_0$. The scores follow a normal distribution.
A1
[8 marks]
d. $\Phi^{-1}(0.2)=32.8 \quad$ M1A1
[2 marks]
e. use of a $t$-test $\boldsymbol{M 1}$
$H_0: \mu_m=\mu_f$ and $H_1: \mu_m \neq \mu_f \quad \boldsymbol{A 1}$
p-value $=0.180 \quad$ A2
Since $0.180>0.05$
R1
Insufficient evidence to reject $H_0$. There is no difference between males and females.
A1
f. use of test for proportion using Binomial distribution M1
$H_0: p=0.15$ and $H_1: p>0.15 \quad$ A1
$P(X \geqslant 6)=0.0673$ and $P(X \geqslant 7)=0.0219 \quad$ M1
So the critical region is $X \geqslant 7 \quad$ A1
Since $5<7 \quad \boldsymbol{R 1}$
Insufficient evidence to reject $H_0$. It is not easier to achieve a distinction on the new exam.
A1
[6 marks]
g.i. using $H_0, X \sim B(16,0.15) \quad M 1$
$P(X>3)=0.210 \quad$ M1A1
[3 marks]
g.ii. using $H_1, X \sim B(16,0.2) \quad$ M1
$P(X \leqslant 3)=0.598$ M1A1
[3 marks]
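The binomial tail probabilities in (f) and (g) can be checked without statistical tables. The sketch below uses only the standard library; `binom_cdf` is a hypothetical helper written for this illustration, not a library function.

```python
from math import comb

def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ B(n, p) by summing the probability mass."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# (f) Critical region for H1: p > 0.15 with n = 20 at the 5% level:
#     the smallest c for which P(X >= c) <= 0.05 under H0.
critical = next(c for c in range(21)
                if 1 - binom_cdf(c - 1, 20, 0.15) <= 0.05)

# (g) Decision rule "reject H0 if X > 3" with n = 16.
type_1 = 1 - binom_cdf(3, 16, 0.15)  # reject H0 although p = 0.15
type_2 = binom_cdf(3, 16, 0.2)       # fail to reject H0 although p = 0.2

print(critical, round(type_1, 3), round(type_2, 3))
```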

 

 

 

Question

An estate manager is responsible for stocking a small lake with fish. He begins by introducing 1000 fish into the lake and monitors their population growth to determine the likely carrying capacity of the lake.

After one year an accurate assessment of the number of fish in the lake is taken and it is found to be 1200.
Let $N$ be the number of fish $t$ years after the fish have been introduced to the lake.
Initially it is assumed that the rate of increase of $N$ will be constant.

When $t=8$ the estate manager again decides to estimate the number of fish in the lake. To do this he first catches 300 fish and marks them, so they can be recognized if caught again. These fish are then released back into the lake. A few days later he catches another 300 fish, releasing each fish after it has been checked, and finds 45 of them are marked.

Let $X$ be the number of marked fish caught in the second sample, where $X$ is considered to be distributed as $\mathrm{B}(n, p)$. Assume the number of fish in the lake is 2000 .

The estate manager decides that he needs bounds for the total number of fish in the lake.

The estate manager feels confident that the proportion of marked fish in the lake will be within 1.5 standard deviations of the proportion of marked fish in the sample and decides these will form the upper and lower bounds of his estimate.

The estate manager now believes the population of fish will follow the logistic model $N(t)=\frac{L}{1+C e^{-k t}}$ where $L$ is the carrying capacity and $C, k>0$. The estate manager would like to know if the population of fish in the lake will eventually reach 5000.

a. Use this model to predict the number of fish in the lake when $t=8$.
b. Assuming the proportion of marked fish in the second sample is equal to the proportion of marked fish in the lake, show that the estate manager will estimate there are now 2000 fish in the lake.
c.i. Write down the value of $n$ and the value of $p$.
c.ii. State an assumption that is being made for $X$ to be considered as following a binomial distribution.
d.i. Show that an estimate for $\operatorname{Var}(X)$ is 38.25.
d.ii. Hence show that the variance of the proportion of marked fish in the sample, $\operatorname{Var}\left(\frac{X}{300}\right)$, is 0.000425.
e.i. Taking the value for the variance given in (d) (ii) as a good approximation for the true variance, find the upper and lower bounds for the proportion of marked fish in the lake.
e.ii. Hence find upper and lower bounds for the number of fish in the lake when $t=8$.
f. Given this result, comment on the validity of the linear model used in part (a).
g.i. Assuming a carrying capacity of 5000 use the given values of $N(0)$ and $N(1)$ to calculate the parameters $C$ and $k$.
g.ii.Use these parameters to calculate the value of $N(8)$ predicted by this model.
h. Comment on the likelihood of the fish population reaching 5000.

▶️Answer/Explanation

a.
$$
\begin{aligned}
& N(8)=1000+200 \times 8 \quad \text { M1 } \\
& =2600 \quad \text { A1 }
\end{aligned}
$$
[2 marks]

b. $\frac{45}{300}=\frac{300}{N} \quad$ M1A1
$$
N=2000
$$
$A G$
[2 marks]
c.i. $n=300, p=\frac{300}{2000}=0.15 \quad$ A1A1
[2 marks]
c.ii. Any valid reason, for example:
R1
Marked fish are randomly distributed, so $p$ is constant.
Each fish caught is independent of previous fish caught.
[1 mark]
d.i.
$$
\begin{aligned}
& \operatorname{Var}(X)=n p(1-p) \quad \text { M1 } \\
& =300 \times \frac{300}{2000} \times \frac{1700}{2000} \quad \text { A1 } \\
& =38.25 \quad \text { AG }
\end{aligned}
$$
[2 marks]
d.ii.
$$
\begin{aligned}
& \operatorname{Var}\left(\frac{X}{300}\right)=\frac{\operatorname{Var}(X)}{300^2} \quad \text { M1A1 } \\
& =0.000425 \quad \text { AG }
\end{aligned}
$$
[2 marks]
e.i. $0.15 \pm 1.5 \sqrt{0.000425}$
(M1)
0.181 and 0.119
A1
[2 marks]

e.ii. $\frac{300}{N}=0.181 \ldots, \frac{300}{N}=0.119 \ldots \quad$ M1
Lower bound 1658, upper bound 2519
A1
[2 marks]
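Parts (b) to (e) form one chain of calculations, which can be sketched as follows. The variable names are illustrative; the sketch assumes, as the question does, that the sample proportion 45/300 is used as the estimate of $p$.

```python
from math import sqrt

marked, sample, recaptured = 300, 300, 45

# (b) The sample proportion of marked fish equals marked / N.
p_hat = recaptured / sample
n_est = marked / p_hat

# (d) Variance of the count X ~ B(n, p), then of the proportion X / n.
var_x = sample * p_hat * (1 - p_hat)
var_prop = var_x / sample**2

# (e) Bounds for the proportion: 1.5 standard deviations either side.
half_width = 1.5 * sqrt(var_prop)
p_low, p_high = p_hat - half_width, p_hat + half_width

# A larger marked proportion corresponds to a smaller population.
n_low, n_high = marked / p_high, marked / p_low

print(round(n_est), round(n_low), round(n_high))
```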
f. Linear model prediction falls outside this range so unlikely to be a good model
R1A1
[2 marks]
g.i.
$$
\begin{aligned}
& 1000=\frac{5000}{1+C} \quad \text { M1 } \\
& C=4 \quad \text { A1 } \\
& 1200=\frac{5000}{1+4 e^{-k}} \quad \text { M1 } \\
& e^{-k}=\frac{3800}{4 \times 1200} \quad \text { (M1) } \\
& k=-\ln (0.7916 \ldots)=0.2336 \ldots
\end{aligned}
$$
A1
[5 marks]
g.ii.
$$
N(8)=\frac{5000}{1+4 e^{-0.2336 \times 8}}=3090 \quad \text { M1A1 }
$$

Note: Accept any answer that rounds to 3000.
[2 marks]
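The logistic parameters in (g) follow from rearranging $N(t)=\frac{L}{1+C e^{-k t}}$ at $t=0$ and $t=1$. A minimal sketch, assuming the carrying capacity $L=5000$ as the question states; the Python names mirror the symbols in the model.

```python
from math import exp, log

L = 5000             # assumed carrying capacity
N0, N1 = 1000, 1200  # observed populations at t = 0 and t = 1

# N(0) = L / (1 + C)          =>  C = L / N0 - 1
C = L / N0 - 1
# N(1) = L / (1 + C e^{-k})   =>  e^{-k} = (L / N1 - 1) / C
k = -log((L / N1 - 1) / C)

def N(t):
    """Population predicted by the logistic model after t years."""
    return L / (1 + C * exp(-k * t))

print(C, round(k, 4), round(N(8)))
```

Evaluating `N(0)` and `N(1)` recovers the two data points, which is a quick way to confirm the rearrangement before using the model at $t=8$.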
h. The value of $N(8)$ predicted by the logistic model is much higher than the calculated upper bound for $N(8)$, so the rate of growth of the fish is unlikely to be sufficient to reach a carrying capacity of 5000. M1R1
[2 marks]

 

Question

A shop sells carrots and broccoli. The weights of carrots can be modelled by a normal distribution with variance $25 \mathrm{~grams}^2$ and the weights of broccoli can be modelled by a normal distribution with variance $80 \mathrm{~grams}^2$. The shopkeeper claims that the mean weight of carrots is 130 grams and the mean weight of broccoli is 400 grams.

Dong Wook decides to investigate the shopkeeper’s claim that the mean weight of carrots is 130 grams. He plans to take a random sample of $n$ carrots in order to calculate a $98 \%$ confidence interval for the population mean weight.

Anjali thinks the mean weight, $\mu$ grams, of the broccoli is less than 400 grams. She decides to perform a hypothesis test, using a random sample of size 8. Her hypotheses are
$$
H_0: \mu=400 ; H_1: \mu<400 .
$$

She decides to reject $H_0$ if the sample mean is less than 395 grams.
a. Assuming that the shopkeeper’s claim is correct, find the probability that the weight of six randomly chosen carrots is more than two times the weight of one randomly chosen broccoli.
b. Find the least value of $n$ required to ensure that the width of the confidence interval is less than 2 grams.
c. Find the significance level for this test.
d. Given that the weights of the broccoli actually follow a normal distribution with mean 392 grams and variance $80 \mathrm{~grams}^2$, find the probability of Anjali making a Type II error.

▶️Answer/Explanation

a.

$$
\begin{aligned}
& \text { Let } X=\sum_{i=1}^6 C_i-2 B \quad \text { M1 } \\
& \mathrm{E}(X)=6 \times 130-2 \times 400=-20 \quad \text { (M1)(A1) } \\
& \operatorname{Var}(X)=6 \times 25+4 \times 80=470 \quad \text { (M1)(A1) } \\
& \mathrm{P}(X>0)=0.178 \quad \text { A1 }
\end{aligned}
$$

Note: Condone the notation $6 C-2 B$ only if the (M1) is awarded for the variance.
[6 marks]
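The calculation in (a) can be verified with a few lines. This is a sketch, assuming the carrot and broccoli weights are independent and the shopkeeper's means are correct; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# X = C1 + ... + C6 - 2B: six separate carrots, one broccoli doubled.
mean_x = 6 * 130 - 2 * 400
var_x = 6 * 25 + 2**2 * 80  # carrots: 6 * Var(C); broccoli: 2^2 * Var(B)

p_a = 1 - NormalDist(mean_x, sqrt(var_x)).cdf(0)
print(mean_x, var_x, round(p_a, 3))
```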
b. $z=2.326 \ldots$
(A1)
$$
\begin{aligned}
& \frac{2 z \sigma}{\sqrt{n}}<2 \quad \text { M1 } \\
& \sqrt{n}>11.6 \ldots \\
& n>135.2 \ldots \\
& n=136 \quad \text { A1 }
\end{aligned}
$$

Note: Condone the use of equal signs.
[3 marks]
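The sample-size inequality in (b) can be solved numerically. A minimal sketch, assuming the population variance of 25 is known so that a $z$-interval applies; `n_min` and `bound` are illustrative names.

```python
from math import floor, sqrt
from statistics import NormalDist

sigma = sqrt(25)                # known population standard deviation
z = NormalDist().inv_cdf(0.99)  # 98% interval leaves 1% in each tail

# width = 2 z sigma / sqrt(n) < 2   =>   n > (z sigma)^2
bound = (z * sigma) ** 2
n_min = floor(bound) + 1        # smallest integer strictly above the bound

print(round(z, 3), round(bound, 1), n_min)
```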
c. variance $=\frac{80}{8}=10$
under $H_0, \bar{B} \sim \mathrm{N}(400,10)$
significance level $=\mathrm{P}(\bar{B}<395)$
(M1)
$=0.0569$ or $5.69 \%$
A1
Note: Accept any answer that rounds to 0.057 or $5.7 \%$.
[3 marks]

d.
$$
\begin{aligned}
\text { Type II error probability } & =\mathrm{P}\left(\text { Accept } H_0 \mid H_1 \text { true }\right) \quad \text { (M1) } \\
& =\mathrm{P}(\bar{B}>395 \mid \bar{B} \sim N(392,10)) \\
& =0.171 \quad \text { A1 }
\end{aligned}
$$

Note: Accept any answer that rounds to 0.17.
[3 marks]
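Parts (c) and (d) both use the distribution of the sample mean of 8 broccoli, once under $H_0$ and once under the true mean. A short sketch under those assumptions; `alpha` and `beta` are illustrative names for the two error probabilities.

```python
from math import sqrt
from statistics import NormalDist

se = sqrt(80 / 8)  # standard deviation of the mean of 8 broccoli

# (c) Significance level: P(reject H0) when the mean really is 400.
alpha = NormalDist(400, se).cdf(395)

# (d) Type II error: P(fail to reject H0) when the mean is actually 392.
beta = 1 - NormalDist(392, se).cdf(395)

print(round(alpha, 4), round(beta, 3))
```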

Question

In a large population of hens, the weight of a hen is normally distributed with mean $\mu \mathrm{kg}$ and standard deviation $\sigma \mathrm{kg}$. A random sample of 100 hens is taken from the population.

The mean weight for the sample is denoted by $\bar{X}$.

The sample values are summarized by $\sum x=199.8$ and $\sum x^2=407.8$ where $x \mathrm{~kg}$ is the weight of a hen.

It is found that $\sigma=0.27$. It is decided to test, at the $1 \%$ level of significance, the null hypothesis $\mu=1.95$ against the alternative hypothesis $\mu>1.95$.
a. State the distribution of $\bar{X}$ giving its mean and variance.
b. Find an unbiased estimate for $\mu$.
c. Find an unbiased estimate for $\sigma^2$.
d. Find a $90 \%$ confidence interval for $\mu$.
e.i. Find the $p$-value for the test.
e.ii.Write down the conclusion reached.

▶️Answer/Explanation

a. $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{100}\right) \quad \boldsymbol{A 1}$
Note: Accept $n$ in place of 100.
[1 mark]
b. $\hat{\mu}=\frac{\sum x}{n}=\frac{199.8}{100}=1.998 \quad$ A1
Note: Accept 2.00, 2.0 and 2.
[1 mark]
c. $s_{n-1}^2=\frac{n}{n-1}\left(\frac{\sum x^2}{n}-\bar{x}^2\right)=\frac{100}{99}\left(\frac{407.8}{100}-1.998^2\right) \quad$ (M1)
$=0.086864$
unbiased estimate for $\sigma^2$ is 0.0869
A1
Note: Accept any answer which rounds to 0.087.
[2 marks]
d. $90 \%$ confidence interval $=(1.95,2.05) \quad$ A1A1
Note: $\boldsymbol{F T}$ their $\sigma$ from (c).
Note: Condone the use of the $z$-value 1.645 since $n$ is large.
Note: Accept any values that round to 1.95 and 2.05.
[3 marks]
e.i. $p$-value is 0.0377
A2
Note: Award A1 for the 2-tail value 0.0754.
Note: Award $\boldsymbol{A 2}$ for 0.0377 and $\boldsymbol{A 1}$ for any other value that rounds to 0.038.
Note: $\boldsymbol{F T}$ their estimated mean from (b), note that 2 gives $p=0.032(0)$.
[2 marks]
e.ii.accept the null hypothesis
A1
Note: $\boldsymbol{F T}$ their $p$-value.
[1 mark]
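The estimates in (b) and (c) and the $p$-value in (e) can be reproduced from the summary statistics. This is a sketch only; it assumes the test in (e) is a one-tailed $z$-test using the known $\sigma=0.27$, which is consistent with the stated $p$-value, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_x2 = 100, 199.8, 407.8

# (b), (c) Unbiased estimates of the population mean and variance.
mean_hat = sum_x / n
var_hat = n / (n - 1) * (sum_x2 / n - mean_hat**2)

# (e) One-tailed z-test of H0: mu = 1.95 against H1: mu > 1.95.
z = (mean_hat - 1.95) / (0.27 / sqrt(n))
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < 0.01  # 1% significance level

print(round(mean_hat, 3), round(var_hat, 4), round(p_value, 4), reject_h0)
```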
