## Question

Ten friends try a diet which is claimed to reduce weight. They each weigh themselves before starting the diet, and after a month on the diet, with the following results.

Determine unbiased estimates of the mean and variance of the loss in weight achieved over the month by people using this diet.

(i) State suitable hypotheses for testing whether or not this diet causes a mean loss in weight.

(ii) Determine the value of a suitable statistic for testing your hypotheses.

(iii) Find the 1 % critical value for your statistic and state your conclusion.

**Answer/Explanation**

## Markscheme

the weight losses are

2.2, 3.5, 4.3, -0.5, 4.2, -0.2, 2.5, 2.7, 0.1, -0.7 *(M1)(A1)*

\(\sum {x = 18.1} \), \(\sum {{x^2} = 67.55} \)

UE of mean = 1.81 *A1*

UE of variance \( = \frac{{67.55}}{9} – \frac{{{{18.1}^2}}}{{90}} = 3.87\) *(M1)A1*

**Note:** Accept weight losses as positive or negative. Accept unbiased estimate of mean as positive or negative.

**Note:** Award **M1A0** for 1.97 as UE of variance.

*[5 marks]*
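The two unbiased estimates can be checked numerically. The following is a minimal sketch, assuming the ten weight losses listed in the markscheme; the variable names are illustrative only.

```python
from statistics import variance

# Weight losses as listed in the markscheme (positive = weight lost)
losses = [2.2, 3.5, 4.3, -0.5, 4.2, -0.2, 2.5, 2.7, 0.1, -0.7]

n = len(losses)
sum_x = sum(losses)                   # 18.1
sum_x2 = sum(x * x for x in losses)   # 67.55

# Unbiased estimate of the mean
mean_hat = sum_x / n

# Unbiased estimate of the variance, using the markscheme's formula
# sum(x^2)/(n-1) - (sum x)^2 / (n(n-1))
var_hat = sum_x2 / (n - 1) - sum_x ** 2 / (n * (n - 1))

# Cross-check against the library routine, which also divides by n - 1
var_lib = variance(losses)

print(round(mean_hat, 2), round(var_hat, 2))
```

Taking the square root of `var_hat` gives about 1.97, which is the value the second note penalises when it is quoted as the variance.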

(i) \({H_0}:{\mu _d} = 0\) versus \({H_1}:{\mu _d} > 0\) *A1*

**Note:** Accept any symbol for \({\mu _d}\)

(ii) using *t* test *(M1)*

\(t = \frac{{1.81}}{{\sqrt {\frac{{3.87}}{{10}}} }} = 2.91\) *A1*

(iii) DF = 9 *(A1)*

**Note:** Award this **(A1)** if the *p*-value is given as 0.00864

1% critical value = 2.82 *A1*

accept \({H_1}\) *R1*

**Note:** Allow **FT** on final *R1*.

*[6 marks]*
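As a rough check of parts (ii) and (iii), the test statistic can be recomputed from the rounded estimates and compared with the tabulated critical value. This is a sketch only; the critical value 2.82 is copied from the markscheme rather than computed.

```python
import math

mean_loss = 1.81   # unbiased estimate of the mean (markscheme)
var_loss = 3.87    # unbiased estimate of the variance (markscheme)
n = 10
t_crit = 2.82      # 1% one-tailed critical value for 9 d.f., quoted in the markscheme

# t statistic for H0: mu_d = 0 against H1: mu_d > 0
t_stat = mean_loss / math.sqrt(var_loss / n)

# Upper-tail test: reject H0 when the statistic exceeds the critical value
reject_h0 = t_stat > t_crit

print(round(t_stat, 2), reject_h0)
```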

## Examiners report

In (a), most candidates gave a correct estimate for the mean but the variance estimate was often incorrect. Some candidates who use their GDC seem to be unable to obtain the unbiased variance estimate from the numbers on the screen. The way to proceed, of course, is to realise that the larger of the two ‘standard deviations’ on offer is the square root of the unbiased estimate so that its square gives the required result. In (b), most candidates realised that the t-distribution should be used although many were awarded an arithmetic penalty for giving either *t* = 2.911 or the critical value = 2.821. Some candidates who used the *p*-value method to reach a conclusion lost a mark by omitting to give the critical value. Many candidates found part (c) difficult and although they were able to obtain *t* = 2.49…, they were then unable to continue to obtain the confidence interval.

## Question

A baker produces loaves of bread that he claims weigh on average 800 g each. Many customers believe the average weight of his loaves is less than this. A food inspector visits the bakery and weighs a random sample of 10 loaves, with the following results, in grams:

783, 802, 804, 785, 810, 805, 789, 781, 800, 791.

Assume that these results are taken from a normal distribution.

Determine unbiased estimates for the mean and variance of the distribution.

In spite of these results the baker insists that his claim is correct.

Stating appropriate hypotheses, test the baker’s claim at the 10 % level of significance.

**Answer/Explanation**

## Markscheme

unbiased estimate of the mean: 795 (grams) *A1*

unbiased estimate of the variance: 108 \((gram{s^2})\) *(M1)A1*

*[3 marks]*

null hypothesis \({H_0}:\mu = 800\) *A1*

alternative hypothesis \({H_1}:\mu < 800\) *A1*

using 1-tailed *t*-test *(M1)*

**EITHER**

*p* = 0.0812… **A3**

**OR**

with 9 degrees of freedom *(A1)*

\({t_{calc}} = \frac{{\sqrt {10} (795 - 800)}}{{\sqrt {108} }} = - 1.521\) **A1**

\({t_{crit}} = - 1.383\) **A1**

**Note:** Accept 2sf intermediate results.

**THEN**

so the baker’s claim is rejected *R1*

**Note:** Accept “reject \({H_0}\)” provided \({H_0}\) has been correctly stated.

**Note:** **FT** for the final *R1*.

*[7 marks]*
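The critical-value route of the markscheme can be reproduced as follows. This is a sketch under the assumption that the lower 10 % critical value for 9 degrees of freedom is taken from the markscheme (-1.383) instead of being computed.

```python
import math
from statistics import mean, variance

# Loaf weights in grams, from the question
weights = [783, 802, 804, 785, 810, 805, 789, 781, 800, 791]
mu_0 = 800        # baker's claimed mean
t_crit = -1.383   # lower 10% critical value, 9 d.f. (markscheme)

x_bar = mean(weights)     # unbiased estimate of the mean
s2 = variance(weights)    # unbiased estimate of the variance (divisor n - 1)

# One-tailed t statistic for H0: mu = 800 against H1: mu < 800
t_calc = math.sqrt(len(weights)) * (x_bar - mu_0) / math.sqrt(s2)

# Lower-tail test: reject H0 when the statistic falls below the critical value
reject_h0 = t_calc < t_crit

print(x_bar, s2, round(t_calc, 3), reject_h0)
```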

## Examiners report

A successful question for many candidates. A few candidates did not read the question and adopted a 2-tailed test.

## Question

The random variable *X* is normally distributed with unknown mean \(\mu \) and unknown variance \({\sigma ^2}\). A random sample of 20 observations on *X* gave the following results.

\[\sum {x = 280,{\text{ }}\sum {{x^2} = 3977.57} } \]

Find unbiased estimates of \(\mu \) and \({\sigma ^2}\).

Determine a 95 % confidence interval for \(\mu \).

Given the hypotheses

\[{{\text{H}}_0}:\mu = 15;{\text{ }}{{\text{H}}_1}:\mu \ne 15,\]

find the *p*-value of the above results and state your conclusion at the 1 % significance level.

**Answer/Explanation**

## Markscheme

\(\bar x = 14\) *A1*

\(s_{n – 1}^2 = \frac{{3977.57}}{{19}} – \frac{{{{280}^2}}}{{380}}\) *(M1)*

\( = 3.03\) *A1*

*[3 marks]*

**Note:** Accept any notation for these estimates including \(\mu \) and \({\sigma ^2}\).

**Note:** Award **M0A0** for division by 20.

the 95% confidence limits are

\(\bar x \pm t\sqrt {\frac{{s_{n – 1}^2}}{n}} \) *(M1)*

**Note:** Award **M0** for use of *z*.

*ie*, \(14 \pm 2.093\sqrt {\frac{{3.03}}{{20}}} \) *(A1)*

**Note:** **FT** their mean and variance from (a).

giving [13.2, 14.8] *A1*

**Note:** Accept any answers which round to 13.2 and 14.8.

*[3 marks]*
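The estimates and the interval can be reproduced from the summary statistics alone. The sketch below assumes the tabulated value 2.093 for 19 degrees of freedom, as used in the markscheme; the variable names are illustrative.

```python
import math

# Summary statistics from the question
n, sum_x, sum_x2 = 20, 280.0, 3977.57
t_975 = 2.093   # two-sided 95% t value for 19 d.f. (markscheme)

x_bar = sum_x / n
# Unbiased variance estimate: divide by n - 1, not n
s2 = sum_x2 / (n - 1) - sum_x ** 2 / (n * (n - 1))

# Confidence limits: x_bar +/- t * sqrt(s^2 / n)
half_width = t_975 * math.sqrt(s2 / n)
ci = (x_bar - half_width, x_bar + half_width)

print(x_bar, round(s2, 2), [round(limit, 1) for limit in ci])
```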

Use of t-statistic \(\left( { = \frac{{14 – 15}}{{\sqrt {\frac{{3.03}}{{20}}} }}} \right)\) *(M1)*

**Note:** **FT** their mean and variance from (a).

**Note:** Award **M0** for use of *z*.

**Note:** Accept \(\frac{{15 – 14}}{{\sqrt {\frac{{3.03}}{{20}}} }}\).

\( = – 2.569 \ldots \) *(A1)*

**Note:** Accept \(2.569 \ldots \)

\(p{\text{ – value}} = 0.009392 \ldots \times 2 = 0.0188\) *A1*

**Note:** Accept any answer that rounds to 0.019.

**Note:** Award **(M1)(A1)A0** for any answer that rounds to 0.0094.

insufficient evidence to reject \({{\text{H}}_0}\) (or equivalent, *eg* accept \({{\text{H}}_0}\) or reject \({{\text{H}}_1}\)) *R1*

**Note:** **FT** on their *p*-value.

*[4 marks]*

## Examiners report

In (a), most candidates estimated the mean correctly although many candidates failed to obtain a correct unbiased estimate for the variance. The most common error was to divide \(\sum {{x^2}} \) by \(20\) instead of \(19\). For some candidates, this was not a costly error since we followed through their variance into (b) and (c).

In (b) and (c), since the variance was estimated, the confidence interval and test should have been carried out using the t-distribution. It was extremely disappointing to note that many candidates found a Z-interval and used a Z-test and no marks were awarded for doing this. Candidates should be aware that having to estimate the variance is a signpost pointing towards the t-distribution.

## Question

(a) Consider the random variable \(X\) for which \({\text{E}}(X) = a\lambda + b\), where \(a\) and \(b\) are constants and \(\lambda \) is a parameter.

Show that \(\frac{{X – b}}{a}\) is an unbiased estimator for \(\lambda \).

(b) The continuous random variable *Y* has probability density function

\(f(y) = \begin{cases} \frac{2}{9}(3 + y - \lambda ), & \text{for } \lambda - 3 \le y \le \lambda \\ 0, & \text{otherwise} \end{cases}\)

where \(\lambda \) is a parameter.

(i) Verify that \(f(y)\) is a probability density function for all values of \(\lambda \).

(ii) Determine \({\text{E}}(Y)\).

(iii) Write down an unbiased estimator for \(\lambda \).

**Answer/Explanation**

## Markscheme

(a) \({\text{E}}\left( {\frac{{X – b}}{a}} \right) = \frac{{a\lambda + b – b}}{a}\) *M1A1*

\( = \lambda \) *A1*

(Therefore \(\frac{{X – b}}{a}\) is an unbiased estimator for \(\lambda \)) *AG*

*[3 marks]*

(b) (i) \(f(y) \geqslant 0\) *R1*

**Note:** Only award **R1** if this statement is made explicitly.

recognition or showing that integral of *f* is 1 (seen anywhere) *R1*

**EITHER**

\(\int_{\lambda – 3}^\lambda {\frac{2}{9}(3 + y – \lambda ){\text{d}}y} \) **M1**

\( = \frac{2}{9}\left[ {(3 – \lambda )y + \frac{1}{2}{y^2}} \right]_{\lambda – 3}^\lambda \) **A1**

\( = \frac{2}{9}\left( {\lambda (3 – \lambda ) + \frac{1}{2}{\lambda ^2} – (3 – \lambda )(\lambda – 3) – \frac{1}{2}{{(\lambda – 3)}^2}} \right)\) or equivalent **A1**

\( = 1\)

**OR**

the graph of the probability density is a triangle with base length 3 and height \(\frac{2}{3}\) *M1A1*

its area is therefore \(\frac{1}{2} \times 3 \times \frac{2}{3}\) **A1**

\( = 1\)

(ii) \({\text{E}}(Y) = \int_{\lambda – 3}^\lambda {\frac{2}{9}y(3 + y – \lambda ){\text{d}}y} \) **M1**

\( = \frac{2}{9}\left[ {(3 – \lambda )\frac{1}{2}{y^2} + \frac{1}{3}{y^3}} \right]_{\lambda – 3}^\lambda \) **A1**

\( = \frac{2}{9}\left( {(3 – \lambda )\frac{1}{2}\left( {{\lambda ^2} – {{(\lambda – 3)}^2}} \right) + \frac{1}{3}\left( {{\lambda ^3} – {{(\lambda – 3)}^3}} \right)} \right)\) **M1**

\( = \lambda – 1\) **A1A1**

**Note:** Award 3 marks for noting that the mean is \(\frac{2}{3}\) of the way along the base and then **A1A1** for \(\lambda - 1\).

**Note:** Award **A1** for \(\lambda \) and **A1** for -1.

(iii) unbiased estimator: \(Y + 1\) *A1*

**Note:** Accept \(\bar Y + 1\).

Follow through their \({\text{E}}(Y)\) if linear.

*[11 marks]*

*Total [14 marks]*
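Parts (b)(i) and (b)(ii) can be sanity-checked numerically for a particular value of \(\lambda \). The sketch below uses a simple midpoint rule; the choice \(\lambda = 5\) and the helper names are assumptions made purely for illustration, and a numerical check is of course no substitute for the algebraic argument the question asks for.

```python
def f(y, lam):
    """Density from part (b): (2/9)(3 + y - lam) on [lam - 3, lam], 0 otherwise."""
    return 2 / 9 * (3 + y - lam) if lam - 3 <= y <= lam else 0.0


def midpoint_integral(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


lam = 5.0  # arbitrary illustrative value of the parameter

# (i) total probability should be 1
total = midpoint_integral(lambda y: f(y, lam), lam - 3, lam)

# (ii) E(Y) should equal lam - 1, so Y + 1 is unbiased for lam
mean_y = midpoint_integral(lambda y: y * f(y, lam), lam - 3, lam)

print(round(total, 6), round(mean_y, 6))
```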

## Examiners report

## Question

If \(X\) is a random variable that follows a Poisson distribution with mean \(\lambda > 0\) then the probability generating function of \(X\) is \(G(t) = {e^{\lambda (t – 1)}}\).

(i) Prove that \({\text{E}}(X) = \lambda \).

(ii) Prove that \({\text{Var}}(X) = \lambda \).

\(Y\) is a random variable, independent of \(X\), that also follows a Poisson distribution with mean \(\lambda \).

If \(S = 2X – Y\) find

(i) \({\text{E}}(S)\);

(ii) \({\text{Var}}(S)\).

Let \(T = \frac{X}{2} + \frac{Y}{2}\).

(i) Show that \(T\) is an unbiased estimator for \(\lambda \).

(ii) Show that \(T\) is a more efficient unbiased estimator of \(\lambda \) than \(S\).

Could either \(S\) or \(T\) model a Poisson distribution? Justify your answer.

By consideration of the probability generating function, \({G_{X + Y}}(t)\), of \(X + Y\), prove that \(X + Y\) follows a Poisson distribution with mean \(2\lambda \).

Find

(i) \({G_{X + Y}}(1)\);

(ii) \({G_{X + Y}}( – 1)\).

Hence find the probability that \(X + Y\) is an even number.

**Answer/Explanation**

## Markscheme

(i) \(G'(t) = \lambda {e^{\lambda (t – 1)}}\) *A1*

\({\text{E}}(X) = G'(1)\) *M1*

\( = \lambda \) *AG*

(ii) \(G''(t) = {\lambda ^2}{e^{\lambda (t - 1)}}\) *M1*

\( \Rightarrow G''(1) = {\lambda ^2}\) *(A1)*

\({\text{Var}}(X) = G''(1) + G'(1) - {\left( {G'(1)} \right)^2}\) *(M1)*

\( = {\lambda ^2} + \lambda – {\lambda ^2}\) *A1*

\( = \lambda \) *AG*

*[6 marks]*

(i) \({\text{E}}(S) = 2\lambda – \lambda = \lambda \) *A1*

(ii) \({\text{Var}}(S) = 4\lambda + \lambda = 5\lambda \) *(A1)A1*

**Note:** First **A1** can be awarded for either \(4\lambda \) or \(\lambda \).

**[3 marks]**

(i) \({\text{E}}(T) = \frac{\lambda }{2} + \frac{\lambda }{2} = \lambda \;\;\;\)(so \(T\) is an unbiased estimator) *A1*

(ii) \({\text{Var}}(T) = \frac{1}{4}\lambda + \frac{1}{4}\lambda = \frac{1}{2}\lambda \) *A1*

this is less than \({\text{Var}}(S)\), therefore \(T\) is the more efficient estimator *R1AG*

**Note:** Follow through their variances from (b)(ii) and (c)(ii).

**[3 marks]**

no, mean does not equal the variance *R1*

*[1 mark]*

\({G_{X + Y}}(t) = {e^{\lambda (t – 1)}} \times {e^{\lambda (t – 1)}} = {e^{2\lambda (t – 1)}}\) *M1A1*

which is the probability generating function for a Poisson with a mean of \(2\lambda \) *R1AG*

*[3 marks]*

(i) \({G_{X + Y}}(1) = 1\) *A1*

(ii) \({G_{X + Y}}( – 1) = {e^{ – 4\lambda }}\) *A1*

*[2 marks]*

\({G_{X + Y}}(1) = p(0) + p(1) + p(2) + p(3) \ldots \)

\({G_{X + Y}}( – 1) = p(0) – p(1) + p(2) – p(3) \ldots \)

so \({\text{2P(even)}} = {G_{X + Y}}(1) + {G_{X + Y}}( – 1)\) *(M1)(A1)*

\({\text{P(even)}} = \frac{1}{2}(1 + {e^{ – 4\lambda }})\) *A1*

*[3 marks]*

*Total [21 marks]*
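The closed form for P(even) can be compared with a direct summation of Poisson probabilities. This is a sketch only: the value \(\lambda = 1.3\) and the truncation at 60 terms are illustrative assumptions, chosen so that the neglected tail is negligible.

```python
import math


def prob_even_poisson(mean, terms=60):
    """Sum the Poisson(mean) probabilities of 0, 2, 4, ... up to terms - 2."""
    return sum(math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(0, terms, 2))


lam = 1.3  # illustrative value of the common mean of X and Y

# X + Y is Poisson with mean 2 * lam, so sum its even-valued probabilities
direct = prob_even_poisson(2 * lam)

# Result from the markscheme: (G(1) + G(-1)) / 2 = (1 + e^{-4 lam}) / 2
closed_form = 0.5 * (1 + math.exp(-4 * lam))

print(direct, closed_form)
```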

## Examiners report

Solutions to the different parts of this question proved to be extremely variable in quality with some parts well answered by the majority of the candidates and other parts accessible to only a few candidates. Part (a) was well answered in general although the presentation was sometimes poor with some candidates doing the differentiation of \(G(t)\) and the substitution of \(t = 1\) simultaneously.

Part (b) was well answered in general, the most common error being to state that \({\text{Var}}(2X – Y) = {\text{Var}}(2X) – {\text{Var}}(Y)\).

Parts (c) and (d) were well answered by the majority of candidates.

Solutions to (e), however, were extremely disappointing with few candidates giving correct solutions. A common incorrect solution was the following:

\(\;\;\;{G_{X + Y}}(t) = {G_X}(t){G_Y}(t)\)

Differentiating,

\(\;\;\;{G'_{X + Y}}(t) = {G'_X}(t){G_Y}(t) + {G_X}(t){G'_Y}(t)\)

\(\;\;\;{\text{E}}(X + Y) = {G'_{X + Y}}(1) = {\text{E}}(X) \times 1 + {\text{E}}(Y) \times 1 = 2\lambda \)

This is correct mathematics but it does not show that \(X + Y\) is Poisson and it was given no credit. Even the majority of candidates who showed that \({G_{X + Y}}(t) = {{\text{e}}^{2\lambda (t – 1)}}\) failed to state that this result proved that \(X + Y\) is Poisson and they usually differentiated this function to show that \({\text{E}}(X + Y) = 2\lambda \).

In (f), most candidates stated that \({G_{X + Y}}(1) = 1\) even if they were unable to determine \({G_{X + Y}}(t)\) but many candidates were unable to evaluate \({G_{X + Y}}( – 1)\). Very few correct solutions were seen to (g) even if the candidates correctly evaluated \({G_{X + Y}}(1)\) and \({G_{X + Y}}( – 1)\).

[N/A]

## Question

A random variable \(X\) has a population mean \(\mu \).

Explain briefly the meaning of

(i) an estimator of \(\mu \);

(ii) an unbiased estimator of \(\mu \).

A random sample \({X_1},{\text{ }}{X_2},{\text{ }}{X_3}\) of three independent observations is taken from the distribution of \(X\).

An unbiased estimator of \(\mu ,{\text{ }}\mu \ne 0\), is given by \(U = \alpha {X_1} + \beta {X_2} + (\alpha – \beta ){X_3}\),

where \(\alpha ,{\text{ }}\beta \in \mathbb{R}\).

(i) Find the value of \(\alpha \).

(ii) Show that \({\text{Var}}(U) = {\sigma ^2}\left( {2{\beta ^2} – \beta + \frac{1}{2}} \right)\) where \({\sigma ^2} = {\text{Var}}(X)\).

(iii) Find the value of \(\beta \) which gives the most efficient estimator of \(\mu \) of this form.

(iv) Write down an expression for this estimator and determine its variance.

(v) Write down a more efficient estimator of \(\mu \) than the one found in (iv), justifying your answer.

**Answer/Explanation**

## Markscheme

(i) an estimator \(T\) is a formula (or statistic) that can be applied to the values in any sample, taken from \(X\) *A1*

to estimate the value of \(\mu \) *A1*

(ii) an estimator is unbiased if \({\text{E}}(T) = \mu \) *A1*

*[3 marks]*

(i) using linearity and the definition of an unbiased estimator *M1*

\(\mu = \alpha \mu + \beta \mu + (\alpha – \beta )\mu \) *A1*

obtain \(\alpha = \frac{1}{2}\) *A1*

(ii) attempt to compute \({\text{Var}}(U)\) using correct formula *M1*

\({\text{Var}}(U) = \frac{1}{4}{\sigma ^2} + {\beta ^2}{\sigma ^2} + {\left( {\frac{1}{2} – \beta } \right)^2}{\sigma ^2}\) *A1*

\({\text{Var}}(U) = {\sigma ^2}\left( {2{\beta ^2} – \beta + \frac{1}{2}} \right)\) *AG*

(iii) attempt to minimise quadratic in \(\beta \) (or equivalent) *(M1)*

\(\beta = \frac{1}{4}\) *A1*

(iv) \(U = \frac{1}{2}{X_1} + \frac{1}{4}{X_2} + \frac{1}{4}{X_3}\) *A1*

\({\text{Var}}(U) = \frac{3}{8}{\sigma ^2}\) *A1*

(v) \(\frac{1}{3}{X_1} + \frac{1}{3}{X_2} + \frac{1}{3}{X_3}\) *A1*

\({\text{Var}}\left( {\frac{1}{3}{X_1} + \frac{1}{3}{X_2} + \frac{1}{3}{X_3}} \right) = \frac{3}{9}{\sigma ^2}\) *A1*

\( < {\text{Var}}(U)\) *R1*

**Note:** Accept \(\sum\limits_{i = 1}^3 {{\lambda _i}{X_i}} \) if \(\sum\limits_{i = 1}^3 {{\lambda _i} = 1} \) and \(\sum\limits_{i = 1}^3 {\lambda _i^2 < \frac{3}{8}} \) and follow through to the variance if this is the case.

**[12 marks]**

**Total [15 marks]**
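Parts (iii) to (v) can be illustrated with exact rational arithmetic. The sketch below works with the coefficient of \({\sigma ^2}\) only and scans \(\beta \) over a grid of step 0.01; the grid search and the helper names are illustrative assumptions, not the method expected in an answer, which would minimise the quadratic directly.

```python
from fractions import Fraction


def var_coeff(weights):
    """Coefficient of sigma^2 in Var(sum of w_i * X_i) for independent X_i."""
    return sum(w * w for w in weights)


alpha = Fraction(1, 2)  # value forced by unbiasedness in part (i)


def u_weights(beta):
    """Weights of U = alpha*X1 + beta*X2 + (alpha - beta)*X3."""
    return (alpha, beta, alpha - beta)


# Scan beta from -1 to 2 in steps of 1/100 and keep the smallest variance
grid = [Fraction(k, 100) for k in range(-100, 201)]
best_beta = min(grid, key=lambda b: var_coeff(u_weights(b)))
best_var = var_coeff(u_weights(best_beta))

# Sample mean, the estimator suggested in part (v)
mean_var = var_coeff([Fraction(1, 3)] * 3)

print(best_beta, best_var, mean_var)
```

The sample mean is outside the family \(U\) because its coefficient of \({X_1}\) is not \(\frac{1}{2}\), which is why it can have a smaller variance.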

## Examiners report

In general, solutions to (a) were extremely disappointing with the vast majority unable to give correct explanations of estimators and unbiased estimators. Solutions to (b) were reasonably good in general, indicating perhaps that the poor explanations in (a) were due to an inability to explain what they know rather than a lack of understanding.

## Question

A biased cubical die has its faces labelled \(1,{\rm{ }}2,{\rm{ }}3,{\rm{ }}4,{\rm{ }}5\) and \(6\). The probability of rolling a \(6\) is \(p\), with equal probabilities for the other scores.

The die is rolled once, and the score \({X_1}\) is noted.

(i) Find \({\text{E}}({X_1})\).

(ii) Hence obtain an unbiased estimator for \(p\).

The die is rolled a second time, and the score \({X_2}\) is noted.

(i) Show that \(k({X_1} – 3) + \left( {\frac{1}{3} – k} \right)({X_2} – 3)\) is also an unbiased estimator for \(p\) for all values of \(k \in \mathbb{R}\).

(ii) Find the value for \(k\), which maximizes the efficiency of this estimator.

**Answer/Explanation**

## Markscheme

let \(X\) denote the score on the die

(i) \({\text{P}}(X = x) = \left\{ {\begin{array}{*{20}{c}} {\frac{{1 – p}}{5},}&{x = 1,{\text{ 2}},{\text{ 3}},{\text{ 4}},{\text{ 5}}} \\ {p,}&{x = 6} \end{array}} \right.\) *(M1)*

\(E({X_1}) = (1 + 2 + 3 + 4 + 5)\frac{{1 – p}}{5} + 6p\) *M1*

\( = 3 + 3p\) *A1*

(ii) so an unbiased estimator for \(p\) would be \(\frac{{{X_1} – 3}}{3}\) *A1*

*[4 marks]*

(i) \(E\left( {k({X_1} – 3) + \left( {\frac{1}{3} – k} \right)({X_2} – 3)} \right)\) *M1*

\( = kE({X_1} – 3) + \left( {\frac{1}{3} – k} \right)E({X_2} – 3)\) *M1*

\( = k(3p) + \left( {\frac{1}{3} – k} \right)(3p)\) *A1*

any correct expression involving just \(k\) and \(p\)

\( = p\) *AG*

hence \(k({X_1} – 3) + \left( {\frac{1}{3} – k} \right)({X_2} – 3)\) is an unbiased estimator of \(p\)

(ii) \({\text{Var}}\left( {k({X_1} – 3) + \left( {\frac{1}{3} – k} \right)({X_2} – 3)} \right)\) *M1*

\( = {k^2}{\text{Var}}({X_1} – 3) + {\left( {\frac{1}{3} – k} \right)^2}{\text{Var}}({X_2} – 3)\) *A1*

\( = \left( {{k^2} + {{\left( {\frac{1}{3} – k} \right)}^2}} \right){\sigma ^2}\) (where \({\sigma ^2}\) denotes \({\text{Var}}(X)\))

valid attempt to minimise the variance *M1*

\(k = \frac{1}{6}\) *A1*

**Note:** Accept an argument which states that the most efficient estimator is the one having equal coefficients of \({X_1}\) and \({X_2}\).

**[7 marks]**

**Total [11 marks]**
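The expectation and the optimal \(k\) can be checked with exact fractions. This is a minimal sketch; the value \(p = \frac{3}{10}\) and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction


def expected_score(p):
    """E(X) for the biased die: P(6) = p, other faces (1 - p)/5 each."""
    probs = {x: (1 - p) / 5 for x in range(1, 6)}
    probs[6] = p
    return sum(x * q for x, q in probs.items())


def var_factor(k):
    """Multiplier of sigma^2 in Var(k(X1 - 3) + (1/3 - k)(X2 - 3))."""
    return k ** 2 + (Fraction(1, 3) - k) ** 2


p = Fraction(3, 10)  # illustrative value of the unknown parameter

e_x = expected_score(p)          # should equal 3 + 3p
estimator_mean = (e_x - 3) / 3   # expectation of (X1 - 3)/3

# Compare the claimed optimum k = 1/6 with a few neighbouring values
candidates = [Fraction(k, 12) for k in range(0, 5)]
best_k = min(candidates, key=var_factor)

print(e_x, estimator_mean, best_k, var_factor(best_k))
```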

## Examiners report

[N/A]

[N/A]

## Question

The random variable *X* has a binomial distribution with parameters \(n\) and \(p\).

Let \(U = nP\left( {1 – P} \right)\).

Show that \(P = \frac{X}{n}\) is an unbiased estimator of \(p\).

Show that \({\text{E}}\left( U \right) = \left( {n – 1} \right)p\left( {1 – p} \right)\).

Hence write down an unbiased estimator of Var(*X*).

**Answer/Explanation**

## Markscheme

\({\text{E}}\left( P \right) = {\text{E}}\left( {\frac{X}{n}} \right) = \frac{1}{n}{\text{E}}\left( X \right)\) **M1**

\( = \frac{1}{n}\left( {np} \right) = p\) **A1**

so *P* is an unbiased estimator of \(p\) **AG**

**[2 marks]**

\({\text{E}}\left( {nP\left( {1 – P} \right)} \right) = {\text{E}}\left( {n\left( {\frac{X}{n}} \right)\left( {1 – \frac{X}{n}} \right)} \right)\)

\( = {\text{E}}\left( X \right) - \frac{1}{n}{\text{E}}\left( {{X^2}} \right)\) **M1A1**

use of \({\text{E}}\left( {{X^2}} \right) = {\text{Var}}\left( X \right) + {\left( {{\text{E}}\left( X \right)} \right)^2}\) **M1**

**Note:** Allow candidates to work with *P* rather than *X* for the above 3 marks.

\( = np – \frac{1}{n}\left( {np\left( {1 – p} \right) + {{\left( {np} \right)}^2}} \right)\) **A1**

\( = np – p\left( {1 – p} \right) – n{p^2}\)

\( = np\left( {1 – p} \right) – p\left( {1 – p} \right)\) **A1**

**Note:** Award *A1* for the factor of \(\left( {1 - p} \right)\).

\( = \left( {n – 1} \right)p\left( {1 – p} \right)\) **AG**

**[5 marks]**

an unbiased estimator is \(\frac{{{n^2}P\left( {1 – P} \right)}}{{n – 1}}\left( { = \frac{{nU}}{{n – 1}}} \right)\) **A1**

**[1 mark]**
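The identity \({\text{E}}(U) = (n - 1)p(1 - p)\) can be verified exactly for particular values by summing over the binomial distribution. The sketch below assumes \(n = 8\) and \(p = \frac{1}{4}\) purely for illustration.

```python
from fractions import Fraction
from math import comb


def expected_u(n, p):
    """Exact E(U) with U = n * P * (1 - P), P = X/n and X ~ B(n, p)."""
    total = Fraction(0)
    for x in range(n + 1):
        prob = comb(n, x) * p ** x * (1 - p) ** (n - x)  # P(X = x)
        p_hat = Fraction(x, n)
        total += prob * n * p_hat * (1 - p_hat)
    return total


n, p = 8, Fraction(1, 4)  # illustrative parameter values

e_u = expected_u(n, p)

# Rescaling U by n / (n - 1) should give an unbiased estimator of Var(X) = np(1 - p)
rescaled_mean = n * e_u / (n - 1)

print(e_u, rescaled_mean)
```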

## Examiners report

[N/A]

[N/A]

[N/A]

## Question

A shopper buys 12 apples from a market stall and weighs them with the following results (in grams).

117, 124, 129, 118, 124, 116, 121, 126, 118, 121, 122, 129

You may assume that this is a random sample from a normal distribution with mean \(\mu \) and variance \({\sigma ^2}\).

Determine unbiased estimates of \(\mu \) and \({\sigma ^2}\).

Determine a 99 % confidence interval for \(\mu \) .

The stallholder claims that the mean weight of apples is 125 grams but the shopper claims that the mean is less than this.

(i) State suitable hypotheses for testing these claims.

(ii) Calculate the *p*-value of the above sample.

(iii) Giving a reason, state which claim is supported by your *p*-value using a 5 % significance level.

**Answer/Explanation**

## Markscheme

unbiased estimate of \(\mu = 122\) *A1*

unbiased estimate of \({\sigma ^2} = 4.4406{ \ldots ^2} = 19.7\) *(M1)A1*

**Note:** Award **(M1)A0** for 4.44.

*[3 marks]*

the 99 % confidence interval for \(\mu \) is [118, 126] *A1A1*

*[2 marks]*

(i) \({{\text{H}}_0}:\mu = 125;{\text{ }}{{\text{H}}_1}:\mu < 125\) *A1*

(ii) *p*-value = 0.0220 *A2*

(iii) the shopper’s claim is supported because \(0.0220 < 0.05\) *A1R1*

*[5 marks]*
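The estimates and the confidence interval can be reproduced without a GDC. In the sketch below the value 3.106 (two-sided 99 %, 11 degrees of freedom) is an assumption taken from standard t tables, since the markscheme does not quote it; the *p*-value itself is not computed because it needs the t distribution function.

```python
import math
from statistics import mean, variance

# Apple weights in grams, from the question
apples = [117, 124, 129, 118, 124, 116, 121, 126, 118, 121, 122, 129]
t_995 = 3.106   # assumed table value: two-sided 99%, 11 d.f.
mu_0 = 125      # stallholder's claimed mean

x_bar = mean(apples)      # unbiased estimate of mu
s2 = variance(apples)     # unbiased estimate of sigma^2 (divisor n - 1)
se = math.sqrt(s2 / len(apples))

# 99% confidence interval for mu
ci = (x_bar - t_995 * se, x_bar + t_995 * se)

# Test statistic for H0: mu = 125 against H1: mu < 125
t_stat = (x_bar - mu_0) / se

print(round(x_bar, 1), round(s2, 1), [round(limit) for limit in ci])
```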

## Examiners report

[N/A]

[N/A]

[N/A]

## Question

The discrete random variable *X* has the following probability distribution, where \(0 < \theta < \frac{1}{3}\).

\({\text{P}}(X = 1) = \theta ,{\text{ P}}(X = 2) = 2\theta ,{\text{ P}}(X = 3) = 1 - 3\theta \)

Determine \({\text{E}}(X)\) and show that \({\text{Var}}(X) = 6\theta – 16{\theta ^2}\).

In order to estimate \(\theta \), a random sample of *n* observations is obtained from the distribution of *X* .

(i) Given that \({\bar X}\) denotes the mean of this sample, show that

\[{{\hat \theta }_1} = \frac{{3 – \bar X}}{4}\]

is an unbiased estimator for \(\theta \) and write down an expression for the variance of \({{\hat \theta }_1}\) in terms of *n* and \(\theta \).

(ii) Let *Y* denote the number of observations that are equal to 1 in the sample. Show that *Y* has the binomial distribution \({\text{B}}(n,{\text{ }}\theta )\) and deduce that \({{\hat \theta }_2} = \frac{Y}{n}\) is another unbiased estimator for \(\theta \). Obtain an expression for the variance of \({{\hat \theta }_2}\).

(iii) Show that \({\text{Var}}({{\hat \theta }_1}) < {\text{Var}}({{\hat \theta }_2})\) and state, with a reason, which is the more efficient estimator, \({{\hat \theta }_1}\) or \({{\hat \theta }_2}\).

**Answer/Explanation**

## Markscheme

\({\text{E}}(X) = 1 \times \theta + 2 \times 2\theta + 3(1 – 3\theta ) = 3 – 4\theta \) *M1A1*

\({\text{Var}}(X) = 1 \times \theta + 4 \times 2\theta + 9(1 – 3\theta ) – {(3 – 4\theta )^2}\) *M1A1*

\( = 6\theta – 16{\theta ^2}\) *AG*

*[4 marks]*

(i) \({\text{E}}({\hat \theta _1}) = \frac{{3 – {\text{E}}(\bar X)}}{4} = \frac{{3 – (3 – 4\theta )}}{4} = \theta \) *M1A1*

so \({\hat \theta _1}\) is an unbiased estimator of \(\theta \) *AG*

\({\text{Var}}({{\hat \theta }_1}) = \frac{{6\theta – 16{\theta ^2}}}{{16n}}\) *A1*

(ii) each of the *n* observed values has a probability \(\theta \) of having the value 1 *R1*

so \(Y \sim {\text{B}}(n,{\text{ }}\theta )\) *AG*

\({\text{E}}({{\hat \theta }_2}) = \frac{{{\text{E}}(Y)}}{n} = \frac{{n\theta }}{n} = \theta \) *A1*

\({\text{Var}}({{\hat \theta }_2}) = \frac{{n\theta (1 – \theta )}}{{{n^2}}} = \frac{{\theta (1 – \theta )}}{n}\) *M1A1*

(iii) \({\text{Var}}({{\hat \theta }_1}) – {\text{Var}}({{\hat \theta }_2}) = \frac{{6\theta – 16{\theta ^2} – 16\theta + 16{\theta ^2}}}{{16n}}\) *M1*

\( = \frac{{ – 10\theta }}{{16n}} < 0\) *A1*

\({{\hat \theta }_1}\) is the more efficient estimator since it has the smaller variance *R1*

*[10 marks]*
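The comparison in (iii) can be checked exactly for sample values of \(\theta \) and *n*. This is a sketch with illustrative function names; the particular values tried are assumptions and any \(0 < \theta < \frac{1}{3}\) would do.

```python
from fractions import Fraction


def var_theta1(theta, n):
    """Var of (3 - Xbar)/4: Var(X) / (16 n) with Var(X) = 6*theta - 16*theta^2."""
    return (6 * theta - 16 * theta ** 2) / (16 * n)


def var_theta2(theta, n):
    """Var of Y/n with Y ~ B(n, theta)."""
    return theta * (1 - theta) / n


n = 25
thetas = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]

# Difference should equal -10*theta / (16 n), which is negative
differences = [var_theta1(t, n) - var_theta2(t, n) for t in thetas]
expected = [Fraction(-10) * t / (16 * n) for t in thetas]

print(differences)
```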

## Examiners report

[N/A]

[N/A]