Question 1 [Maximum mark: 18]
As part of his mathematics exploration about classic books, Jason investigated the time taken by students in his school to read the book The Old Man and the Sea. He collected his data by
stopping and asking students in the school corridor, until he reached his target of 10 students from each of the literature classes in his school.
State which of the two sampling methods, systematic or quota, Jason has used. [1]
Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this book.
Write down the median time to read the book. [1]
Calculate the interquartile range. [2]
Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.
Determine whether Jason is correct. Support your reasoning. [4]
For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea ( x ), measured in hours, and paired this with their percentage score on the final exam ( y ).
These data are represented on the scatter diagram.
Describe the correlation. [1]
Jason correctly calculates the equation of the regression line y on x for these students to be
y = -1.54x + 98.8 .
He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1.5 hours.
Find the percentage score calculated by Jason. [2]
State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer. [2]
Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.
Jason intends to analyse the data using Spearman’s rank correlation coefficient, rs .
Copy and complete the information in the following table. [2]
(i) Calculate the value of rs .
(ii) Interpret your result. [3]
Answer/Explanation
(a) Quota sampling
(b) 10 (hours)
(c) 15 – 7 = 8
(d) indication of a valid attempt to find the upper fence
15 + 1.5 × 8 = 27
25 < 27 (accept equivalent answer in words)
Jason is correct
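The fence test in part (d) can be checked numerically. The following is a minimal sketch in Python, assuming the quartiles read from the box and whisker diagram are Q1 = 7 and Q3 = 15 (the values used in part (c)); the variable names are illustrative only.

```python
# Quartiles as used in the answer to part (c); the diagram itself is not reproduced here.
q1 = 7
q3 = 15
iqr = q3 - q1                  # interquartile range

# Upper fence of the 1.5 x IQR rule: values above it count as outliers.
upper_fence = q3 + 1.5 * iqr

mackenzie = 25                 # Mackenzie's reading time in hours
is_outlier = mackenzie > upper_fence

print(upper_fence, is_outlier)
```

Since 25 lies below the upper fence of 27, the check agrees that Mackenzie's time is not an outlier.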
(e) “negative” seen
(f) correct substitution
y = −1.54 × 1.5 + 98.8
96.5 (%) (96.49)
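The substitution in part (f) can be reproduced with a short sketch. The function name `predicted_score` is hypothetical; the coefficients are those of Jason's regression line y = -1.54x + 98.8.

```python
def predicted_score(hours):
    """Evaluate the regression line y = -1.54x + 98.8 (y on x) at x = hours."""
    return -1.54 * hours + 98.8


# Estimated percentage score for a student who read the book in 1.5 hours.
estimate = predicted_score(1.5)
print(round(estimate, 2))
```

The computed value rounds to 96.5 (%) to three significant figures. The code only evaluates the line; whether the estimate is valid is a separate question, addressed in part (g).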
(g) not reliable: extrapolation OR outside the given range of the data
(h)
(i) (i) 0.714 (0.714285…)
(ii) EITHER
there is a (strong/moderate) positive association between the number of pages and the top 50 rating.
OR
there is a (strong/moderate) agreement between the rank order of number of pages and the rank order of top 50 rating.
OR
there is a (strong/moderate) positive (linear) correlation between the rank order of number of pages and the rank order of top 50 rating.
Question
The scores of the eight highest scoring countries in the 2019 Eurovision song contest are
shown in the following table.
(a) For this data, find
(i) the upper quartile.
(ii) the interquartile range. [4]
(b) Determine if the Netherlands’ score is an outlier for this data. Justify your answer. [3]
Chester is investigating the relationship between the highest-scoring countries’ Eurovision
score and their population size to determine whether population size can reasonably be
used to predict a country’s score.
The populations of the countries, to the nearest million, are shown in the table.
Chester finds that, for this data, the Pearson’s product moment correlation coefficient
is r = 0.249.
(c) State whether it would be appropriate for Chester to use the equation of a regression
line for y on x to predict a country’s Eurovision score. Justify your answer. [2]
Chester then decides to find the Spearman’s rank correlation coefficient for this data,
and creates a table of ranks.
(d) Write down the value of:
(i) a ,
(ii) b ,
(iii) c . [3]
(e) (i) Find the value of the Spearman’s rank correlation coefficient rs .
(ii) Interpret the value obtained for rs . [3]
(f) When calculating the ranks, Chester incorrectly read the Netherlands’ score as 478.
Explain why the value of the Spearman’s rank correlation rs does not change despite
this error. [1]
Answer/Explanation
3. (a) (i) \(\frac{370+472}{2}\) (M1)
Note: This (M1) can also be awarded for either a correct Q3 or a correct Q1
in part (a)(ii).
Q3 = 421 A1
(ii) their part (a)(i) – their Q1 (clearly stated) (M1)
IQR = (421 – 318 = ) 103 A1
[4 marks]
(b) (Q3 + 1.5 × IQR =) 421 + 1.5 × 103 (M1)
= 575.5
since 498<575.5 R1
Netherlands is not an outlier A1
Note: The R1 is dependent on the (M1). Do not award R0A1.
[3 marks]
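The outlier decision in part (b) can be sketched as a small helper that applies the 1.5 × IQR rule on both sides. The helper name `fences` is hypothetical; Q1 = 318, Q3 = 421 and the Netherlands' score of 498 are taken from the markscheme. The lower fence is not needed for the marks and is included only for completeness.

```python
def fences(q1, q3):
    """Return the (lower, upper) fences of the 1.5 x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Quartiles from part (a) of the markscheme.
lower, upper = fences(318, 421)

netherlands = 498
outlier = netherlands < lower or netherlands > upper

print(lower, upper, outlier)
```

The score of 498 falls between the two fences, which matches the conclusion that the Netherlands is not an outlier.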
(c) not appropriate (“no” is sufficient) A1
as r is too close to zero / too weak a correlation R1
[2 marks]
(d) (i) 6 A1
(ii) 4.5 A1
(iii) 4.5 A1
[3 marks]
(e) (i) rs = 0.683 (0.682646…) A2
(ii) EITHER
there is a (positive) association between the population size and
the score A1
OR
there is a (positive) linear correlation between the ranks of the population size
and the ranks of the scores (when compared with the PMCC of 0.249). A1
[3 marks]
(f) lowering the top score by 20 does not change its rank so rs is unchanged R1
Note: Accept “this would not alter the rank” or “Netherlands still top rank” or similar.
Condone any statement that clearly implies the ranks have not changed, for
example: “The Netherlands still has the highest score.”
[1 mark]
[Total 16 marks]
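Parts (d) to (f) rest on two ideas: tied values share the average of the ranks they occupy, and rs depends only on the ranks. The sketch below illustrates both, under stated assumptions. The tables from the question are not reproduced here, so the lists `x`, `y` and `y_changed` are toy values and not the Eurovision data. The helper names are hypothetical, and rs is computed as Pearson's coefficient applied to the ranks, which remains valid when ranks are tied.

```python
import math


def average_ranks(values):
    """Rank values with 1 = largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2   # mean of the ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson's product moment correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rank correlation: Pearson's r applied to the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Toy data (NOT the exam tables): lowering the top y value keeps it the top value.
x = [10, 20, 30, 40, 50]
y = [3, 1, 4, 2, 5]
y_changed = [3, 1, 4, 2, 4.5]

print(average_ranks([10, 20, 20, 30]))
print(spearman(x, y) == spearman(x, y_changed))
```

In the toy data the largest y value is lowered but remains the largest, so every rank, and therefore rs, is unchanged. This is the same reasoning as in part (f).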
Question
Daniel grows apples and chooses at random a sample of 100 apples from his harvest.
He measures the diameters of the apples to the nearest cm. The following table shows the distribution of the diameters.
Using your graphic display calculator, write down the value of
(i) the mean of the diameters in this sample;
(ii) the standard deviation of the diameters in this sample. [3]
Daniel assumes that the diameters of all of the apples from his harvest are normally distributed with a mean of 7 cm and a standard deviation of 1.2 cm. He classifies the apples according to their diameters as shown in the following table.
Calculate the percentage of small apples in Daniel’s harvest. [3]
Of the apples harvested, 5% are large apples.
Find the value of \(a\). [2]
Find the percentage of medium apples. [2]
This year, Daniel estimates that he will grow \({\text{100}}\,{\text{000}}\) apples.
Estimate the number of large apples that Daniel will grow this year. [2]
Answer/Explanation
Markscheme
(i) \(6.76{\text{ (cm)}}\) (G2)
Notes: Award (M1) for an attempt to use the formula for the mean with at least two rows from the table.
(ii) \(1.14{\text{ (cm)}}\;\;\;\left( {1.14122 \ldots {\text{ (cm)}}} \right)\) (G1)
\({\text{P}}({\text{diameter}} < 6.5) = 0.338\;\;\;(0.338461)\) (M1)(A1)
Notes: Award (M1) for attempting to use the normal distribution to find the probability or for correct region indicated on labelled diagram. Award (A1) for correct probability.
\(33.8(\% )\) (A1)(ft)(G3)
Notes: Award (A1)(ft) for converting their probability into a percentage.
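The probability for small apples can be reproduced without a graphic display calculator. This is a minimal sketch using `statistics.NormalDist` from the Python standard library, assuming (as the markscheme does) that small apples are those with diameter below 6.5 cm; the variable names are illustrative.

```python
from statistics import NormalDist

# Daniel's model: diameters ~ N(mean 7 cm, standard deviation 1.2 cm).
diameters = NormalDist(mu=7, sigma=1.2)

# P(diameter < 6.5), then converted to a percentage.
p_small = diameters.cdf(6.5)
percent_small = 100 * p_small

print(round(p_small, 6), round(percent_small, 1))
```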
\({\text{P}}({\text{diameter}} \geqslant a) = 0.05\) (M1)
Note: Award (M1) for attempting to use the normal distribution to find the probability or for correct region indicated on labelled diagram.
\(a = 8.97{\text{ (cm)}}\;\;\;(8.97382 \ldots )\) (A1)(G2)
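The value of \(a\) is the diameter with 5% of the distribution above it, that is, the 95th percentile. A minimal sketch, again using `statistics.NormalDist` and illustrative variable names:

```python
from statistics import NormalDist

# Daniel's model: diameters ~ N(mean 7 cm, standard deviation 1.2 cm).
diameters = NormalDist(mu=7, sigma=1.2)

# P(diameter >= a) = 0.05 is equivalent to P(diameter < a) = 0.95.
a = diameters.inv_cdf(0.95)

print(round(a, 2))
```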
\(100 – (5 + 33.8461 \ldots )\) (M1)
Note: Award (M1) for subtracting “\(5+\) their part (b)” from 100 or (M1) for attempting to use the normal distribution to find the probability \({\text{P}}\left( {6.5 \leqslant {\text{diameter}} < {\text{their part (c)}}} \right)\) or for correct region indicated on labelled diagram.
\( = 61.2(\% )\;\;\;\left( {61.1538 \ldots (\% )} \right)\) (A1)(ft)(G2)
Notes: Follow through from their answer to part (b). Percentage symbol is not required. Accept \(61.1(\%)\) (\(61.1209\ldots(\%)\)) if \(8.97\) used.
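The two accepted answers for medium apples come from two routes: subtracting the small and large percentages from 100, or integrating the model between 6.5 and the rounded value 8.97. The sketch below compares them, assuming the boundaries 6.5 cm and 8.97 cm used in the markscheme; variable names are illustrative.

```python
from statistics import NormalDist

# Daniel's model: diameters ~ N(mean 7 cm, standard deviation 1.2 cm).
diameters = NormalDist(mu=7, sigma=1.2)

p_small = diameters.cdf(6.5)   # small apples: diameter < 6.5
p_large = 0.05                 # given: 5% of apples are large

# Route 1: everything that is neither small nor large.
medium_by_subtraction = 100 * (1 - p_large - p_small)

# Route 2: area between 6.5 and the rounded boundary a = 8.97.
medium_by_region = 100 * (diameters.cdf(8.97) - p_small)

print(round(medium_by_subtraction, 1), round(medium_by_region, 1))
```

The small difference between the two routes comes only from rounding \(a\) to 8.97 before computing the region.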
\(100\,000 \times 0.05\) (M1)
Note: Award (M1) for multiplying by \(0.05\) (or \(5\%\)).
\( = 5000\) (A1)(G2)