IB DP Mathematical Studies 2.5 Measures of central tendency Paper 2

Question

The number of bottles of water sold at a railway station on each day is given in the following table.

Write down

(i)     the mean temperature;

(ii)    the standard deviation of the temperatures.[2]

a.

Write down the correlation coefficient, \(r\), for the variables \(n\) and \(T\).[1]

b.

Comment on your value for \(r\).[2]

c.

The equation of the line of regression for \(n\) on \(T\) is \(n = dT - 100\).

(i)     Write down the value of \(d\).

(ii)    Estimate how many bottles of water will be sold when the temperature is \({19.6^ \circ }\).[2]

d.

On a day when the temperature was \({36^ \circ }\) Peter calculates that \(314\) bottles would be sold. Give one reason why his answer might be unreliable.[1]

e.
Answer/Explanation

Markscheme

(i)     19.2     (G1)

(ii)    1.45     (G1)[2 marks]

a.

\(r = 0.942\)     (G1)[1 mark]

b.

Strong, positive correlation.     (A1)(ft)(A1)(ft)[2 marks]

c.

(i)     \(d = 11.5\)     (G1)

(ii)    \(n = 11.5 \times 19.6 - 100\)

\( = 125\) (accept \(126\))     (A1)(ft)

Note: Answer must be a whole number.[2 marks]

d.
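The substitution in part (d)(ii) above can be checked with a short Python sketch. The function name `predict_bottles` is hypothetical. The gradient 11.5 and intercept -100 are taken from the markscheme, and rounding to the nearest whole number is an assumption consistent with the note that the answer must be a whole number.

```python
def predict_bottles(temperature, gradient=11.5, intercept=-100.0):
    """Evaluate the regression line n = d*T - 100 from the markscheme."""
    return gradient * temperature + intercept


estimate = predict_bottles(19.6)
# Bottles are counted in whole numbers, so round the raw estimate.
whole_bottles = round(estimate)
print(whole_bottles)
```

The raw estimate is about 125.4, which is why the markscheme accepts either 125 or 126.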

It is unreliable to extrapolate outside the values given (outlier).     (R1)[1 mark]

e.

Question

In a mountain region there appears to be a relationship between the number of trees growing in the region and the depth of snow in winter. A set of 10 areas was chosen, and in each area the number of trees was counted and the depth of snow measured. The results are given in the table below.

In a study on \(100\) students there seemed to be a difference between males and females in their choice of favourite car colour. The results are given in the table below. A \(\chi^2\) test was conducted.

Use your graphic display calculator to find the mean number of trees.[1]

A, a, i.

Use your graphic display calculator to find the mean depth of snow.[1]

A, a, iii.

Use your graphic display calculator to find the standard deviation of the depth of snow.[1]

A, a, iv.

The covariance, \(S_{xy} = 188.5\).

Write down the product-moment correlation coefficient, r.[2]

A, b.

Write down the equation of the regression line of y on x.[2]

A, c.

If the number of trees in an area is 55, estimate the depth of snow.[2]

A, d.

Use the equation of the regression line to estimate the depth of snow in an area with 100 trees.[1]

A, e, i.

Decide whether the answer in (e)(i) is a valid estimate of the depth of snow in the area. Give a reason for your answer.[2]

A, e, ii.

Write down the total number of male students.[1]

B, a.

Show that the expected frequency for males, whose favourite car colour is blue, is 12.6.[2]

B, b.

The calculated value of \({\chi ^2}\) is \(1.367\) and the critical value of \({\chi ^2}\) is \(5.99\) at the \(5\%\) significance level.

Write down the null hypothesis for this test.[1]

B, c, i.

Write down the number of degrees of freedom.[1]

B, c, ii.

Determine whether the null hypothesis should be accepted at the \(5\%\) significance level. Give a reason for your answer.[2]

B, c, iv.
Answer/Explanation

Markscheme

50     (G1)[1 mark]

A, a, i.

30.5     (G1)[1 mark]

A, a, iii.

12.3     (G1)

Note: Award (A1)(ft) for 13.0 in (iv) but only if 17.7 seen in (a)(ii).[1 mark]

A, a, iv.

\(r = \frac{{188.5}}{{(16.79 \times 12.33)}}\)     (M1)

Note: Award (M1) for using their values in the correct formula.

= 0.911 (accept 0.912, 0.910)     (A1)(ft)(G2)[2 marks]

A, b.
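The calculation in part A (b) divides the covariance by the product of the two standard deviations. A minimal Python sketch of that step follows. The function name is hypothetical, and the values 188.5, 16.79 and 12.33 are the ones quoted in the markscheme.

```python
def correlation_from_covariance(s_xy, s_x, s_y):
    """Product-moment correlation r = S_xy / (s_x * s_y)."""
    return s_xy / (s_x * s_y)


# Covariance and standard deviations as quoted in the markscheme.
r = correlation_from_covariance(188.5, 16.79, 12.33)
print(round(r, 3))
```

Using more or fewer decimal places for the standard deviations shifts the third decimal place, which is presumably why 0.910 and 0.912 are also accepted.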

y = 0.669x − 2.95     (G1)(G1)

Note: Award (G1) for 0.669x, (G1) for −2.95. If the answer is not in the form of an equation, award at most (G1)(G0).[2 marks]

A, c.

Depth = 0.669 × 55 − 2.95     (M1)

= 33.8     (A1)(ft)(G2)(ft)

Note: Follow through from their (c) even if no working seen.[2 marks]

A, d.

64.0 (accept 63.95, 63.9)     (A1)(ft)(G1)(ft)

Note: Follow through from their (c) even if no working seen.[1 mark]

A, e, i.

It is not valid. It lies too far outside the values that are given. Or equivalent.     (A1)(R1)

Note: Do not award (A1)(R0).[2 marks]

A, e, ii.

28     (A1)[1 mark]

B, a.

\(\frac{{28 \times 45}}{{100}}\)  OR  \(\frac{{28}}{{100}} \times \frac{{45}}{{100}} \times 100\)     (M1)(A1)(ft)

Note: Award (M1) for correct formula, (A1) for correct substitution.

= 12.6     (AG)

Note: Do not award (A1) unless 12.6 seen.[2 marks]

B, b.
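The expected frequency in part B (b) is the row total times the column total, divided by the grand total. The Python sketch below illustrates this under assumptions: the function name is hypothetical, 28 is the male total from part B (a), and 45 is presumably the total for blue from the table, which is not reproduced here.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected cell count under independence in a contingency table."""
    return row_total * column_total / grand_total


# 28 males, 45 students (assumed) preferring blue, 100 students in total.
expected_blue_males = expected_frequency(28, 45, 100)
print(expected_blue_males)
```

The alternative form in the markscheme, (28/100) × (45/100) × 100, is the same product written as two proportions scaled by the sample size.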

the favourite car colour is independent of gender.     (A1)

Note: Accept there is no association between gender and favourite car colour.

Do not accept ‘not related’ or ‘not correlated’.[1 mark]

B, c, i.

\(2\)     (A1)[1 mark]

B, c, ii.

Accept the null hypothesis since \(1.367 < 5.991\)     (A1)(ft)(R1)

Note: Allow “Do not reject”. Follow through from their null hypothesis and their critical value.

Full credit for use of \(p\)-values from GDC [\(p = 0.505\)].

Do not award (A1)(R0). Award (R1) for valid comparison.[2 marks]

B, c, iv.

Question

Francesca is a chef in a restaurant. She cooks eight chickens and records their masses and cooking times. The mass m of each chicken, in kg, and its cooking time t, in minutes, are shown in the following table.

Draw a scatter diagram to show the relationship between the mass of a chicken and its cooking time. Use 2 cm to represent 0.5 kg on the horizontal axis and 1 cm to represent 10 minutes on the vertical axis.[4]

a.

Write down for this set of data

(i) the mean mass, \(\bar m\) ;

(ii) the mean cooking time, \(\bar t\) .[2]

b.

Label the point \({\text{M}}(\bar m,\bar t)\) on the scatter diagram.[1]

c.

Draw the line of best fit on the scatter diagram.[2]

d.

Using your line of best fit, estimate the cooking time, in minutes, for a 1.7 kg chicken.[2]

e.

Write down the Pearson’s product–moment correlation coefficient, \(r\).[2]

f.

Using your value for \(r\), comment on the correlation.[2]

g.

The cooking time of an additional 2.0 kg chicken is recorded. If the mass and cooking time of this chicken are included in the data, the correlation is weak.

(i) Explain how the cooking time of this additional chicken might differ from that of the other eight chickens.

(ii) Explain how a new line of best fit might differ from that drawn in part (d).[2]

h.
Answer/Explanation

Markscheme

(A1) for correct scales and labels (mass or m on the horizontal axis, time or t on the vertical axis)

(A3) for 7 or 8 correctly placed data points

(A2) for 5 or 6 correctly placed data points

(A1) for 3 or 4 correctly placed data points, (A0) otherwise.     (A4)

Note: If axes reversed award at most (A0)(A3)(ft). If graph paper not used, award at most (A1)(A0).

a.

(i) 1.91 (kg) (1.9125 kg)     (G1)

(ii) 83 (minutes)     (G1)

b.

Their mean point labelled.     (A1)(ft)

Note: Follow through from part (b). Accept any clear indication of the mean point. For example: circle around point, (m, t), M , etc.

c.

Line of best fit drawn on scatter diagram.     (A1)(ft)(A1)(ft)

Notes: Award (A1)(ft) for straight line through their mean point, (A1)(ft) for line of best fit with intercept 9(±2). The second (A1)(ft) can be awarded even if the line does not reach the t-axis but, if extended, the t-intercept is correct.

d.

75     (M1)(A1)(ft)(G2)

Notes: Accept 74.77 from the regression line equation. Award (M1) for indication of the use of their graph to get an estimate OR for correct substitution of 1.7 in the correct regression line equation t = 38.5m + 9.32.

e.

0.960 (0.959614…)     (G2)

Note: Award (G0)(G1)(ft) for 0.95, 0.959

f.

Strong and positive     (A1)(ft)(A1)(ft)

Note: Follow through from their correlation coefficient in part (f).

g.

(i) Cooking time is much larger (or smaller) than the other eight     (A1)

(ii) The gradient of the new line of best fit will be larger (or smaller)     (A1)

Note: Some acceptable explanations may include but are not limited to:

The line of best fit may be further away from the plotted points
It may be steeper than the previous line (as the mean would change)
The t-intercept of the new line is smaller (larger)

Do not accept vague explanations, like:

The new line would vary
It would not go through all points
It would not fit the patterns
The line may be slightly tilted

h.

Question

In the month before their IB Diploma examinations, eight male students recorded the number of hours they spent on social media.

For each student, the number of hours spent on social media (\(x\)) and the number of IB Diploma points obtained (\(y\)) are shown in the following table.

N16/5/MATSD/SP2/ENG/TZ0/01

Use your graphic display calculator to find

Ten female students also recorded the number of hours they spent on social media in the month before their IB Diploma examinations. Each of these female students spent between 3 and 30 hours on social media.

The equation of the regression line y on x for these ten female students is

\[y = -\frac{2}{3}x + \frac{{125}}{3}.\]

An eleventh girl spent 34 hours on social media in the month before her IB Diploma examinations.

On graph paper, draw a scatter diagram for these data. Use a scale of 2 cm to represent 5 hours on the \(x\)-axis and 2 cm to represent 10 points on the \(y\)-axis.[4]

a.

(i)     \({\bar x}\), the mean number of hours spent on social media;

(ii)     \({\bar y}\), the mean number of IB Diploma points.[2]

b.

Plot the point \((\bar x,{\text{ }}\bar y)\) on your scatter diagram and label this point M.[2]

c.

Write down the value of \(r\), the Pearson’s product–moment correlation coefficient, for these data.[2]

d.

Write down the equation of the regression line \(y\) on \(x\) for these eight male students.[2]

e.

Draw the regression line, from part (e), on your scatter diagram.[2]

f.

Use the given equation of the regression line to estimate the number of IB Diploma points that this girl obtained.[2]

g.

Write down a reason why this estimate is not reliable.[1]

h.
Answer/Explanation

Markscheme

N16/5/MATSD/SP2/ENG/TZ0/01.a/M     (A4)

Notes:     Award (A1) for correct scale and labelled axes.

Award (A3) for 7 or 8 points correctly plotted,

(A2) for 5 or 6 points correctly plotted,

(A1) for 3 or 4 points correctly plotted.

Award at most (A0)(A3) if axes reversed.

Accept \(x\) and \(y\) sufficient for labelling.

If graph paper is not used, award (A0).

If an inconsistent scale is used, award (A0). Candidates’ points should be read from this scale where possible and awarded accordingly.

A scale which is too small to be meaningful (i.e. mm instead of cm) earns (A0) for plotted points.[4 marks]

a.

(i)     \(\bar x = 21\)     (A1)

(ii)    \(\bar y = 31\)     (A1)[2 marks]

b.

\((\bar x,{\text{ }}\bar y)\) correctly plotted on graph     (A1)(ft)

this point labelled M     (A1)

Note:     Follow through from parts (b)(i) and (b)(ii).

Only accept M for labelling.[2 marks]

c.

\( -0.973{\text{ }}( -0.973388 \ldots )\)     (G2)

Note:     Award (G1) for 0.973, without minus sign.[2 marks]

d.

\(y = -0.761x + 47.0{\text{ }}(y = -0.760638 \ldots x + 46.9734 \ldots )\)     (A1)(A1)(G2)

Notes:     Award (A1) for \( -0.761x\) and (A1) for \( + 47.0\). Award a maximum of (A1)(A0) if answer is not an equation.[2 marks]

e.

line on graph     (A1)(ft)(A1)(ft)

Notes:     Award (A1)(ft) for straight line that passes through their M, (A1)(ft) for line (extrapolated if necessary) that passes through \((0,{\text{ }}47.0)\).

If M is not plotted or labelled, follow through from part (e).[2 marks]

f.

\(y = -\frac{2}{3}(34) + \frac{{125}}{3}\)     (M1)

Note:     Award (M1) for correct substitution.

19 (points)     (A1)(G2)[2 marks]

g.

extrapolation     (R1)

OR

34 hours is outside the given range of data     (R1)

Note:     Do not accept ‘outlier’.[1 mark]

h.
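Parts (g) and (h) above combine a substitution into the given regression line with a check on whether the input lies inside the fitted range. The Python sketch below uses exact fractions to reproduce both steps. The function names are hypothetical, and the range of 3 to 30 hours is taken from the question's description of the ten female students.

```python
from fractions import Fraction


def estimate_points(hours):
    """Regression line y = -(2/3)x + 125/3 for the ten female students."""
    return Fraction(-2, 3) * hours + Fraction(125, 3)


def is_interpolation(hours, low=3, high=30):
    """True when hours lies inside the range the line was fitted on."""
    return low <= hours <= high


points = estimate_points(34)
print(points, is_interpolation(34))
```

Because 34 hours falls outside the 3 to 30 hour range, the estimate of 19 points is an extrapolation, which matches the reason given in the markscheme.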

Question

A group of 800 students answered 40 questions on a category of their choice out of History, Science and Literature.

For each student the category and the number of correct answers, \(N\), were recorded. The results obtained are represented in the following table.

N17/5/MATSD/SP2/ENG/TZ0/01

A \({\chi ^2}\) test at the 5% significance level is carried out on the results. The critical value for this test is 12.592.

State whether \(N\) is a discrete or a continuous variable.[1]

a.

Write down, for \(N\), the modal class;[1]

b.i.

Write down, for \(N\), the mid-interval value of the modal class.[1]

b.ii.

Use your graphic display calculator to estimate the mean of \(N\);[2]

c.i.

Use your graphic display calculator to estimate the standard deviation of \(N\).[1]

c.ii.

Find the expected frequency of students choosing the Science category and obtaining 31 to 40 correct answers.[2]

d.

Write down the null hypothesis for this test;[1]

e.i.

Write down the number of degrees of freedom.[1]

e.ii.

Write down the \(p\)-value for the test;[1]

f.i.

Write down the \({\chi ^2}\) statistic.[2]

f.ii.

State the result of the test. Give a reason for your answer.[2]

g.
Answer/Explanation

Markscheme

discrete     (A1)[1 mark]

a.

\(11 \leqslant N \leqslant 20\)     (A1)[1 mark]

b.i.

15.5     (A1)(ft)

Note:     Follow through from part (b)(i).[1 mark]

b.ii.

\(21.2{\text{ }}(21.2125)\)     (G2)[2 marks]

c.i.

\(9.60{\text{ }}(9.60428 \ldots )\)     (G1)[1 mark]

c.ii.

\(\frac{{260}}{{800}} \times \frac{{157}}{{800}} \times 800\)\(\,\,\,\)OR\(\,\,\,\)\(\frac{{260 \times 157}}{{800}}\)     (M1)

Note:     Award (M1) for correct substitution into expected frequency formula.

\( = 51.0{\text{ }}(51.025)\)     (A1)(G2)[2 marks]

d.

choice of category and number of correct answers are independent     (A1)

Notes:     Accept “no association” between (choice of) category and number of correct answers. Do not accept “not related” or “not correlated” or “influenced”.[1 mark]

e.i.

6     (A1)[1 mark]

e.ii.

\(0.0644{\text{ }}(0.0644123 \ldots )\)     (G1)[1 mark]

f.i.

\(11.9{\text{ }}(11.8924 \ldots )\)     (G2)[2 marks]

f.ii.

the null hypothesis is not rejected (the null hypothesis is accepted)     (A1)(ft)

OR

(choice of) category and number of correct answers are independent     (A1)(ft)

as \(11.9 < 12.592\)\(\,\,\,\)OR\(\,\,\,\)\(0.0644 > 0.05\)     (R1)

Notes:     Award (R1) for a correct comparison of either their \({\chi ^2}\) statistic to the \({\chi ^2}\) critical value or their \(p\)-value to the significance level. Award (A1)(ft) from that comparison.

Follow through from part (f). Do not award (A1)(ft)(R0).[2 marks]

g.
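The decision in part (g) can be cross-checked without a calculator, because for an even number of degrees of freedom the upper-tail \({\chi ^2}\) probability has a closed form: \(e^{-x/2}\sum_{k=0}^{\nu/2-1} (x/2)^k/k!\). The Python sketch below implements only that even-degrees-of-freedom case. The function name is hypothetical, and the statistic 11.8924 is the truncated value quoted in the markscheme, so the result is approximate.

```python
import math


def chi2_p_value_even_df(statistic, df):
    """Upper-tail chi-squared probability, valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = statistic / 2
    # Partial sum of the exponential series up to the (df/2 - 1)th term.
    tail_sum = sum(half**k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * tail_sum


p_value = chi2_p_value_even_df(11.8924, 6)
reject_null = p_value < 0.05  # 5% significance level from the question
print(round(p_value, 4), reject_null)
```

The p-value exceeds 0.05, so the null hypothesis is not rejected, in agreement with the comparison \(11.9 < 12.592\). For odd degrees of freedom a statistics library would be needed instead of this shortcut.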

Question

On one day 180 flights arrived at a particular airport. The distance travelled and the arrival status for each incoming flight were recorded. The flight was then classified as on time, slightly delayed, or heavily delayed.

The results are shown in the following table.

A \({\chi ^2}\) test is carried out at the 10% significance level to determine whether the arrival status of incoming flights is independent of the distance travelled.

The critical value for this test is 7.779.

A flight is chosen at random from the 180 recorded flights.

State the alternative hypothesis.[1]

a.

Calculate the expected frequency of flights travelling at most 500 km and arriving slightly delayed.[2]

b.

Write down the number of degrees of freedom.[1]

c.

Write down the \({\chi ^2}\) statistic.[2]

d.i.

Write down the associated p-value.[1]

d.ii.

State, with a reason, whether you would reject the null hypothesis.[2]

e.

Write down the probability that this flight arrived on time.[2]

f.

Given that this flight was not heavily delayed, find the probability that it travelled between 500 km and 5000 km.[2]

g.

Two flights are chosen at random from those which were slightly delayed.

Find the probability that each of these flights travelled at least 5000 km.[3]

h.
Answer/Explanation

Markscheme

The arrival status is dependent on the distance travelled by the incoming flight     (A1)

Note: Accept “associated” or “not independent”.[1 mark]

a.

\(\frac{{60 \times 45}}{{180}}\)  OR  \(\frac{{60}}{{180}} \times \frac{{45}}{{180}} \times 180\)     (M1)

Note: Award (M1) for correct substitution into expected value formula.

= 15     (A1) (G2)[2 marks]

b.

4     (A1)

Note: Award (A0) if “2 + 2 = 4” is seen.[1 mark]

c.

9.55 (9.54671…)    (G2)

Note: Award (G1) for an answer of 9.54.[2 marks]

d.i.

0.0488 (0.0487961…)     (G1)[1 mark]

d.ii.

Reject the Null Hypothesis     (A1)(ft)

Note: Follow through from their hypothesis in part (a).

9.55 (9.54671…) > 7.779     (R1)(ft)

OR

0.0488 (0.0487961…) < 0.1     (R1)(ft)

Note: Do not award (A1)(ft)(R0)(ft). Follow through from part (d). Award (R1)(ft) for a correct comparison, (A1)(ft) for a consistent conclusion with the answers to parts (a) and (d). Award (R1)(ft) for \(\chi^2_{\text{calc}} > \chi^2_{\text{crit}}\), provided the calculated value is explicitly seen in part (d)(i).[2 marks]

e.

\(\frac{{52}}{{180}}\,\,\left( {0.289,\,\,\frac{{13}}{{45}},\,\,28.9\% } \right)\)     (A1)(A1) (G2)

Note: Award (A1) for correct numerator, (A1) for correct denominator.[2 marks]

f.

\(\frac{{35}}{{97}}\,\,\left( {0.361,\,\,36.1\% } \right)\)     (A1)(A1) (G2)

Note: Award (A1) for correct numerator, (A1) for correct denominator.[2 marks]

g.

\(\frac{{14}}{{45}} \times \frac{{13}}{{44}}\)     (A1)(M1)

Note: Award (A1) for two correct fractions and (M1) for multiplying their two fractions.

\( = \frac{{182}}{{1980}}\,\,\left( {0.0919,\,\,\frac{{91}}{{990}},\,0.091919 \ldots ,\,9.19\% } \right)\)     (A1) (G2)[3 marks]

h.
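Part (h) is a selection without replacement, so the second fraction has both numerator and denominator reduced by one. The Python sketch below reproduces the product with exact fractions. The function name is hypothetical, and the counts 14 and 45 are read from the fractions in the markscheme, since the table itself is not reproduced here.

```python
from fractions import Fraction


def both_from_group(group_size, total):
    """P(two draws without replacement both land in the group)."""
    first = Fraction(group_size, total)
    second = Fraction(group_size - 1, total - 1)
    return first * second


# 14 of the 45 slightly delayed flights travelled at least 5000 km.
probability = both_from_group(14, 45)
print(probability, float(probability))
```

`Fraction` reduces 182/1980 to 91/990 automatically, which is one of the accepted forms of the answer.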

Question

A transportation company owns 30 buses. The distance that each bus has travelled since being purchased by the company is recorded. The cumulative frequency curve for these data is shown.

It is known that 8 buses travelled more than m kilometres.

Find the number of buses that travelled a distance between 15000 and 20000 kilometres.[2]

a.

Use the cumulative frequency curve to find the median distance.[2]

b.i.

Use the cumulative frequency curve to find the lower quartile.[1]

b.ii.

Use the cumulative frequency curve to find the upper quartile.[1]

b.iii.

Hence write down the interquartile range.[1]

c.

Write down the percentage of buses that travelled a distance greater than the upper quartile.[1]

d.

Find the number of buses that travelled a distance less than or equal to 12 000 km.[1]

e.

Find the value of m.[2]

f.

The smallest distance travelled by one of the buses was 2500 km.
The longest distance travelled by one of the buses was 23 000 km.

On graph paper, draw a box-and-whisker diagram for these data. Use a scale of 2 cm to represent 5000 km.[4]

g.
Answer/Explanation

Markscheme

28 − 20     (A1)

Note: Award (A1) for 28 and 20 seen.

8     (A1)(G2)[2 marks]

a.

13500     (G2)

Note: Accept an answer in the range 13500 to 13750.[2 marks]

b.i.

10000     (G1)

Note: Accept an answer in the range 10000 to 10250.[1 mark]

b.ii.

16000     (G1)

Note: Accept an answer in the range 16000 to 16250.[1 mark]

b.iii.

6000     (A1)(ft)

Note: Follow through from their part (b)(ii) and (iii).[1 mark]

c.

25%     (A1)[1 mark]

d.

11     (G1)[1 mark]

e.

30 − 8  OR  22     (M1)

Note: Award (M1) for subtracting 30 − 8 or 22 seen.

15750     (A1)(G2)

Note: Accept 15750 ± 250.[2 marks]

f.

(A1)(A1)(A1)(A1)

Note: Award (A1) for correct label and scale; accept “distance” or “km” for label.

(A1)(ft) for correct median,
(A1)(ft) for correct quartiles and box,
(A1) for endpoints at 2500 and 23 000 joined to box by straight lines.
Accept ±250 for the median, quartiles and endpoints.
Follow through from their part (b).
The final (A1) is not awarded if the line goes through the box.[4 marks]

g.

Question

The lengths (\(l\)) in centimetres of \(100\) copper pipes at a local building supplier were measured. The results are listed in the table below.

Write down the mode.[1]

a.

Using your graphic display calculator, write down the value of
(i)     the mean;
(ii)    the standard deviation;
(iii)   the median.[4]

b.

Find the interquartile range.[2]

c.

Draw a box and whisker diagram for this data, on graph paper, using a scale of \(1{\text{ cm}}\) to represent \(5{\text{ cm}}\).[4]

d.

Sam estimated the value of the mean of the measured lengths to be \(43{\text{ cm}}\).

Find the percentage error of Sam’s estimated mean.[2]

e.
Answer/Explanation

Markscheme

\(47.5{\text{ (cm)}}\)     (A1)

a.

(i)     \(45.85{\text{ (cm)}}\)     (G2) 

Note: Accept \(45.9\) .

(ii)    \(17.1{\text{ }}(17.0888 \ldots )\)     (G1)
(iii)   \(47.5{\text{ (cm)}}\)     (G1)

b.

\(62.5 - 32.5 = 30\)     (M1)(A1)(G2)

Note: Award (M1) for correct quartiles seen.

c.

(A1) for correct label and scale
(A1)(ft) for correct median
(A1)(ft) for correct quartiles and box
(A1) for endpoints at \(17.5\) and \(77.5\) joined to box by straight lines     (A1)(A1)(ft)(A1)(ft)(A1)

Notes: The final (A1) is lost if the lines go through the box. Follow through from their parts (b) and (c).

d.

\(\varepsilon  = \left| {\frac{{43 - 45.85}}{{45.85}}} \right| \times 100\% \)     (M1)

Note: Award (M1) for their correct substitution in \(\% \) error formula.

\( = 6.22\% \) (\(6.21592 \ldots \))     (A1)(ft)(G2)

Notes: Follow through from their answer to part (b)(i). Accept \(6.32\% \) with use of \(45.9\) .

e.
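The percentage error formula in part (e) can be sketched in Python as follows. The function name is hypothetical. The values 43 and 45.85 come from the question and markscheme, and 45.9 is the rounded mean that the markscheme also accepts.

```python
def percentage_error(approximate, exact):
    """Percentage error: |(approximate - exact) / exact| * 100."""
    return abs((approximate - exact) / exact) * 100


error = percentage_error(43, 45.85)
error_with_rounded_mean = percentage_error(43, 45.9)
print(round(error, 2), round(error_with_rounded_mean, 2))
```

The two results show how rounding the mean before substituting changes the answer from 6.22% to 6.32%.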