IB Mathematics SL 4.4 Linear correlation of bivariate data AI SL Paper 1 - Exam Style Questions - New Syllabus
Question
A researcher believes that there is a linear relationship between the age of a male runner and the time it takes them to run 5000 meters.
To test this, they recorded the age, x years, and the time, t minutes, for eight males in a single 5000m race. The results are presented in the following table and scatter diagram.
\( x, \text{years} \) | 18 | 24 | 28 | 36 | 40 | 46 | 52 | 62 |
---|---|---|---|---|---|---|---|---|
\( t, \text{minutes} \) | 29.4 | 29.2 | 31.1 | 33.6 | 32.2 | 33.1 | 35.2 | 40.4 |
A scatter plot of the data with age (\( x \), years) on the x-axis and time (\( t \), minutes) on the y-axis.
(a) Determine the value of the Pearson’s product-moment correlation coefficient, r. [2]
The researcher found the following information about r appropriate for athletic performance.
Value of \( |r| \) | Description of the correlation |
---|---|
\( 0 \leq |r| < 0.4 \) | weak |
\( 0.4 \leq |r| < 0.8 \) | moderate |
\( 0.8 \leq |r| \leq 1 \) | strong |
(b) Comment on your answer to part (a), using the information that the researcher found. [1]
(c) Write down the equation of the regression line of t on x, in the form \[ t = ax + b. \] [1]
(d) Use the regression line to estimate the time it takes a 57-year-old male to run 5000 meters. [2]
Answer/Explanation
Markscheme
(a)
Calculate the Pearson’s product-moment correlation coefficient using the formula:
\( r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \), where \( y \) represents t.
Data: \( x = [18, 24, 28, 36, 40, 46, 52, 62] \), \( t = [29.4, 29.2, 31.1, 33.6, 32.2, 33.1, 35.2, 40.4] \), \( n = 8 \).
Compute sums:
\( \sum x = 18 + 24 + 28 + 36 + 40 + 46 + 52 + 62 = 306 \),
\( \sum t = 29.4 + 29.2 + 31.1 + 33.6 + 32.2 + 33.1 + 35.2 + 40.4 = 264.2 \),
\( \sum xy = (18 \times 29.4) + (24 \times 29.2) + (28 \times 31.1) + (36 \times 33.6) + (40 \times 32.2) + (46 \times 33.1) + (52 \times 35.2)\)
\(+ (62 \times 40.4) = 529.2 + 700.8 + 870.8 + 1209.6 + 1288 + 1522.6 + 1830.4 + 2504.8 = 10456.2 \),
\( \sum x^2 = 18^2 + 24^2 + 28^2 + 36^2 + 40^2 + 46^2 + 52^2 + 62^2 = 324 + 576 + 784 + 1296 + 1600 + 2116 + 2704 + 3844 = 13244 \),
\( \sum t^2 = 29.4^2 + 29.2^2 + 31.1^2 + 33.6^2 + 32.2^2 + 33.1^2 + 35.2^2 + 40.4^2 = 864.36 + 852.64 + 967.21 + 1128.96 + 1036.84 + 1095.61\)
\(+ 1239.04 + 1632.16 = 8816.82 \).
Numerator: \( 8 \times 10456.2 - 306 \times 264.2 = 83649.6 - 80845.2 = 2804.4 \).
Denominator: \( \sqrt{[8 \times 13244 - 306^2][8 \times 8816.82 - 264.2^2]} = \sqrt{[105952 - 93636][70534.56 - 69801.64]} = \sqrt{12316 \times 732.92} = \sqrt{9026642.72} \approx 3004.437 \).
Thus: \( r = \frac{2804.4}{3004.437} \approx 0.93342 \).
Rounded: \( r \approx 0.933 \). A2
[2 marks]
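In the exam this value would normally come straight from a calculator's statistics mode. As a cross-check of the sums above, here is a minimal Python sketch; the function name `pearson_r` is an illustrative choice, not part of the markscheme.

```python
from math import sqrt

# Data from the question: age (years) and 5000 m time (minutes).
x = [18, 24, 28, 36, 40, 46, 52, 62]
t = [29.4, 29.2, 31.1, 33.6, 32.2, 33.1, 35.2, 40.4]


def pearson_r(xs, ys):
    """Pearson's r from raw sums, mirroring the formula in part (a)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_xx = sum(a * a for a in xs)
    sum_yy = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, t)
print(round(r, 3))  # 0.933
```

The raw-sum form is used here only because it matches the working shown; it is not the only way to compute r.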
(b)
Using the table, since \( |r| = 0.933 \) and \( 0.8 \leq |r| \leq 1 \), the correlation is strong. A1
[1 mark]
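The researcher's table can be read as a simple lookup on \( |r| \). The sketch below encodes those three bands; the function name `describe_correlation` is hypothetical and chosen only for illustration.

```python
def describe_correlation(r):
    """Classify r using the researcher's bands for |r| from the question."""
    size = abs(r)
    if not 0 <= size <= 1:
        raise ValueError("r must lie between -1 and 1")
    if size < 0.4:
        return "weak"
    if size < 0.8:
        return "moderate"
    return "strong"


print(describe_correlation(0.933))  # strong
```

Because r is positive here, a fuller comment could also note that the correlation is positive: time tends to increase with age.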
(c)
The regression line is \( t = ax + b \).
Slope: \( a = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{8 \times 10456.2 - 306 \times 264.2}{8 \times 13244 - 306^2} = \frac{2804.4}{12316} \approx 0.227704 \).
Intercept: \( b = \frac{\sum t - a \sum x}{n} = \frac{264.2 - 0.227704 \times 306}{8} = \frac{264.2 - 69.677424}{8} = \frac{194.522576}{8} \approx 24.3153 \).
Rounded to 3 significant figures: \( t = 0.228x + 24.3 \). A1
[1 mark]
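The least-squares coefficients can be checked the same way as r. This is a minimal sketch under the assumption that the line is fitted by ordinary least squares of t on x; the name `regression_line` is illustrative.

```python
# Data from the question: age (years) and 5000 m time (minutes).
x = [18, 24, 28, 36, 40, 46, 52, 62]
t = [29.4, 29.2, 31.1, 33.6, 32.2, 33.1, 35.2, 40.4]


def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(p * q for p, q in zip(xs, ys))
    sum_xx = sum(p * p for p in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b


a, b = regression_line(x, t)
print(f"t = {a:.3f}x + {b:.1f}")  # t = 0.228x + 24.3
```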
(d)
Substitute \( x = 57 \) into the unrounded regression line \( t = 0.227704x + 24.3153 \):
\( t = 0.227704 \times 57 + 24.3153 \approx 12.979128 + 24.3153 \approx 37.2944 \).
Rounded: \( t \approx 37.3 \) minutes. M1 A1
[2 marks]
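The substitution can be sketched in Python as follows. The function `estimate_time` and its range check are illustrative assumptions; the default coefficients are the 3 significant figure values from part (c), which give the same answer to 3 significant figures as the unrounded ones.

```python
MIN_AGE, MAX_AGE = 18, 62  # range of ages in the sample


def estimate_time(age, a=0.228, b=24.3):
    """Estimate the 5000 m time in minutes from t = a*age + b."""
    if not MIN_AGE <= age <= MAX_AGE:
        # Outside the sampled ages the line is an extrapolation.
        raise ValueError("age is outside the range of the data")
    return a * age + b


print(round(estimate_time(57), 1))  # 37.3
```

An age of 57 lies inside the sampled range of 18 to 62, so the estimate is an interpolation.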
Total Marks: 6