IB Mathematics SL 4.4 Linear correlation of bivariate data AI SL Paper 2 - Exam Style Questions - New Syllabus
Question
The mean annual temperatures for Earth, recorded at fifty-year intervals, are listed in the table.
Year \((x)\) | 1708 | 1758 | 1808 | 1858 | 1908 | 1958 | 2008 |
---|---|---|---|---|---|---|---|
Temperature °C \((y)\) | 8.73 | 9.22 | 9.10 | 9.12 | 9.13 | 9.45 | 9.76 |
Emily builds a linear model for this data by finding the equation of the straight line passing through the points with coordinates \((1708,\,8.73)\) and \((1958,\,9.45)\).
(a) Calculate the gradient of the straight line that passes through these two points. [2]
(b) (i) Interpret the meaning of the gradient in the context of the question.
(ii) State appropriate units for the gradient. [2]
(c) Find the equation of this line, giving your answer in the form \(y=mx+c\). [2]
(d) Use Emily’s model to estimate the mean annual temperature in the year \(2000\). [2]
Daniel uses linear regression to obtain a model for the data.
(e) (i) Find the equation of the regression line of \(y\) on \(x\).
(ii) Find the value of \(r\), Pearson’s product-moment correlation coefficient. [3]
(f) Use Daniel’s model to estimate the mean annual temperature in the year \(2000\). [2]
Daniel uses his regression line to predict the year when the mean annual temperature will first exceed \(15^\circ\text{C}\).
(g) State two reasons why Daniel’s prediction may not be valid. [2]
Answer/Explanation
Markscheme (with detailed working)
(a)
Gradient through \((x_1,y_1)=(1708,8.73)\) and \((x_2,y_2)=(1958,9.45)\): \[ m=\frac{9.45-8.73}{1958-1708}=\frac{0.72}{250}=0.00288. \] M1 A1
(b)
(i) The gradient is the estimated mean yearly increase in Earth’s mean annual temperature. A1
(ii) Units: \(^{\circ}\text{C per year}\). A1
(c)
Using \(y=mx+c\) with \(m=0.00288\) and the point \((1708,8.73)\): \[ 8.73=0.00288(1708)+c \Rightarrow c=8.73-4.91904=3.81096. \] Hence \(\boxed{y=0.00288x+3.81096}\) (accept \(y\approx 0.00288x+3.81\)). M1 A1
(d)
Substitute \(x=2000\) into Emily’s line: \[ y=0.00288(2000)+3.81096=5.76+3.81096=9.57096\approx \boxed{9.57^\circ\text{C}}. \] M1 A1
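The arithmetic in parts (a), (c) and (d) can be checked with a short script. This is only a sketch: the helper name `two_point_line` is made up for illustration and is not part of the markscheme; the two points are the ones Emily uses in the question.

```python
# Two data points chosen by Emily: (year, temperature in °C), taken from the table.
x1, y1 = 1708, 8.73
x2, y2 = 1958, 9.45


def two_point_line(x1, y1, x2, y2):
    """Return (gradient, intercept) of the straight line through two points.

    Hypothetical helper, written only to mirror the working in (a) and (c).
    """
    m = (y2 - y1) / (x2 - x1)  # gradient, in °C per year
    c = y1 - m * x1            # rearranged from y1 = m*x1 + c
    return m, c


m, c = two_point_line(x1, y1, x2, y2)
estimate_2000 = m * 2000 + c   # part (d): substitute x = 2000

print(f"m = {m:.5f}, c = {c:.5f}, y(2000) = {estimate_2000:.2f}")
```

Rounding is applied only when printing, which matches the usual advice to carry unrounded values through the working and round the final answer to three significant figures.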
(e)
Using technology (linear regression \(y\) on \(x\)):
(i) \( \displaystyle \boxed{y=0.00255714\ldots\,x+4.46454\ldots\ \ (\text{accept }y\approx 0.00256x+4.46)} \). (M1) A1
(ii) \( \displaystyle \boxed{r\approx 0.861\ (0.861333\ldots)} \). A1
(f)
Estimate at \(x=2000\) using Daniel’s regression line: \[ y=0.00255714\ldots(2000)+4.46454\ldots = 5.11428\ldots + 4.46454\ldots =9.57882\ldots \approx \boxed{9.58^\circ\text{C}}. \] (Accept \(9.57^\circ\text{C}\) if \(4.46\) used.) M1 A1
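In the exam the values in (e) and (f) come straight from the calculator's regression function. For readers who want to see where they come from, the sketch below applies the standard least-squares formulas to the table data using only the Python standard library. The function name `regression_y_on_x` is an illustrative assumption, not a calculator or library routine.

```python
from math import sqrt

# Data from the table: year (x) and mean annual temperature in °C (y).
years = [1708, 1758, 1808, 1858, 1908, 1958, 2008]
temps = [8.73, 9.22, 9.10, 9.12, 9.13, 9.45, 9.76]


def regression_y_on_x(xs, ys):
    """Least-squares line of y on x; returns (slope, intercept, r).

    Hypothetical helper using the textbook sums S_xx, S_yy and S_xy.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx                # regression gradient, °C per year
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    r = s_xy / sqrt(s_xx * s_yy)       # Pearson's correlation coefficient
    return slope, intercept, r


slope, intercept, r = regression_y_on_x(years, temps)
daniel_2000 = slope * 2000 + intercept  # part (f): substitute x = 2000

print(f"y = {slope:.8f}x + {intercept:.5f}, r = {r:.3f}, y(2000) = {daniel_2000:.2f}")
```

Because the line passes through the mean point \((\bar{x},\bar{y})\), the intercept follows directly from the gradient, which is why only the three sums are needed.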
(g)
Any two valid reasons, for example:
- The regression is \(y\) on \(x\); using it to predict a year from a temperature (i.e. \(x\) from \(y\)) is not generally reliable.
- Extrapolation far beyond the observed data range may be invalid (the relationship need not remain linear).
- The dataset is small (seven points), and temperature depends on factors other than the year, so the model is limited (correlation \(\ne\) causation).
A1 A1
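To see how far the prediction in (g) reaches beyond the data, one can rearrange the regression line for \(x\) at \(y=15\). The sketch below does this with the rounded coefficients quoted in (e). It is illustrative only: solving a \(y\) on \(x\) line for \(x\) is itself one of the weaknesses listed above, and the variable names are assumptions made for this example.

```python
# Regression coefficients as quoted in part (e), rounded as in the markscheme.
slope = 0.00255714      # °C per year
intercept = 4.46454     # °C

first_year, last_year = 1708, 2008  # range of years in the table
target_temp = 15.0                  # °C, the threshold in the question

# Rearranging y = slope * x + intercept for x at y = target_temp.
year_at_target = (target_temp - intercept) / slope
years_beyond_data = year_at_target - last_year
data_span = last_year - first_year

print(f"Model reaches {target_temp} °C around the year {year_at_target:.0f}")
print(f"That is about {years_beyond_data:.0f} years past the last data point, "
      f"compared with a data span of {data_span} years")
```

The predicted year lies many times further from the data than the data themselves extend, which is the extrapolation concern in concrete terms.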
Total Marks: 15