IB Mathematics SL 4.10 Spearman’s rank correlation coefficient AI SL Paper 2 - Exam Style Questions - New Syllabus
Question
Jordon conducted a study to see if there is a relationship between the price of an apartment, \(y\), and its distance, \(x\), from the city centre of Melbourne.
They took a random sample of six typical apartments along a train line in the city. Jordon obtained the data shown in the following table.
\(x\) (kilometres) | 7.0 | 8.4 | 10.3 | 12.5 | 17.8 | 20.9 |
---|---|---|---|---|---|---|
\(y\) (millions of dollars) | 2.61 | 2.44 | 2.03 | 1.81 | 1.45 | 1.18 |
A plot of these data is seen in the following graph.


(a) Write down the value of the Spearman’s rank correlation coefficient, \(r_s\). [1]
(b) (i) Find the Pearson’s product-moment correlation coefficient, \(r\).
(ii) Use your value of \(r\) to state which two of the following would best describe the correlation between the variables.
Positive, Negative, Strong, Weak, No correlation [4]
The relationship between the variables can be modelled by the regression equation \(y=ax+b\).
(c) (i) Write down the value of \(a\).
(ii) Write down the value of \(b\).
(iii) According to this model, state in context what the value of \(b\) represents. [3]
(d) Jordon uses the regression equation to estimate the price of a typical apartment located 19.6 km from the city centre.
(i) Find this estimated price.
(ii) State two reasons that Jordon might use to justify the validity of this estimate. [5]
To verify whether this relationship applies in a different direction from the city centre, Jordon considers two locations, A and B, both an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices (in millions of dollars) in the following tables.
Apartment price in location A
1.21 1.25 1.31 1.32 1.58 1.95 2.13
Apartment price in location B
1.51 1.58 1.69 2.61 2.72 2.81 2.95
Jordon conducts a \(t\)-test, at the 5% level of significance, to see if the mean apartment price in location A is different to the mean apartment price in location B. They assume the population variances are the same.
For this test, Jordon takes the null hypothesis to be \(\mu_A=\mu_B\).
(e) Write down the alternative hypothesis. [1]
(f) Find the \(p\)-value for this test. [2]
(g) State the conclusion of the test. Justify your answer. [2]
(h) State one additional assumption Jordon has made about the distributions to conduct this test. [1]
Answer/Explanation
Markscheme (with detailed calculations)
(a) Spearman’s rank correlation
As \(x\) increases, \(y\) strictly decreases, so the ranks of \(y\) are exactly the reverse of the ranks of \(x\) and
\(\boxed{r_s=-1}\).
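The value can also be checked by ranking the data and applying the no-ties formula \(r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}\). The following is a minimal sketch, not part of the markscheme; the helper `ranks` is a hypothetical name, and it ranks the smallest value as 1 (ranking the largest as 1 gives the same \(r_s\) as long as both variables use the same convention).

```python
# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties, as in this data set."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


rx, ry = ranks(x), ranks(y)
n = len(x)

# Sum of squared rank differences, then the no-ties Spearman formula.
d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(rx, ry, r_s)
```

Here the rank differences are \(-5,-3,-1,1,3,5\), so \(\sum d^2=70\) and \(r_s = 1 - \frac{420}{210} = -1\).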
(b) Pearson’s PMCC
Data pairs \((x,y)\): (7.0,2.61), (8.4,2.44), (10.3,2.03), (12.5,1.81), (17.8,1.45), (20.9,1.18).
Means: \(\displaystyle \bar x=\frac{7.0+8.4+10.3+12.5+17.8+20.9}{6}=12.8167,\quad \bar y=\frac{2.61+2.44+2.03+1.81+1.45+1.18}{6}=1.9200.\)
Sums:
\[ S_{xx}=\sum(x-\bar x)^2=149.9483,\quad S_{yy}=\sum(y-\bar y)^2=1.5392,\quad S_{xy}=\sum(x-\bar x)(y-\bar y)=-14.8760. \]
Hence
\[ r=\frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} =\frac{-14.8760}{\sqrt{149.9483\times 1.5392}} \approx \boxed{-0.9792}. \]
(ii) Therefore the correlation is \(\boxed{\text{strong}}\) and \(\boxed{\text{negative}}\).
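In the exam this value comes straight from the calculator, but the working for (b)(i) can be reproduced by hand. The sketch below uses only the Python standard library; the variable names (`s_xx`, `s_yy`, `s_xy`) are chosen here to mirror the symbols in the working and are not from the markscheme.

```python
import math

# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and of cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson's product-moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(round(s_xx, 4), round(s_yy, 4), round(s_xy, 4), round(r, 4))
```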
(c) Regression \(y=ax+b\)
(i) \[ a=\frac{S_{xy}}{S_{xx}}=\frac{-14.8760}{149.9483}=\boxed{-0.0992075} \]
(ii) \[ b=\bar y-a\bar x=1.9200-(-0.0992075)(12.8167)=\boxed{3.19151} \]
(iii) \(b\) is the modelled apartment price (in millions of dollars) at distance \(x=0\) km from the city centre.
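The least-squares coefficients follow from the same sums used for \(r\). As a minimal sketch (standard library only, variable names chosen here for illustration), the gradient is \(S_{xy}/S_{xx}\) and the intercept is fixed by forcing the line through \((\bar x,\bar y)\):

```python
# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Gradient and intercept of the regression line y = a x + b.
a = s_xy / s_xx
b = y_bar - a * x_bar

print(round(a, 7), round(b, 5))
```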
(d) Estimate at \(x=19.6\) km
(i) \(y=a(19.6)+b=(-0.0992075)(19.6)+3.19151=-1.94447+3.19151=\boxed{1.247\ \text{million dollars}}\) (≈ \(1.25\) to 2 d.p.).
(ii) Justification: (1) This is interpolation (19.6 km lies within 7.0–20.9 km). (2) The linear fit is very strong (\(|r|\approx 0.98\)).
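Both parts of (d) can be checked in a few lines. The sketch below takes the coefficients \(a\) and \(b\) as given in part (c), evaluates the model at 19.6 km, and tests whether that distance lies inside the sampled range; `is_interpolation` is a name introduced here for illustration.

```python
# Sampled distances (km) from the question and the coefficients from part (c).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
a = -0.0992075
b = 3.19151

distance = 19.6  # km from the city centre

# Estimated price in millions of dollars from y = a x + b.
price = a * distance + b

# The estimate is an interpolation only if the distance is within the data range.
is_interpolation = min(x) <= distance <= max(x)

print(round(price, 3), is_interpolation)
```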
(e)–(h) Two-sample \(t\)-test (equal variances)
Samples (millions of dollars):
A: 1.21, 1.25, 1.31, 1.32, 1.58, 1.95, 2.13 (n=7)
B: 1.51, 1.58, 1.69, 2.61, 2.72, 2.81, 2.95 (n=7)
Sample means: \(\bar x_A=\frac{1.21+\cdots+2.13}{7}=1.5357,\quad \bar x_B=\frac{1.51+\cdots+2.95}{7}=2.2671.\)
Sample variances (with \(n-1\) in denominator): \(s_A^2=0.13533,\ s_B^2=0.41036.\)
Pooled variance:
\[ s_p^2=\frac{(7-1)s_A^2+(7-1)s_B^2}{7+7-2} =\frac{6(0.13533)+6(0.41036)}{12} =0.272843,\quad s_p=\sqrt{0.272843}=0.522344. \]
Test statistic (two-tailed):
\[ t=\frac{\bar x_A-\bar x_B}{s_p\sqrt{\frac{1}{7}+\frac{1}{7}}} =\frac{1.5357-2.2671}{0.522344\cdot \sqrt{2/7}} =\frac{-0.7314}{0.27920} \approx \boxed{-2.620}, \]
with \(\text{df}=7+7-2=12\).
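The pooled-variance working can be reproduced with the standard library, where `statistics.variance` uses the \(n-1\) denominator. This is a sketch for checking the arithmetic, not the calculator procedure expected in the exam; the list names `a_prices` and `b_prices` are chosen here for illustration.

```python
import math
from statistics import mean, variance

# Prices (millions of dollars) from the two tables in the question.
a_prices = [1.21, 1.25, 1.31, 1.32, 1.58, 1.95, 2.13]
b_prices = [1.51, 1.58, 1.69, 2.61, 2.72, 2.81, 2.95]

n_a, n_b = len(a_prices), len(b_prices)
var_a = variance(a_prices)  # sample variance, n - 1 in the denominator
var_b = variance(b_prices)

# Pooled variance, standard error of the difference in means, and t statistic.
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean(a_prices) - mean(b_prices)) / std_error

print(round(pooled_var, 6), round(std_error, 5), round(t_stat, 3))
```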
(e) Alternative hypothesis: \(\boxed{\mu_A\ne \mu_B}\).
(f) Two-tailed \(p\)-value for \(|t|=2.620\) with \(12\) d.f. \(\approx \boxed{0.022}\).
(g) Since \(p\approx 0.022<0.05\): reject \(H_0\). There is sufficient evidence that the mean prices differ.
(h) Additional assumption: the apartment prices in both locations are (approximately) normally distributed. (Equal population variances are already assumed in the question.)
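The \(p\)-value in (f) is normally read from the calculator's two-sample \(t\)-test. As a rough standard-library check, the sketch below integrates the Student's \(t\) density over \([-|t|,|t|]\) with Simpson's rule and subtracts the area from 1; the helper names `t_pdf` and `two_tailed_p` and the step count are assumptions made for this illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 minus the area under the density on [-|t|, |t|].

    Uses composite Simpson's rule, so steps must be even.
    """
    upper = abs(t)
    h = 2 * upper / steps
    total = t_pdf(-upper, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-upper + i * h, df)
    return 1 - total * h / 3


t_stat, df = -2.620, 12  # test statistic and degrees of freedom from the working
alpha = 0.05             # 5% significance level

p_value = two_tailed_p(t_stat, df)
reject_h0 = p_value < alpha

print(round(p_value, 3), reject_h0)
```

Because the test is two-tailed, the sign of the statistic does not matter; only \(|t|\) enters the calculation.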
Total Marks: 19