IB Mathematics SL 4.10 Spearman’s rank correlation coefficient AI SL Paper 2 - Exam Style Questions - New Syllabus

Question

Jordon conducted a study to see if there is a relationship between the price of an apartment, \(y\), and its distance, \(x\), from the city centre of Melbourne.
They took a random sample of six typical apartments along a train line in the city. Jordon obtained the data shown in the following table.
\(x\) (kilometres)   7.0   8.4   10.3   12.5   17.8   20.9
\(y\) (millions of dollars)   2.61   2.44   2.03   1.81   1.45   1.18
A plot of these data is seen in the following graph.
Graph of apartment price vs distance from city centre
(a) Write down the value of the Spearman’s rank correlation coefficient, \(r_s\). [1]
(b) (i) Find the Pearson’s product-moment correlation coefficient, \(r\).
     (ii) Use your value of \(r\) to state which two of the following would best describe the correlation between the variables.
         Positive    Negative    Strong    Weak    No correlation [4]
The relationship between the variables can be modelled by the regression equation \(y=ax+b\).
(c) (i) Write down the value of \(a\).
     (ii) Write down the value of \(b\).
     (iii) According to this model, state in context what the value of \(b\) represents. [3]
(d) Jordon uses the regression equation to estimate the price of a typical apartment located 19.6 km from the city centre.
     (i) Find this estimated price.
     (ii) State two reasons that Jordon might use to justify the validity of this estimate. [5]
To verify whether this relationship applies in a different direction from the city centre, Jordon considers two locations, A and B, both an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices (in millions of dollars) in the following tables.
Apartment price in location A
1.21   1.25   1.31   1.32   1.58   1.95   2.13
Apartment price in location B
1.51   1.58   1.69   2.61   2.72   2.81   2.95
Jordon conducts a \(t\)-test, at the 5% level of significance, to see if the mean apartment price in location A is different to the mean apartment price in location B. They assume the population variances are the same.
For this test, Jordon takes the null hypothesis to be \(\mu_A=\mu_B\).
(e) Write down the alternative hypothesis. [1]
(f) Find the \(p\)-value for this test. [2]
(g) State the conclusion of the test. Justify your answer. [2]
(h) State one additional assumption Jordon has made about the distributions to conduct this test. [1]
Markscheme (with detailed calculations)

(a) Spearman’s rank correlation

As \(x\) increases, \(y\) strictly decreases. Thus ranks are exactly reversed and
\(\boxed{r_s=-1}\).
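The result can be checked numerically. The following is a minimal sketch, assuming the standard no-ties formula \(r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}\), which the markscheme does not state explicitly; the helper names `ranks` and `spearman_no_ties` are illustrative only.

```python
# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]


def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties, as in this data set."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(xs, ys):
    """Spearman's r_s via 1 - 6*sum(d^2) / (n(n^2 - 1)); valid only without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


r_s = spearman_no_ties(x, y)
print(r_s)  # -1.0, because the two rank orders are exact reverses
```

Here the rank differences are \(-5,-3,-1,1,3,5\), so \(\sum d^2=70\) and \(r_s=1-\frac{420}{210}=-1\).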

(b) Pearson’s PMCC

(i) Data pairs \((x,y)\): (7.0,2.61), (8.4,2.44), (10.3,2.03), (12.5,1.81), (17.8,1.45), (20.9,1.18).
Means: \(\displaystyle \bar x=\frac{7.0+8.4+10.3+12.5+17.8+20.9}{6}=12.8167,\quad \bar y=\frac{2.61+2.44+2.03+1.81+1.45+1.18}{6}=1.9200.\)
Sums: \[ S_{xx}=\sum(x-\bar x)^2=149.9483,\quad S_{yy}=\sum(y-\bar y)^2=1.5392,\quad S_{xy}=\sum(x-\bar x)(y-\bar y)=-14.8760. \]
Hence \[ r=\frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} =\frac{-14.8760}{\sqrt{149.9483\times 1.5392}} \approx \boxed{-0.9792}. \]
(ii) Therefore the correlation is \(\boxed{\text{strong}}\) and \(\boxed{\text{negative}}\).
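In the exam these values come straight from a calculator, but the sums can be reproduced by hand or in code. The following is a sketch using only the Python standard library; the variable names `s_xx`, `s_yy` and `s_xy` are chosen here to mirror the notation above.

```python
import math

# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and of cross products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson's product-moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(s_xx, 4), round(s_yy, 4), round(s_xy, 4), round(r, 4))
```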

(c) Regression \(y=ax+b\)

(i) \[ a=\frac{S_{xy}}{S_{xx}}=\frac{-14.8760}{149.9483}=\boxed{-0.0992075}. \]
(ii) \[ b=\bar y-a\bar x=1.9200-(-0.0992075)(12.8167)=\boxed{3.19151}. \]
(iii) \(b\) is the modelled apartment price (in millions of dollars) at distance \(x=0\) km from the city centre.
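The least-squares coefficients can be reproduced in the same way. This is a minimal sketch of the two formulas above, not a replacement for the calculator's regression function; the function name `least_squares_line` is illustrative.

```python
# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]


def least_squares_line(xs, ys):
    """Return (a, b) for the regression line y = a*x + b of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    a = s_xy / s_xx          # gradient: change in price per extra kilometre
    b = y_bar - a * x_bar    # intercept: modelled price at x = 0
    return a, b


a, b = least_squares_line(x, y)
print(round(a, 7), round(b, 5))
```

The negative gradient means the modelled price falls by roughly 0.099 million dollars for each additional kilometre from the city centre.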

(d) Estimate at \(x=19.6\) km

(i) \(y=a(19.6)+b=(-0.0992075)(19.6)+3.19151=-1.94447+3.19151=\boxed{1.247\ \text{million dollars}}\) (≈ \(1.25\) to 2 d.p.).
(ii) Justification: (1) This is interpolation (19.6 km lies within 7.0–20.9 km). (2) The linear fit is very strong (\(|r|\approx 0.98\)).
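Both the estimate and the first justification can be checked with a short sketch. It recomputes \(a\) and \(b\) from the data rather than using the rounded values; the names `predict_price` and `is_interpolation` are hypothetical helpers introduced for illustration.

```python
# Data from the question: distance (km) and price (millions of dollars).
x = [7.0, 8.4, 10.3, 12.5, 17.8, 20.9]
y = [2.61, 2.44, 2.03, 1.81, 1.45, 1.18]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
a = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b = y_bar - a * x_bar


def predict_price(distance_km):
    """Modelled price (millions of dollars) from the regression line y = a*x + b."""
    return a * distance_km + b


def is_interpolation(distance_km):
    """True when the distance lies inside the range of the sampled x values."""
    return min(x) <= distance_km <= max(x)


estimate = predict_price(19.6)
print(round(estimate, 3), is_interpolation(19.6))
```

A distance such as 25 km would fail the range check, so an estimate there would be extrapolation and much less reliable.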

(e)–(h) Two-sample \(t\)-test (equal variances)

Samples (millions of dollars):
A: 1.21, 1.25, 1.31, 1.32, 1.58, 1.95, 2.13 (n=7)
B: 1.51, 1.58, 1.69, 2.61, 2.72, 2.81, 2.95 (n=7)

Sample means: \(\bar x_A=\frac{1.21+\cdots+2.13}{7}=1.5357,\quad \bar x_B=\frac{1.51+\cdots+2.95}{7}=2.2671.\)
Sample variances (with \(n-1\) in denominator): \(s_A^2=0.13533,\ s_B^2=0.41036.\)
Pooled variance: \[ s_p^2=\frac{(7-1)s_A^2+(7-1)s_B^2}{7+7-2} =\frac{6(0.13533)+6(0.41036)}{12} =0.272843,\quad s_p=\sqrt{0.272843}=0.522344. \]
Test statistic (two-tailed): \[ t=\frac{\bar x_A-\bar x_B}{s_p\sqrt{\frac{1}{7}+\frac{1}{7}}} =\frac{1.5357-2.2671}{0.522344\cdot \sqrt{2/7}} =\frac{-0.7314}{0.27920} \approx \boxed{-2.620}, \] with \(\text{df}=7+7-2=12\).
(e) Alternative hypothesis: \(\boxed{\mu_A\ne \mu_B}\).
(f) Two-tailed \(p\)-value for \(|t|=2.620\) with \(12\) d.f. \(\approx \boxed{0.022}\).
(g) Since \(p\approx 0.022<0.05\): reject \(H_0\). There is sufficient evidence that the mean prices differ.
(h) Additional assumption: the price distributions in both locations are (approximately) normal (and we have assumed equal variances).
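The test can be reproduced without a statistics package. The sketch below uses only the Python standard library: it computes the pooled test statistic as above, then approximates the two-tailed \(p\)-value by integrating the Student \(t\) density with Simpson's rule. That numerical integration is a choice made here for illustration; a calculator's built-in two-sample \(t\)-test would normally be used, and the function names are assumptions.

```python
import math
from statistics import mean, variance  # variance() uses the n - 1 denominator

# Apartment prices (millions of dollars) from the question.
prices_a = [1.21, 1.25, 1.31, 1.32, 1.58, 1.95, 2.13]
prices_b = [1.51, 1.58, 1.69, 2.61, 2.72, 2.81, 2.95]


def pooled_t(sample_1, sample_2):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df
    t = (mean(sample_1) - mean(sample_2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


def t_pdf(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=10_000):
    """Two-tailed p-value: 1 minus the area under the density on [-|t|, |t|].

    The area on [0, |t|] is found by Simpson's rule (steps must be even)
    and doubled, using the symmetry of the density about zero.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 1 - 2 * total * h / 3


t_stat, df = pooled_t(prices_a, prices_b)
p_value = two_tailed_p(t_stat, df)
print(round(t_stat, 3), df, round(p_value, 3))
print("reject H0" if p_value < 0.05 else "do not reject H0")
```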
Total Marks: 19