IB Mathematics SL 4.10 Spearman’s rank correlation coefficient AI SL Paper 1 - Exam Style Questions - New Syllabus

Question

At a running club, Aarav conducts a study to find if there is any association between an athlete’s age and their best time taken to run \(100\ \text{m}\). Eight athletes are selected at random, and their details are listed below.

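| Variable | A | B | C | D | E | F | G | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age (years) | 13 | 17 | 22 | 18 | 19 | 25 | 11 | 36 |
| Time (seconds) | 13.4 | 14.6 | 13.4 | 12.9 | 12.0 | 11.8 | 17.0 | 13.1 |

Aarav decides to calculate the Spearman’s rank correlation coefficient for his set of data.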
(a) Complete the table of ranks. [2]

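| Rank | A | B | C | D | E | F | G | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age rank |  |  | 3 |  |  |  |  |  |
| Time rank |  |  |  |  |  |  | 1 |  |

(b) Calculate the Spearman’s rank correlation coefficient, \(r_s\). [2]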
(c) Interpret this value of \(r_s\) in the context of the question. [1]
(d) Suggest a mathematical reason why Aarav may have decided not to use Pearson’s product–moment correlation coefficient with his data from the original table. [1]
Answer/Explanation
Markscheme

(a)

Use descending ranks (largest value \(=\) rank \(1\)); average any ties. The completed ranks are:

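| Rank | A | B | C | D | E | F | G | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age rank | 7 | 6 | 3 | 5 | 4 | 2 | 8 | 1 |
| Time rank | 3.5 | 2 | 3.5 | 6 | 7 | 8 | 1 | 5 |

A1 A1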
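
The ranking rule can be checked with a few lines of Python. This is a minimal sketch, not part of the markscheme; the helper name `average_ranks_desc` is hypothetical and chosen only for illustration.

```python
def average_ranks_desc(values):
    """Rank values so the largest gets rank 1; tied values share the mean rank."""
    # Indices of the values, sorted from largest to smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` over the run of values tied with the one at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos+1 .. end+1 are shared, so each tied value gets their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Data from the question, athletes A to H.
ages = [13, 17, 22, 18, 19, 25, 11, 36]
times = [13.4, 14.6, 13.4, 12.9, 12.0, 11.8, 17.0, 13.1]

age_ranks = average_ranks_desc(ages)
time_ranks = average_ranks_desc(times)
print(age_ranks)   # [7.0, 6.0, 3.0, 5.0, 4.0, 2.0, 8.0, 1.0]
print(time_ranks)  # [3.5, 2.0, 3.5, 6.0, 7.0, 8.0, 1.0, 5.0]
```

Athletes A and C both ran \(13.4\ \text{s}\), so they share ranks 3 and 4 and each receives \(3.5\).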

(b)

Compute \(r_s\) as the Pearson correlation of the two rank lists (ties averaged). With \(n=8\), \(\bar R_x=\bar R_y=\dfrac{1+2+\cdots+8}{8}=4.5\).
Using the completed ranks: \[ \sum (R_x-\bar R_x)^2=42,\quad \sum (R_y-\bar R_y)^2=41.5,\quad \sum (R_x-\bar R_x)(R_y-\bar R_y)=-28. \] Hence \[ r_s \;=\; \frac{\sum (R_x-\bar R_x)(R_y-\bar R_y)}{\sqrt{\sum (R_x-\bar R_x)^2\,\sum (R_y-\bar R_y)^2}} \;=\; \frac{-28}{\sqrt{42\times 41.5}} \;=\; -0.670670\ldots \approx \boxed{-0.671}. \] A2

GDC method: enter the two rank lists (with ties averaged) and use the correlation function to obtain \(r_s\approx -0.671\).
Note: The shortcut \(1-\dfrac{6\sum d_i^2}{n(n^2-1)}\) is exact only when there are no ties; here there is a tie at \(13.4\ \text{s}\), so the ranked-Pearson (or technology) method is appropriate.
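
As a cross-check on the arithmetic, the sketch below computes the correlation of the two rank lists in plain Python and, for comparison, the value the shortcut would give if the tie were ignored. The function name `pearson` is an illustrative assumption, not a GDC or library function.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Completed ranks from part (a), athletes A to H.
age_ranks = [7, 6, 3, 5, 4, 2, 8, 1]
time_ranks = [3.5, 2, 3.5, 6, 7, 8, 1, 5]

# Spearman's coefficient as the Pearson correlation of the ranks.
r_s = pearson(age_ranks, time_ranks)
print(round(r_s, 3))  # -0.671

# Shortcut formula, which ignores the effect of the tie.
n = len(age_ranks)
sum_d2 = sum((a - t) ** 2 for a, t in zip(age_ranks, time_ranks))
shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(shortcut, 3))  # 139.5 -0.661
```

The two values differ in the second decimal place, which is why the tie matters for this data set.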

(c)

There is a negative correlation between age and best \(100\ \text{m}\) time in this sample: older athletes tend to have lower (faster) best times. R1

(d)

A valid reason: the relationship may not be linear / the data need not be bivariate normal / Pearson’s \(r\) is sensitive to outliers. R1
Total Marks: 6

Question

The decathlon is a contest where athletes compete in ten events. Two of those events are long jump and high jump. In both events, a greater distance implies a better ranking.
The table lists results for these two events at the World Championships.
| Athlete’s Country | Long Jump (m) | High Jump (m) | Long Jump Rank | High Jump Rank |
| --- | --- | --- | --- | --- |
| Germany | 7.64 | 2.11 | 1 |  |
| France | 7.52 | 2.08 | 2 |  |
| Estonia | 7.49 | 1.84 | 3 |  |
| Canada | 7.44 | 2.02 | 4 |  |
| Netherlands | 7.33 | 2.05 | 5 |  |
| Ukraine | 7.28 | 2.02 | 6 |  |
| Algeria | 7.22 | 1.90 | 7 |  |
| Austria | 7.11 | 1.87 | 8 |  |
| Grenada | 6.98 | 1.99 | 9 |  |
| Japan | 6.64 | 1.96 | 10 |  |

The Spearman’s rank correlation coefficient is used to assess whether there is a linear correlation between an athlete’s ranking in long jump and their ranking in high jump.
(a) Complete the table to show the athletes’ rankings in high jump. [2]
(b) Find the value of the Spearman’s rank correlation coefficient \( r_s \). [2]
The following guide is used by the trainer to determine the strength of the correlation between the ranks for long jump and high jump.
| \(|r_s|\) | Strength |
| --- | --- |
| 0.000 to 0.199 | Very weak |
| 0.200 to 0.399 | Weak |
| 0.400 to 0.599 | Moderate |
| 0.600 to 0.799 | Strong |
| 0.800 to 1.000 | Very strong |

(c) State the strength of the correlation between the rankings as indicated by the table and interpret this in context. [2]
Answer/Explanation
Markscheme

(a)

Rank the high-jump heights from largest (rank 1) to smallest. Ties share the average rank. The completed table is:

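| Country | Long Jump (m) | High Jump (m) | Long Jump Rank | High Jump Rank |
| --- | --- | --- | --- | --- |
| Germany | 7.64 | 2.11 | 1 | 1 |
| France | 7.52 | 2.08 | 2 | 2 |
| Estonia | 7.49 | 1.84 | 3 | 10 |
| Canada | 7.44 | 2.02 | 4 | 4.5 |
| Netherlands | 7.33 | 2.05 | 5 | 3 |
| Ukraine | 7.28 | 2.02 | 6 | 4.5 |
| Algeria | 7.22 | 1.90 | 7 | 8 |
| Austria | 7.11 | 1.87 | 8 | 9 |
| Grenada | 6.98 | 1.99 | 9 | 6 |
| Japan | 6.64 | 1.96 | 10 | 7 |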

A1 A1

[2 marks]

(b)

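Let \(n=10\) and \(d_i=(\text{LJ rank})-(\text{HJ rank})\). Spearman’s shortcut formula is \[ r_s = 1 - \frac{6\sum d_i^2}{n(n^2-1)}. \] The differences and their squares (in LJ order) are:
Germany \(0,0\); France \(0,0\); Estonia \(-7,49\); Canada \(-0.5,0.25\); Netherlands \(2,4\); Ukraine \(1.5,2.25\); Algeria \(-1,1\); Austria \(-1,1\); Grenada \(3,9\); Japan \(3,9\).
Hence \(\sum d_i^2=75.5\), and the shortcut gives \[ 1-\frac{6(75.5)}{10(10^2-1)} = 1-\frac{453}{990}\approx 0.542. \] Because Canada and Ukraine tie at \(2.02\ \text{m}\), the shortcut is only approximate here. The Pearson correlation of the two rank lists (the GDC method), with \(\bar R_x=\bar R_y=5.5\), gives \[ r_s=\frac{44.5}{\sqrt{82.5\times 82}} = 0.54103\ldots \] So \(r_s\approx \mathbf{0.541}\) (3 s.f.). A2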

[2 marks]
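
The sum of squared differences can be checked with a short Python sketch. It also computes the correlation of the two rank lists directly, which differs slightly from the shortcut value because of the tie at \(2.02\ \text{m}\). The variable names are illustrative assumptions.

```python
from math import sqrt

# Ranks from the completed table, in long-jump order (Germany ... Japan).
lj_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
hj_ranks = [1, 2, 10, 4.5, 3, 4.5, 8, 9, 6, 7]
n = len(lj_ranks)

# Shortcut formula: 1 - 6 * sum(d^2) / (n (n^2 - 1)).
sum_d2 = sum((x - y) ** 2 for x, y in zip(lj_ranks, hj_ranks))
shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(shortcut, 3))  # 75.5 0.542

# Pearson correlation of the ranks; both lists have mean (n + 1) / 2.
mean_rank = (n + 1) / 2
sxy = sum((x - mean_rank) * (y - mean_rank) for x, y in zip(lj_ranks, hj_ranks))
sxx = sum((x - mean_rank) ** 2 for x in lj_ranks)
syy = sum((y - mean_rank) ** 2 for y in hj_ranks)
rank_pearson = sxy / sqrt(sxx * syy)
print(sxy, sxx, syy, round(rank_pearson, 3))  # 44.5 82.5 82.0 0.541
```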

(c)

Since \(|r_s|=0.541\) lies in \(0.400\)–\(0.599\), the correlation is moderate.
Interpretation: athletes who place well in long jump tend to place fairly well in high jump (and vice versa), though the relationship is not extremely strong. A1 A1

[2 marks]
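
The trainer's guide amounts to a simple lookup on \(|r_s|\). A minimal sketch, assuming the value is first rounded to three decimal places as in the guide; the function name `strength` is hypothetical.

```python
def strength(r_s):
    """Return the label from the trainer's guide for a given r_s."""
    magnitude = round(abs(r_s), 3)
    # Upper bound of each band in the guide, paired with its label.
    bands = [
        (0.199, "Very weak"),
        (0.399, "Weak"),
        (0.599, "Moderate"),
        (0.799, "Strong"),
        (1.000, "Very strong"),
    ]
    for upper, label in bands:
        if magnitude <= upper:
            return label
    raise ValueError("|r_s| cannot exceed 1")


print(strength(0.541))  # Moderate
```

The sign of \(r_s\) is discarded for the strength label and used only for the direction of the association.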

Total Marks: 6