
IB Mathematics AI SL Spearman's Rank MAI Study Notes - New Syllabus


LEARNING OBJECTIVE

  • Spearman’s rank correlation coefficient, rs

Key Concepts: 

  • Spearman’s rank correlation coefficient, rs



Spearman’s Rank Correlation Coefficient (rₛ)

Spearman’s rank correlation coefficient (denoted as $r_s$) is a non-parametric measure of correlation. It assesses how well the relationship between two variables can be described by a monotonic function, i.e. one in which, as one variable increases, the other consistently increases or consistently decreases (not necessarily at a constant rate).

$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$

$d_i$ = difference between the two ranks of the $i$-th data pair
$n$ = number of data pairs

Interpreting Spearman’s rₛ

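The value of Spearman’s rank correlation coefficient ranges from $-1$ to $+1$. The sign gives the direction of the monotonic relationship and the magnitude gives its strength: $r_s = +1$ means the two rankings agree exactly, $r_s = -1$ means they are exactly reversed, and values near $0$ indicate little or no monotonic relationship.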

 

Note: $r_s$ alone does not show whether a correlation is significant. Also look at a scatter plot and, where available, the p-value.

Example

The table below shows the rankings of 6 students in two subjects: Math and English.
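| Student | Math Rank (X) | English Rank (Y) | d = X − Y | d² |
| --- | --- | --- | --- | --- |
| A | 1 | 2 | −1 | 1 |
| B | 2 | 1 | 1 | 1 |
| C | 3 | 4 | −1 | 1 |
| D | 4 | 3 | 1 | 1 |
| E | 5 | 5 | 0 | 0 |
| F | 6 | 6 | 0 | 0 |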
▶️ Answer/Explanation
Solution:

Calculate the sum of squared differences:
\( \sum d^2 = 1 + 1 + 1 + 1 + 0 + 0 = 4 \)

Number of data pairs: \( n = 6 \)

Spearman’s rank correlation coefficient:
\( r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 4}{6(36 - 1)} = 1 - \frac{24}{210} = 1 - 0.1143 = 0.8857 \)

Interpretation: Since \( r_s \approx 0.89 \), there is a strong positive correlation between the students’ rankings in Math and English.
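
As a quick check of the arithmetic above, the formula can be sketched in a few lines of Python. The function name `spearman_from_ranks` is an illustrative choice, not part of any library, and the sketch assumes the ranks contain no ties.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's r_s via 1 - 6*sum(d^2) / (n(n^2 - 1)); assumes no tied ranks."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rank lists must have the same length")
    n = len(rank_x)
    # d_i is the difference between the two ranks of the i-th pair
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Ranks taken from the table above (students A to F)
math_rank = [1, 2, 3, 4, 5, 6]
english_rank = [2, 1, 4, 3, 5, 6]

r_s = spearman_from_ranks(math_rank, english_rank)
print(round(r_s, 4))  # 0.8857
```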
GDC Example – Spearman’s Rank Correlation Coefficient

Data: The table below shows two variables, X and Y, representing the performance of 7 students in two different subjects.
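| Student | X (Math Score) | Y (Physics Score) |
| --- | --- | --- |
| A | 85 | 82 |
| B | 78 | 75 |
| C | 92 | 88 |
| D | 70 | 74 |
| E | 88 | 90 |
| F | 76 | 70 |
| G | 80 | 78 |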
▶️ Answer/Explanation
1. Press STAT → Edit
2. Enter X values into L1, Y values into L2
3. Create ranked lists, for example by sorting each list and numbering the values in order, so that L3 holds the ranks of L1 and L4 holds the ranks of L2. On Casio, the Statistics mode can be used to rank the data.
4. Calculate d = L3 − L4 and then d² in another list.
5. Use the formula:
    \( r_s = 1 – \dfrac{6 \sum d^2}{n(n^2 – 1)} \)

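On TI-Nspire or Casio ClassPad, there is a built-in Spearman rank correlation feature (menu names may differ by model and software version):
MENU → Statistics → Regression → Spearman Rank
Choose your two lists and press OK.
If no such option is available, calculate the ordinary correlation coefficient r on the two rank lists (L3 and L4); when there are no ties this gives the same value as the formula.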

Result on GDC:
\( \sum d^2 = 4 \), so \( r_s = 1 - \dfrac{6 \times 4}{7(49 - 1)} = 1 - \dfrac{24}{336} \approx 0.929 \) → strong positive correlation between math and physics scores.
Interpretation: There is a strong positive monotonic relationship between student performance in math and physics. As math scores increase, physics scores tend to increase as well.
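
Steps 3 to 5 can also be reproduced away from the calculator as a cross-check. The following is a minimal Python sketch; the helper name `ranks_no_ties` is hypothetical, and it assumes every score within a list is distinct, which holds for this data.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


math_scores = [85, 78, 92, 70, 88, 76, 80]     # students A to G
physics_scores = [82, 75, 88, 74, 90, 70, 78]

rank_x = ranks_no_ties(math_scores)      # [5, 3, 7, 1, 6, 2, 4]
rank_y = ranks_no_ties(physics_scores)   # [5, 3, 6, 2, 7, 1, 4]

# d^2 for each student, then the formula from step 5
d_squared = [(a - b) ** 2 for a, b in zip(rank_x, rank_y)]
n = len(math_scores)
r_s = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

print(sum(d_squared), round(r_s, 3))
```

Ranking from the largest value instead of the smallest gives the same $r_s$, provided both lists are ranked in the same direction.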


Handling Tied Ranks

Tied Ranks Problem:

When two or more values are the same, assigning a unique rank to each becomes problematic.

Assign the average rank to all tied values.

Example

Data: \(x = [10, 10, 30]\)

Ranks before tie handling: 1, 2, 3

The two 10s are tied at rank positions 1 and 2. What are the final ranks?

▶️ Answer/Explanation

$\text{Average Rank} = \frac{1 + 2}{2} = 1.5 $

Final ranks: \([1.5, 1.5, 3]\)
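
The averaging rule can be sketched in Python as follows. The function name `average_ranks` is an illustrative choice; it ranks from 1 for the smallest value and gives tied values the mean of the positions they occupy.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first position occupied by v
        count = ordered.count(v)       # number of values tied with v (including v)
        # mean of the positions first, first + 1, ..., first + count - 1
        ranks.append(first + (count - 1) / 2)
    return ranks


print(average_ranks([10, 10, 30]))  # [1.5, 1.5, 3.0]
```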

Adjustment in Formula:

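The formula $r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$ is exact only when there are no tied ranks. If ties occur, especially many of them in a large dataset, it is more accurate to calculate $r_s$ as Pearson’s correlation coefficient of the ranks: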

$
r_s = \frac{\text{Cov}(R_X, R_Y)}{\sigma_{R_X} \cdot \sigma_{R_Y}}
$

Where $R_X, R_Y$ are the ranks of the variables.
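
To illustrate the difference, the sketch below computes Pearson's correlation of the average ranks and compares it with the shortcut formula. The data `x` and `y` are hypothetical, chosen only to contain one tie, and the function names are illustrative rather than taken from any library.

```python
from math import sqrt


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def pearson(x, y):
    """Pearson's r = Cov(x, y) / (sigma_x * sigma_y); the 1/n factors cancel."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's r_s as Pearson's r applied to the (average) ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: the two 10s in x are tied
x = [10, 10, 30, 40]
y = [1, 2, 3, 4]

rx, ry = average_ranks(x), average_ranks(y)
n = len(x)
shortcut = 1 - 6 * sum((a - b) ** 2 for a, b in zip(rx, ry)) / (n * (n ** 2 - 1))

print(round(spearman(x, y), 3))  # 0.949
print(round(shortcut, 3))        # 0.95
```

With no ties the two calculations agree; with ties the shortcut is only an approximation.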

Comparison with Pearson’s Correlation Coefficient

| Feature | Pearson’s r | Spearman’s rₛ |
| --- | --- | --- |
| Type of Relationship | Linear | Monotonic (not necessarily linear) |
| Data Type | Interval/Ratio | Ordinal, Interval, Ratio |
| Outlier Sensitivity | High | Low |
| Assumes Normality | Yes | No |
| Rank-based | No | Yes |

Example:

If $x = [1, 2, 3, 4, 100]$ and $y = [1, 2, 3, 4, 5]$, Pearson’s r is distorted by the extreme value (100), but Spearman’s rₛ still captures the monotonic trend: the ranks of $x$ and $y$ are identical, so $r_s = 1$.
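
A short Python sketch makes the contrast concrete for this data. The helper names are illustrative, and the Spearman calculation uses the no-ties formula, which is valid here because all values are distinct.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r computed from sums of deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def spearman_no_ties(x, y):
    """Spearman's r_s via 1 - 6*sum(d^2) / (n(n^2 - 1)); assumes no ties."""
    rank_x = [sorted(x).index(v) + 1 for v in x]
    rank_y = [sorted(y).index(v) + 1 for v in y]
    n = len(x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 100]
y = [1, 2, 3, 4, 5]

print(round(pearson(x, y), 3))   # 0.725
print(spearman_no_ties(x, y))    # 1.0
```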

Example

Data: The following data shows the time spent studying (in hours) and test scores of 6 students.
| Student | Hours Studied (X) | Test Score (Y) |
| --- | --- | --- |
| A | 1 | 52 |
| B | 2 | 60 |
| C | 3 | 70 |
| D | 4 | 65 |
| E | 5 | 80 |
| F | 6 | 85 |
▶️ Answer/Explanation
 Pearson’s Correlation Coefficient (r)

Using GDC or spreadsheet software:
Pearson’s r ≈ 0.952
This indicates a strong positive linear correlation.

Spearman’s Rank Correlation Coefficient (rₛ)

Convert data to ranks:
  • Hours Studied: already in order → ranks 1 to 6
  • Test Scores: ranks → 1, 2, 4, 3, 5, 6
Calculate \( d_i \), \( d_i^2 \), then apply:
$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(0^2 + 0^2 + 1^2 + 1^2 + 0^2 + 0^2)}{6(36 - 1)} = 1 - \frac{12}{210} = 0.943 $
Spearman’s rₛ ≈ 0.943

Conclusion:
 Pearson’s r is slightly higher because the points lie close to a straight line.
 Spearman’s rₛ is slightly lower because Student D’s score (65) is below Student C’s (70), so the test-score ranks of C and D are swapped relative to their hours-studied ranks.
 Both suggest a strong positive relationship, but Pearson measures linearity and Spearman measures monotonicity.


Outlier

An outlier is a data point that lies far from the rest of the values in a data set. In a more general context, an outlier is an individual that is markedly different from the norm in some respect.

Causes:

  • Measurement error
  • Natural variation
  • Data entry mistakes

Example:

Dataset: $5, 6, 7, 8, 100$ → 100 is an outlier

Effect of Outliers

| Method | Effect of Outliers |
| --- | --- |
| Pearson’s r | Strongly affected |
| Spearman’s rₛ | Less affected (uses ranks) |

A single outlier can substantially change the value of Pearson’s r, because it pulls the line of best fit toward itself.
Spearman’s rₛ, using ranks, ignores raw values and only considers ordering, making it more robust.
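
The point about ordering can be seen directly by ranking a list with and without the extreme value. This is a minimal Python sketch; `ranks_no_ties` is a hypothetical helper, and the second list is an assumed "no outlier" version in which 100 is replaced by 9.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


with_outlier = [5, 6, 7, 8, 100]
without_outlier = [5, 6, 7, 8, 9]   # assumed comparison data

print(ranks_no_ties(with_outlier))     # [1, 2, 3, 4, 5]
print(ranks_no_ties(without_outlier))  # [1, 2, 3, 4, 5]
```

Because the ranks are identical, any rank-based calculation gives the same result for both lists, however extreme the largest value is.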

Appropriateness and Limitations

Use Spearman’s rₛ when:
  • Data is ordinal
  • Relationship is monotonic but non-linear
  • Dataset includes outliers
  • Sample size is small

Limitations:
  • Cannot detect non-monotonic relationships
  • Less sensitive to fine structure than Pearson’s r
  • Rank transformation can lose information

Choosing the Right Correlation Method

| Situation | Use |
| --- | --- |
| Linear relationship, no outliers | Pearson’s r |
| Monotonic, non-linear | Spearman’s rₛ |
| Ordinal data | Spearman’s rₛ |
| Outliers present | Spearman’s rₛ preferred |
| Non-monotonic | Neither (consider other methods) |