IB Mathematics AI SL Spearman's Rank MAI Study Notes - New Syllabus

LEARNING OBJECTIVE

Spearman’s rank correlation coefficient, r_s

Spearman’s Rank Correlation Coefficient (rₛ)

Spearman’s rank correlation coefficient (denoted as $r_s$) is a non-parametric measure of correlation that assesses how well the relationship between two variables can be described using a monotonic function (i.e., as one variable increases, the other tends either to increase throughout or to decrease throughout).

$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$

$d_i$ = difference between the two ranks of the $i$-th data pair
$n$ = number of data pairs

Interpreting Spearman’s rₛ

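The value of Spearman’s rank correlation coefficient ranges from $-1$ to $+1$. The sign gives the direction of the monotonic relationship and the magnitude gives its strength: values near $+1$ indicate a strong positive (increasing) relationship, values near $-1$ a strong negative (decreasing) relationship, and values near $0$ little or no monotonic relationship.

Note: Also check a scatterplot to confirm the pattern and, where available, a p-value to judge significance.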

Example

The table below shows the rankings of 6 students in two subjects: Math and English.

Student	Math Rank (X)	English Rank (Y)	d = X – Y	d²
A	1	2	-1	1
B	2	1	1	1
C	3	4	-1	1
D	4	3	1	1
E	5	5	0	0
F	6	6	0	0

Answer/Explanation

Solution:

Calculate the sum of squared differences:
$ \sum d^2 = 1 + 1 + 1 + 1 + 0 + 0 = 4 $

Number of data pairs: $ n = 6 $

Spearman’s rank correlation coefficient:
$ r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 4}{6(36 - 1)} = 1 - \frac{24}{210} \approx 1 - 0.1143 = 0.8857 $

Interpretation: Since $ r_s \approx 0.89 $, there is a strong positive correlation between the students’ rankings in Math and English.
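
The calculation above can be reproduced in a few lines of Python. This is a minimal sketch, assuming the ranks contain no ties; the function name `spearman_from_ranks` is illustrative and not part of these notes.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's r_s via 1 - 6*sum(d^2) / (n(n^2 - 1)); assumes no tied ranks."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rank lists must have the same length")
    n = len(rank_x)
    # Sum of squared rank differences, one term per data pair
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks for students A to F, taken from the table above
math_ranks = [1, 2, 3, 4, 5, 6]
english_ranks = [2, 1, 4, 3, 5, 6]

r_s = spearman_from_ranks(math_ranks, english_ranks)
print(round(r_s, 4))  # 0.8857
```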

GDC Example – Spearman’s Rank Correlation Coefficient

Data: The table below shows two variables, X and Y, representing the performance of 7 students in two different subjects.

Student	X (Math Score)	Y (Physics Score)
A	85	82
B	78	75
C	92	88
D	70	74
E	88	90
F	76	70
G	80	78

Answer/Explanation

1. Press STAT → Edit
2. Enter X values into L1, Y values into L2
3. Use a ranking tool or sort to create ranked lists (L3 = rank(L1), L4 = rank(L2)) — on Casio, use the Statistics mode to rank data directly.
4. Calculate d = L3 − L4 and then d² in another list.
5. Use the formula:
$ r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)} $

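On some models (for example TI-Nspire or Casio ClassPad) a built-in Spearman rank option may be available; the exact menu path depends on the model and OS version:
MENU → Statistics → Regression → Spearman Rank
Choose your two lists and press OK.
If no such option exists, run the usual linear regression on the ranked lists L3 and L4: the r it reports for the ranks is equal to $r_s$.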

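Result on GDC:
Ranking both lists gives $ \sum d^2 = 4 $ with $ n = 7 $, so $ r_s = 1 - \frac{6 \times 4}{7(49 - 1)} = 1 - \frac{24}{336} \approx 0.929 $ → strong positive correlation between math and physics scores.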

Interpretation: There is a strong positive monotonic relationship between student performance in math and physics. As math scores increase, physics scores tend to increase as well.
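
For readers without a GDC, the ranking and formula steps can be sketched in Python using the raw scores from the table. The helper `ranks_no_ties` is a hypothetical name, and it assumes every score is distinct, which holds for this table.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


# Scores for students A to G, taken from the table above
math_scores = [85, 78, 92, 70, 88, 76, 80]
physics_scores = [82, 75, 88, 74, 90, 70, 78]

rank_x = ranks_no_ties(math_scores)     # plays the role of list L3
rank_y = ranks_no_ties(physics_scores)  # plays the role of list L4

# Squared rank differences, one per student
d_squared = [(a - b) ** 2 for a, b in zip(rank_x, rank_y)]
n = len(math_scores)
r_s = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

print(rank_x)                # [5, 3, 7, 1, 6, 2, 4]
print(rank_y)                # [5, 3, 6, 2, 7, 1, 4]
print(sum(d_squared), round(r_s, 3))  # 4 0.929
```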

Handling Tied Ranks

Tied Ranks Problem:

When two or more values are equal, they cannot each be given a distinct rank in a meaningful way.

Solution: give every tied value the average of the rank positions that the tied values occupy.

Example

Data: $x = [10, 10, 30]$

Rank positions before tie handling: 1, 2, 3

The two 10s are tied across rank positions 1 and 2. What are the final ranks?

Answer/Explanation

$\text{Average Rank} = \frac{1 + 2}{2} = 1.5 $

Final ranks: $[1.5, 1.5, 3]$
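
The averaging rule can be sketched as a small Python function. The name `average_ranks` is illustrative only; the approach (locate the first sorted position of a value and add half the size of the tie group minus one) is one simple way to do it, not the only one.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first 1-based position occupied by v
        count = ordered.count(v)       # number of values tied with v (including v)
        # Mean of the positions first, first + 1, ..., first + count - 1
        ranks.append(first + (count - 1) / 2)
    return ranks


print(average_ranks([10, 10, 30]))  # [1.5, 1.5, 3.0]
```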

Adjustment in Formula:

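The formula $r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$ is exact only when there are no tied ranks. If many ties occur, especially in large datasets, $r_s$ is instead computed as Pearson’s correlation coefficient applied to the ranks: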

$
r_s = \frac{\text{Cov}(R_X, R_Y)}{\sigma_{R_X} \cdot \sigma_{R_Y}}
$

Where $R_X, R_Y$ are the ranks of the variables.
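
As a rough illustration of this rank-based form, the sketch below applies Pearson's formula to averaged ranks and compares the result with the shortcut formula. The data lists are hypothetical (the tied example above extended by one value), and the helper names are assumptions made for this sketch.

```python
from math import sqrt


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def pearson(a, b):
    """Pearson's r; the 1/n factors in Cov and the sigmas cancel, so sums suffice."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical data with one tie in x
x = [10, 10, 30, 40]
y = [1, 2, 3, 4]

rank_x, rank_y = average_ranks(x), average_ranks(y)
r_s_exact = pearson(rank_x, rank_y)  # Cov(R_X, R_Y) / (sigma_RX * sigma_RY)

n = len(x)
sum_d2 = sum((p - q) ** 2 for p, q in zip(rank_x, rank_y))
r_s_shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(round(r_s_exact, 4), round(r_s_shortcut, 4))  # 0.9487 0.95
```

With only one tie the two values differ slightly; the gap tends to grow as ties become more frequent.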

Comparison with Pearson’s Correlation Coefficient

Feature	Pearson’s r	Spearman’s r_s
Type of Relationship	Linear	Monotonic (not necessarily linear)
Data Type	Interval/Ratio	Ordinal, Interval, Ratio
Outlier Sensitivity	High	Low
Assumes Normality	Yes	No
Rank-based	No	Yes

Example:

If $x = [1, 2, 3, 4, 100]$ and $y = [1, 2, 3, 4, 5]$, Pearson’s r is pulled down to about 0.725 by the extreme value (100), but Spearman’s rₛ still equals 1 because both lists are in the same rank order.
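
The effect of the extreme value can be checked numerically. The following is a minimal sketch using only the standard library; `pearson` and `ranks_no_ties` are illustrative helper names, and the ranking helper assumes no repeated values.

```python
from math import sqrt


def pearson(a, b):
    """Pearson's r computed from sums of deviations about the means."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


x = [1, 2, 3, 4, 100]
y = [1, 2, 3, 4, 5]

r = pearson(x, y)                                   # uses the raw values
r_s = pearson(ranks_no_ties(x), ranks_no_ties(y))   # uses only the ordering

print(round(r, 3), round(r_s, 3))  # 0.725 1.0
```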

Example

Data: The following data shows the time spent studying (in hours) and test scores of 6 students.

Student	Hours Studied (X)	Test Score (Y)
A	1	52
B	2	60
C	3	70
D	4	65
E	5	80
F	6	85

Answer/Explanation

Pearson’s Correlation Coefficient (r)

Using GDC or spreadsheet software:
Pearson’s r ≈ 0.952
This indicates a strong positive linear correlation.
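
Pearson’s r for this table can also be worked by hand. The sketch below uses the standard sums-of-squares form; the symbols $S_{xy}$, $S_{xx}$ and $S_{yy}$ are notation introduced here for convenience and do not appear elsewhere in these notes.

$ \sum x = 21, \quad \sum y = 412, \quad \sum xy = 1552, \quad \sum x^2 = 91, \quad \sum y^2 = 29054 $
$ S_{xy} = 1552 - \frac{21 \times 412}{6} = 110, \quad S_{xx} = 91 - \frac{21^2}{6} = 17.5, \quad S_{yy} = 29054 - \frac{412^2}{6} \approx 763.33 $
$ r = \frac{S_{xy}}{\sqrt{S_{xx} \, S_{yy}}} = \frac{110}{\sqrt{17.5 \times 763.33}} \approx 0.952 $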

Spearman’s Rank Correlation Coefficient (rₛ)

Convert data to ranks:

Hours Studied: already in order → ranks 1 to 6
Test Scores: ranks → 1, 2, 4, 3, 5, 6

Calculate $ d_i $ and $ d_i^2 $ for each student ($ d_i = 0, 0, -1, 1, 0, 0 $), then apply:
$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(0^2 + 0^2 + (-1)^2 + 1^2 + 0^2 + 0^2)}{6(36 - 1)} = 1 - \frac{12}{210} \approx 0.943 $
Spearman’s rₛ ≈ 0.943

Conclusion:
Pearson’s r is slightly higher because the data are close to linear.
Spearman’s rₛ falls below 1 only because Students C and D swap rank order in test score (C scored 70, D scored 65).
Both suggest a strong positive relationship, but Pearson measures linearity and Spearman measures monotonicity.

Outlier

An outlier is a data point that lies far from the other values in a dataset. More generally, an outlier is an individual that is markedly different from the rest of the group in some respect.

Causes:

Measurement error
Natural variation
Data entry mistakes

Example:

Dataset: $5, 6, 7, 8, 100$ → 100 is an outlier

Effect of Outliers

Method	Effect of Outliers
Pearson’s r	Strongly affected
Spearman’s rₛ	Less affected (uses ranks)

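A single outlier can substantially raise or lower Pearson’s r, because extreme raw values dominate the calculation and pull the regression line toward themselves.
Spearman’s rₛ uses ranks, so it ignores how far apart the raw values are and considers only their ordering, which makes it more robust.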

Appropriateness and Limitations

Use Spearman’s rₛ when:	Limitations:
Data is ordinal	Cannot detect non-monotonic relationships
Relationship is monotonic but non-linear	Less sensitive to fine structure than Pearson’s r
Dataset includes outliers	Rank transformation can lose information
Sample size is small	

Choosing the Right Correlation Method

Situation	Use
Linear relationship, no outliers	Pearson’s r
Monotonic, non-linear	Spearman’s rₛ
Ordinal data	Spearman’s rₛ
Outliers present	Spearman’s rₛ preferred
Non-monotonic	Neither (consider other methods)
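
To illustrate the last row of the table, the sketch below uses hypothetical data where $y = x^2$ on values of $x$ symmetric about zero. The relationship is exact but not monotonic, and both coefficients come out as zero. Helper names are assumptions made for this sketch.

```python
from math import sqrt


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def pearson(a, b):
    """Pearson's r computed from sums of deviations about the means."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical U-shaped data: y falls, then rises, as x increases
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]  # [4, 1, 0, 1, 4], which contains ties

r = pearson(x, y)
# With ties present, Spearman is computed as Pearson on averaged ranks
r_s = pearson(average_ranks(x), average_ranks(y))

print(abs(round(r, 3)), abs(round(r_s, 3)))  # both are 0 for this symmetric data
```

A scatterplot would reveal the U-shape immediately, which is why a plot is worth checking before trusting either coefficient.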
