IB Mathematics SL 4.2 Presentation of data AI SL Paper 2 - Exam Style Questions - New Syllabus

Question

As part of their mathematics exploration about classic books, Emma investigated the time taken by students in their school to read the book The Old Man and the Sea. They collected their data by stopping and asking students in the school corridor, until they reached their target of 10 students from each of the literature classes in their school.
(a) State which of the two sampling methods, systematic or quota, Emma has used. [1]
Emma constructed the following box and whisker diagram to show the number of hours students in the sample took to read this book.
(b) Write down the median time to read the book. [1]
(c) Determine the interquartile range. [2]
Liam, a member of the sample, took 25 hours to read the novel. Emma believes Liam’s time is not an outlier.
(d) Determine whether Emma is correct. Support your reasoning. [4]
For each student interviewed, Emma recorded the time taken to read the book (\(x\), in hours) and paired this with their percentage score on the final exam (\(y\)). These data are represented on the scatter diagram below.
(e) Describe the correlation. [1]
Emma correctly calculates the regression line of \(y\) on \(x\) for these students to be \(y = -1.54x + 98.8\). They use this equation to estimate the exam percentage for a student who read the book in \(1.5\) hours.
(f) Obtain the percentage score calculated by Emma. [2]
(g) State whether it is valid to use the regression line for Emma’s estimate. Give a reason. [2]
Emma found a website that rated the “top 50” classic books. They randomly chose eight of these and recorded the number of pages. Example: Book H is rated 44th and has 281 pages. These data are shown in the table.
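| Book | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Number of pages (\(n\)) | 4215 | 863 | 585 | 1225 | 366 | 209 | 624 | 281 |
| Top 50 rating (\(t\)) | 1 | 2 | 5 | 7 | 13 | 22 | 40 | 44 |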
(h) Copy and complete the information in the following table. [2]
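| Book | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Rank – Number of pages | | | | | | | | |
| Rank – Top 50 Rating | | | | | | | | |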
(i) (i) Determine the value of Spearman’s rank correlation coefficient \(r_s\).
     (ii) Interpret your result. [3]
Markscheme

(a) Sampling method: Quota sampling. A1

(b) Median from boxplot: \(\boxed{10\text{ hours}}\). A1

(c) From the diagram \(Q_1=7,\;Q_3=15\).
\(\text{IQR}=Q_3-Q_1=15-7=\boxed{8\text{ hours}}\). M1 A1

(d) Upper fence \(=Q_3+1.5\times\text{IQR}=15+1.5(8)=27\).
Liam’s time is \(25\text{ h}\) and \(25<27\), so it is not an outlier. Emma is correct. M1 A1 R1 A1
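The fence check in (d) can be sketched in a few lines of Python. This is only an illustration of the \(1.5\times\text{IQR}\) rule using the quartiles quoted in the markscheme; the names `is_outlier`, `lower_fence` and `upper_fence` are chosen here for the example.

```python
# Quartiles as read from the box-and-whisker diagram (values from the markscheme).
q1, q3 = 7, 15

iqr = q3 - q1                      # interquartile range
lower_fence = q1 - 1.5 * iqr       # values below this count as outliers
upper_fence = q3 + 1.5 * iqr       # values above this count as outliers

def is_outlier(value):
    """Return True if value lies outside the 1.5 x IQR fences."""
    return value < lower_fence or value > upper_fence

liam_hours = 25
print(iqr, upper_fence, is_outlier(liam_hours))  # 8 27.0 False
```

Since \(25\) lies below the upper fence of \(27\), the check returns `False`, which agrees with Emma's claim.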

(e) Negative correlation. A1

(f) Substitute \(x=1.5\) into \(y=-1.54x+98.8\):
\(y=-1.54(1.5)+98.8=-2.31+98.8=\boxed{96.49\%\ (\approx 96.5\%)}\). M1 A1
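As a quick check of the substitution, the regression line can be written as a small Python function. This is a sketch only; the function name `predict_score` and its parameter names are hypothetical and not part of the question.

```python
def predict_score(hours, gradient=-1.54, intercept=98.8):
    """Estimate the exam percentage from reading time using y = -1.54x + 98.8."""
    return gradient * hours + intercept

estimate = predict_score(1.5)
print(round(estimate, 2))  # 96.49
```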

(g) Not valid; \(1.5\) hours is outside the given data range (extrapolation). A1 R1

(h) Ranks (as per exam convention): For “Number of pages”, rank \(1\) = largest number of pages; for “Top 50 rating”, rank \(1\) = best (smallest \(t\)).

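| Book | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Rank – Number of pages | 1 | 3 | 5 | 2 | 6 | 8 | 4 | 7 |
| Rank – Top 50 Rating | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |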
A1 A1
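The ranking in (h) can be reproduced with a short Python sketch. It takes the page counts and ratings as read from the table in the question, and assumes there are no tied values (true for these data). The helper name `rank` and its `largest_first` flag are illustrative choices.

```python
# Data as read from the question table.
pages = {"A": 4215, "B": 863, "C": 585, "D": 1225,
         "E": 366, "F": 209, "G": 624, "H": 281}
ratings = {"A": 1, "B": 2, "C": 5, "D": 7,
           "E": 13, "F": 22, "G": 40, "H": 44}

def rank(values, largest_first):
    """Map each key to its 1-based rank; assumes no tied values."""
    ordered = sorted(values, key=values.get, reverse=largest_first)
    return {key: position for position, key in enumerate(ordered, start=1)}

page_ranks = rank(pages, largest_first=True)        # rank 1 = most pages
rating_ranks = rank(ratings, largest_first=False)   # rank 1 = best rating (smallest t)

print([page_ranks[b] for b in "ABCDEFGH"])    # [1, 3, 5, 2, 6, 8, 4, 7]
print([rating_ranks[b] for b in "ABCDEFGH"])  # [1, 2, 3, 4, 5, 6, 7, 8]
```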

(i) Spearman’s rank correlation coefficient. Let \(d=\text{(rank pages)}-\text{(rank rating)}\) and \(n=8\).

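| Book | Rank (pages) | Rank (rating) | \(d\) | \(d^2\) |
|---|---|---|---|---|
| A | 1 | 1 | 0 | 0 |
| B | 3 | 2 | 1 | 1 |
| C | 5 | 3 | 2 | 4 |
| D | 2 | 4 | -2 | 4 |
| E | 6 | 5 | 1 | 1 |
| F | 8 | 6 | 2 | 4 |
| G | 4 | 7 | -3 | 9 |
| H | 7 | 8 | -1 | 1 |
| \(\sum d^2\) | | | | 24 |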
Use \(r_s = 1 - \dfrac{6\sum d^2}{n(n^2-1)}\) with \(n=8\):
\[ r_s \;=\; 1 - \frac{6(24)}{8(8^2-1)} \;=\; 1 - \frac{144}{8\cdot63} \;=\; 1 - \frac{144}{504} \;=\; 1 - \frac{2}{7} \;=\; \boxed{\frac{5}{7}\ \approx\ 0.714}. \] M1 A1
Interpretation: \(r_s\approx 0.714\) indicates a strong positive association between the two sets of ranks: books with more pages (a lower page-rank number) tend to have a better Top 50 rating (a lower \(t\)). R1
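The value of \(r_s\) can be checked with a minimal Python sketch of the formula \(r_s = 1 - \dfrac{6\sum d^2}{n(n^2-1)}\). It uses the ranks from the markscheme and assumes no tied ranks, which is the case the formula is meant for. The function name `spearman` is chosen for this example.

```python
# Ranks for books A to H, taken from the markscheme for part (h).
page_ranks = [1, 3, 5, 2, 6, 8, 4, 7]
rating_ranks = [1, 2, 3, 4, 5, 6, 7, 8]

def spearman(ranks_a, ranks_b):
    """Spearman's rank correlation from two lists of untied ranks."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

r_s = spearman(page_ranks, rating_ranks)
print(round(r_s, 3))  # 0.714
```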
Total Marks: 18