IBDP Maths SL 4.4 Linear correlation of bivariate data AA HL Paper 1- Exam Style Questions- New Syllabus
Observations on 12 pairs of values of the random variables X, Y yielded the following results:
Σx = 76.3, Σx² = 563.7, Σy = 72.2, Σy² = 460.1, Σxy = 495.4
(i) Calculate the value of r, the product moment correlation coefficient of the sample.
(ii) Assuming that the distribution of X, Y is bivariate normal with product moment correlation coefficient ρ, calculate the p-value of your result when testing the hypotheses H₀: ρ = 0; H₁: ρ > 0.
(iii) State whether your p-value suggests that X and Y are independent. [7]
Given a further value x = 5.2 from the distribution of X, Y, predict the corresponding value of y. Give your answer to one decimal place. [3]
Answer/Explanation
The product moment correlation coefficient is calculated as:
\[ r = \frac{\sum xy - n\overline{x}\,\overline{y}}{\sqrt{(\sum x^2 - n\overline{x}^2)(\sum y^2 - n\overline{y}^2)}} \]
Substituting the given values:
\[ r = \frac{495.4 - 12 \times \frac{76.3}{12} \times \frac{72.2}{12}}{\sqrt{\left(563.7 - 12 \times \left(\frac{76.3}{12}\right)^2\right)\left(460.1 - 12 \times \left(\frac{72.2}{12}\right)^2\right)}} \]
After calculation:
\(\boxed{r \approx 0.809}\)
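The substitution above can be checked numerically. The following is a minimal sketch, assuming only the six summary values given in the question; the function name `pearson_r_from_sums` and its parameter names are illustrative, not part of the original solution.

```python
from math import sqrt

def pearson_r_from_sums(n, sum_x, sum_xx, sum_y, sum_yy, sum_xy):
    """Product moment correlation coefficient from summary sums."""
    s_xy = sum_xy - sum_x * sum_y / n   # equals sum(xy) - n * x_bar * y_bar
    s_xx = sum_xx - sum_x ** 2 / n      # equals sum(x^2) - n * x_bar^2
    s_yy = sum_yy - sum_y ** 2 / n      # equals sum(y^2) - n * y_bar^2
    return s_xy / sqrt(s_xx * s_yy)

# Summary values from the question (n = 12 pairs)
r = pearson_r_from_sums(12, 76.3, 563.7, 72.2, 460.1, 495.4)
print(round(r, 3))  # 0.809
```

Working from the sums directly avoids rounding the means before they are squared.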
To test the hypotheses H₀: ρ = 0 vs H₁: ρ > 0, we use the t-statistic:
\[ t = r\sqrt{\frac{n-2}{1-r^2}} = 0.80856\sqrt{\frac{10}{1-0.80856^2}} \approx 4.345 \]
The p-value for this one-tailed test is:
\(\boxed{7.27 \times 10^{-4}}\)
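In an exam the p-value comes straight from a calculator's t-distribution function. As a hedged cross-check using only the Python standard library, the sketch below integrates the Student's t density with Simpson's rule; the helper names `t_pdf`, `upper_tail_p` and `t_from_r` and the step count are assumptions made for illustration.

```python
from math import gamma, pi, sqrt

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus a Simpson's-rule integral over [0, t].

    steps must be even for Simpson's rule.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_from_r(r, n):
    """Test statistic for H0: rho = 0 from a sample correlation r."""
    return r * sqrt((n - 2) / (1 - r * r))

t = t_from_r(0.80856, 12)      # r and n from part (i)
p = upper_tail_p(t, df=10)     # one-tailed test, df = n - 2
print(round(t, 3), p)
```

The printed p-value should land between the tabulated tail probabilities 0.0005 and 0.001 for 10 degrees of freedom, in line with the boxed answer.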
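The very small p-value (0.000727) provides strong evidence against H₀: ρ = 0. For a bivariate normal distribution, ρ = 0 is equivalent to independence of X and Y, so rejecting H₀ means rejecting independence.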
\(\boxed{\text{The p-value suggests X and Y are not independent}}\)
Using the regression equation:
\[ y - \overline{y} = \frac{\sum xy - n\overline{x}\,\overline{y}}{\sum x^2 - n\overline{x}^2}(x - \overline{x}) \]
Substituting the given values:
\[ y - \frac{72.2}{12} = \frac{495.4 - 12 \times \frac{76.3}{12} \times \frac{72.2}{12}}{563.7 - 12 \times \left(\frac{76.3}{12}\right)^2}\left(x - \frac{76.3}{12}\right) \]
For x = 5.2:
\[ y = \frac{72.2}{12} + \frac{495.4 - 76.3 \times 6.0167}{563.7 - 485.1408}(5.2 - 6.3583) \]
After calculation:
\(\boxed{y \approx 5.5}\) (to one decimal place)
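The prediction can be reproduced from the same summary values. This is a small sketch under the assumption that the regression line of y on x is appropriate; `predict_y_from_sums` is a hypothetical helper name.

```python
def predict_y_from_sums(x_new, n, sum_x, sum_xx, sum_y, sum_xy):
    """Predict y at x_new using the least-squares line of y on x."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    slope = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)
    # The regression line passes through (x_bar, y_bar)
    return y_bar + slope * (x_new - x_bar)

y_hat = predict_y_from_sums(5.2, 12, 76.3, 563.7, 72.2, 495.4)
print(round(y_hat, 1))  # 5.5
```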
Jim is investigating the relationship between height (x) and foot length (y) in teenage boys. A sample of 13 boys was taken with these measurements:
Height (cm) | 129 | 135 | 156 | 146 | 155 | 152 | 139 | 166 | 148 | 179 | 157 | 152 | 160 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Foot length (cm) | 25.8 | 25.9 | 29.7 | 28.6 | 29.0 | 29.1 | 25.3 | 29.9 | 26.1 | 30.0 | 27.6 | 27.2 | 28.0 |
Assuming a bivariate normal distribution, answer the following:
- (a) Calculate the product moment correlation coefficient (r)
- (b) Find the p-value for testing H₀: ρ = 0 vs H₁: ρ > 0
- (c) Interpret the p-value
- (d) Find the equation of the regression line of y on x
- (e) Predict the foot length for a height of 170 cm
Answer/Explanation
Solution (a): Correlation Coefficient (r)
Using the formula:
\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \]
Calculations:
\[ \sum x = 1974, \quad \sum y = 362.2 \]
\[ \sum xy = 55212.2, \quad \sum x^2 = 301822, \quad \sum y^2 = 10125.22 \]
\[ r = \frac{13 \times 55212.2 - 1974 \times 362.2}{\sqrt{(13 \times 301822 - 1974^2)(13 \times 10125.22 - 362.2^2)}} \]
\[ = \frac{2775.8}{\sqrt{27010 \times 439.02}} \approx 0.806 \]
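Because the sums are easy to mistype, it can help to recompute them directly from the table. The sketch below does this with plain Python; the list names `heights` and `feet` and the helper names `summary_sums` and `pearson_r` are chosen here for illustration.

```python
from math import sqrt

# Data copied from the table in the question
heights = [129, 135, 156, 146, 155, 152, 139, 166, 148, 179, 157, 152, 160]
feet = [25.8, 25.9, 29.7, 28.6, 29.0, 29.1, 25.3, 29.9, 26.1, 30.0, 27.6, 27.2, 28.0]

def summary_sums(xs, ys):
    """Return the five sums used in the correlation formula."""
    return {
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
        "sum_xx": sum(x * x for x in xs),
        "sum_yy": sum(y * y for y in ys),
    }

def pearson_r(xs, ys):
    """Product moment correlation coefficient from raw data."""
    n = len(xs)
    s = summary_sums(xs, ys)
    num = n * s["sum_xy"] - s["sum_x"] * s["sum_y"]
    den = sqrt((n * s["sum_xx"] - s["sum_x"] ** 2) * (n * s["sum_yy"] - s["sum_y"] ** 2))
    return num / den

sums = summary_sums(heights, feet)
r = pearson_r(heights, feet)
print(sums)
print(round(r, 3))  # 0.806
```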
Solution (b): p-value Calculation
Test statistic:
\[ t = r\sqrt{\frac{n-2}{1-r^2}} = 0.806\sqrt{\frac{11}{1-0.806^2}} \]
\[ = 0.806 \times 5.603 \approx 4.516 \]
Degrees of freedom: df = n-2 = 11
For a one-tailed t-test with t=4.516 and df=11:
p-value = P(T > 4.516) ≈ 0.000438
Solution (c): p-value Interpretation
The extremely small p-value (0.000438) indicates:
- Strong evidence against the null hypothesis (H₀: no correlation)
- If ρ = 0 were true, the probability of observing a sample correlation of r = 0.806 or larger would be ≈ 0.0438%
- Statistically significant positive correlation at any common α level (0.05, 0.01)
Solution (d): Regression Line
Slope (b):
\[ b = r\frac{s_y}{s_x} = 0.806 \times \frac{1.678}{13.16} \approx 0.10277 \approx 0.103 \]
Intercept (a), using the unrounded slope:
\[ a = \bar{y} - b\bar{x} = 27.862 - 0.10277 \times 151.846 \approx 12.3 \]
Equation:
\[ y = 0.103x + 12.3 \]
Solution (e): Prediction
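For x = 170 cm, using the unrounded coefficients b ≈ 0.10277 and a ≈ 12.256 to avoid accumulated rounding error:
\[ y = 0.10277 \times 170 + 12.256 = 17.471 + 12.256 \approx 29.7 \text{ cm} \]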
Note: x = 170 cm lies within the observed height range (129-179 cm), so this prediction is an interpolation.
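The regression line and the prediction can also be recomputed from the raw data, which avoids carrying rounded coefficients forward. This is a minimal sketch; `fit_line` and the variable names are illustrative, and the range check simply compares the new height with the smallest and largest observed heights.

```python
# Data copied from the table in the question
heights = [129, 135, 156, 146, 155, 152, 139, 166, 148, 179, 157, 152, 160]
feet = [25.8, 25.9, 29.7, 28.6, 29.0, 29.1, 25.3, 29.9, 26.1, 30.0, 27.6, 27.2, 28.0]

def fit_line(xs, ys):
    """Least-squares line of y on x; returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(heights, feet)
x_new = 170
y_hat = intercept + slope * x_new

# True when the prediction is an interpolation rather than an extrapolation
inside_range = min(heights) <= x_new <= max(heights)

print(round(slope, 3), round(intercept, 1), round(y_hat, 1), inside_range)
```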