IBDP Maths AI: Topic: SL 4.4: Linear correlation of bivariate data: IB style Questions HL Paper 1

Question

Observations on 12 pairs of values of the random variables X, Y yielded the following results.

Σx = 76.3, Σx² = 563.7, Σy = 72.2, Σy² = 460.1, Σxy = 495.4

(a) (i) Calculate the value of r, the product moment correlation coefficient of the sample.

(ii) Assuming that the distribution of X, Y is bivariate normal with product moment correlation coefficient ρ, calculate the p-value of your result when testing the hypotheses H0 : ρ = 0; H1 : ρ > 0.

(iii) State whether your p-value suggests that X and Y are independent. [7]

(b) Given a further value x = 5.2 from the distribution of X, Y, predict the corresponding value of y. Give your answer to one decimal place. [3]
▶️Answer/Explanation

Ans:

(a)

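(i) use of \(r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right)\left(\sum y^2 - \frac{\left(\sum y\right)^2}{n}\right)}}\) with \(n = 12\) gives \(r = 0.809\)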

(ii)

\(t = 0.80856\ldots \times \sqrt{\frac{10}{1 - 0.80856\ldots^2}}\)

\( = 4.345\ldots\)

p-value \( = 7.27 \times 10^{-4}\)

(iii) this value indicates that X,Y are not independent

(b)

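use of \(y - \bar y = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}\left(x - \bar x\right)\)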

putting x = 5.2 gives y = 5.5
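The arithmetic in parts (a)(i) and (b) can be checked from the quoted totals alone. The following is a minimal sketch, not part of the markscheme; the helper name `corrected_sums` and the variable names are illustrative assumptions.

```python
import math

def corrected_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Return (S_xx, S_yy, S_xy) computed from the raw totals (hypothetical helper)."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xx, s_yy, s_xy

# Totals quoted in the question (n = 12 pairs).
n, sum_x, sum_x2, sum_y, sum_y2, sum_xy = 12, 76.3, 563.7, 72.2, 460.1, 495.4
s_xx, s_yy, s_xy = corrected_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy)

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Regression line of y on x, written as y = y_bar + slope * (x - x_bar).
slope = s_xy / s_xx
x_bar, y_bar = sum_x / n, sum_y / n
y_pred = y_bar + slope * (5.2 - x_bar)

print(f"r = {r:.3f}, predicted y = {y_pred:.1f}")
```

Working from the corrected sums keeps the same three quantities in play for both the correlation coefficient and the regression slope, which mirrors the structure of the formulas used in the answer.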

Question

Jim is investigating the relationship between height and foot length in teenage boys.

A sample of 13 boys is taken and the height and foot length of each boy are measured.

The results are shown in the table.

You may assume that this is a random sample from a bivariate normal distribution.

Jim wishes to determine whether or not there is a positive association between height and foot length.

a. Calculate the product moment correlation coefficient. [2]

b. Find the \(p\)-value. [2]

c. Interpret the \(p\)-value in the context of the question. [1]

d. Find the equation of the regression line of \(y\) on \(x\). [2]

e. Estimate the foot length of a boy of height 170 cm. [2]

 
▶️Answer/Explanation

Markscheme

Note: In all parts accept answers which round to the correct 2sf answer.

a. \(r = 0.806\)     A2

b. \(4.38 \times {10^{-4}}\)     A2

c. \(p\)-value represents strong evidence to indicate a (positive) association between height and foot length     A1

Note: FT the \(p\)-value

d. \(y = 0.103x + 12.3\)     A2

e. attempted substitution of \(x = 170\)     (M1)

\(y = 29.7\)     A1

Note: Accept \(y = 29.8\)

Question

Bill is investigating whether or not there is a positive association between the heights and weights of boys of a certain age. He defines the hypotheses

\[{{\rm{H}}_0}:\rho = 0;\quad {{\rm{H}}_1}:\rho > 0,\]

where \(\rho \) denotes the population correlation coefficient between heights and weights of boys of this age. He measures the height, \(h\) cm, and weight, \(w\) kg, of each of a random sample of \(20\) boys of this age and he calculates the following statistics.

\[\sum w = 340,\quad \sum h = 2002,\quad \sum {w^2} = 5830,\quad \sum {h^2} = 201124,\quad \sum hw = 34150\]

a. (i)     Calculate the correlation coefficient for this sample.

(ii)     Calculate the \(p\)-value of your result and interpret it at the \(1\% \) level of significance. [8]

b. (i)     Calculate the equation of the least squares regression line of \(w\) on \(h\).

(ii)     The height of a randomly selected boy of this age is \(90\) cm. Estimate his weight. [3]

 
▶️Answer/Explanation

Markscheme

a. (i)     \(r = \frac{{34150 - 340 \times \frac{{2002}}{{20}}}}{{\sqrt {\left( {5830 - \frac{{{{340}^2}}}{{20}}} \right)\left( {201124 - \frac{{{{2002}^2}}}{{20}}} \right)} }}\)     (M1)(A1)

Note: Accept equivalent formula.

\( = 0.610\)     A1

(ii)     (\(T = R \times \sqrt {\frac{{n - 2}}{{1 - {R^2}}}} \) has the t-distribution with \(n - 2\) degrees of freedom)

\(t = 0.6097666 \ldots \times \sqrt {\frac{{18}}{{1 - 0.6097666{ \ldots ^2}}}} \)     M1

\( = 3.2640 \ldots \)     A1

\({\rm{DF}} = 18\)     A1

\(p{\rm{-value}} = 0.00215 \ldots \)     A1

this is less than \(0.01\), so we conclude that there is a positive association between heights and weights of boys of this age     R1

[8 marks]

b. (i)     the equation of the regression line of \(w\) on \(h\) is

\(w - \frac{{340}}{{20}} = \left( {\frac{{20 \times 34150 - 340 \times 2002}}{{20 \times 201124 - {{2002}^2}}}} \right)\left( {h - \frac{{2002}}{{20}}} \right)\)     M1

\(w = 0.160h + 0.957\)     A1

(ii) putting \(h = 90\), \(w = 15.4\) (kg)     A1

Note: Award M0A0A0 for calculation of \(h\) on \(w\).

[3 marks]
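The test statistic and one-tailed \(p\)-value for Bill's data can be reproduced without a calculator's built-in t-distribution function. The sketch below is only an illustration under stated assumptions: it uses the Python standard library and approximates the upper tail of the t-distribution by composite Simpson's rule, a substitute chosen here for self-containment rather than the method the markscheme expects. The function names `t_pdf` and `t_upper_tail` are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using composite Simpson's rule on [0, t].

    steps must be even; the tail is 0.5 minus the integral from 0 to t.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Summary statistics quoted in the question (n = 20 boys).
n = 20
s_hw = 34150 - 340 * 2002 / n
s_ww = 5830 - 340 ** 2 / n
s_hh = 201124 - 2002 ** 2 / n

r = s_hw / math.sqrt(s_ww * s_hh)
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))
p_value = t_upper_tail(t_stat, n - 2)  # one-tailed, since H1 is rho > 0

print(f"r = {r:.4f}, t = {t_stat:.4f}, p = {p_value:.5f}")
```

Run as written, this should agree with the markscheme to the accuracy quoted there (r ≈ 0.610, t ≈ 3.26, p ≈ 0.0021). For a two-tailed alternative the tail probability would be doubled.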

Question

The random variables \(X\), \(Y\) follow a bivariate normal distribution with product moment correlation coefficient \(\rho \). The following table gives a random sample from this distribution.

(a)     Determine the value of \(r\), the product moment correlation coefficient of this sample.

(b)     (i)     Write down hypotheses in terms of \(\rho \) which would enable you to test whether or not \(X\) and \(Y\) are independent.

(ii)     Determine the p-value of the above sample and state your conclusion at the 5% significance level. Justify your answer.

(c)     (i)     Determine the equation of the regression line of \(y\) on \(x\).

(ii)     State whether or not this equation can be used to obtain an accurate prediction of the value of \(y\) for a given value of \(x\). Give a reason for your answer.

▶️Answer/Explanation

Markscheme

(a)     \(r = -0.163\)     A2

[2 marks]

 

(b)     (i)     \({{\text{H}}_0}:\rho = 0;\ {{\text{H}}_1}:\rho \ne 0\)     A1

(ii)     \(t = r\sqrt {\frac{{n - 2}}{{1 - {r^2}}}} = -0.468 \ldots \)     (A1)

\({\text{DF}} = 8\)     (A1)

\(p{\text{-value}} = 2 \times 0.326 \ldots  = 0.652\)   A1

since \(0.652 > 0.05\), we accept \({{\text{H}}_0}\)     R1

Note: Award (A1)(A1)A0 if the p-value is given as \(0.326\) without prior working.

Note: Follow through their p-value for the R1.

[5 marks]
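As a check on part (b)(ii), the working can be written out numerically. This is a sketch using the rounded value \(r = -0.163\) and \(n = 10\) (inferred from \({\text{DF}} = n - 2 = 8\)), so the last digit may differ slightly from the markscheme's figure, which presumably comes from the unrounded \(r\). The symbol \(T_8\), for a t-distributed variable with 8 degrees of freedom, is notation introduced here.

\[t = r\sqrt{\frac{n - 2}{1 - r^2}} \approx -0.163 \times \sqrt{\frac{8}{1 - 0.163^2}} \approx -0.47\]

\[p\text{-value} = 2 \times {\text{P}}\left(T_8 \geqslant |t|\right) \approx 2 \times 0.326 = 0.652\]

The factor of 2 appears because the alternative hypothesis \(\rho \ne 0\) is two-sided, which is why a bare \(0.326\) does not earn the final mark.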

 

(c)     (i)     \(y = -0.257x + 5.22\)     A1

Note: Accept answers which round to \(-0.26\) and \(5.2\).

(ii)     no, because \(X\) and \(Y\) have been shown to be independent (or equivalent)     A1

[2 marks]

Question

[Maximum mark: 6]
Consider the following data

The regression line for y on x is y = 2.2x – 0.5
(a) Solve the equation above for x to find an expression in the form x = ay+b [2]
(b) Find the equation x = cy+d of the regression line for x on y. [2]
(c) Describe the advantage of the linear equation in (b). [2]

▶️Answer/Explanation

Ans:
(a) y = 2.2x – 0.5 ⇔ y + 0.5 = 2.2x ⇔ x = 0.455 y + 0.227
(b) x = 0.423y + 0.385
(c) The relation in (a) is simply the inverse function of the line y = 2.2x – 0.5.
If y is given, the regression line in (b) gives a more reliable estimate of x, because it is fitted by minimising the errors in x rather than in y.
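The difference between inverting the y-on-x line and fitting x on y directly can be seen on a small data set. The sketch below uses hypothetical data, not the table from this question (which is not reproduced here), and a hypothetical helper `fit_line`. It also illustrates the standard identity that the product of the two regression slopes equals r².

```python
def fit_line(u, v):
    """Least-squares line v = slope * u + intercept; returns (slope, intercept)."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    slope = s_uv / s_uu
    return slope, v_bar - slope * u_bar

# Hypothetical data, NOT the table from the question.
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 7, 8, 11]

m, k = fit_line(xs, ys)   # y on x: y = m*x + k
c, d = fit_line(ys, xs)   # x on y: x = c*y + d
a, b = 1 / m, -k / m      # y-on-x line solved for x, as in part (a)

# Product of the two regression slopes equals r squared.
r_squared = m * c

print(f"inverted y-on-x: x = {a:.3f}y + {b:.3f}")
print(f"x on y:          x = {c:.3f}y + {d:.3f}")
print(f"product of slopes = {r_squared:.3f}")
```

The two lines coincide only when the correlation is perfect. Applying the same identity to the lines quoted in this question gives 2.2 × 0.423 ≈ 0.931, which would suggest r ≈ 0.96.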
