# IB DP Maths Topic 7.7 Introduction to bivariate distributions HL Paper 3

## Question

The following table gives the average yield of olives per tree, in kg, and the rainfall, in cm, for nine separate regions of Greece. You may assume that these data are a random sample from a bivariate normal distribution, with correlation coefficient $$\rho$$.

A scientist wishes to use these data to determine whether there is a positive correlation between rainfall and yield.

(a)     State suitable hypotheses.

(b)     Determine the product moment correlation coefficient for these data.

(c)     Determine the associated p-value and comment on this value in the context of the question.

(d)     Find the equation of the regression line of y on x.

(e)     Hence, estimate the yield per tree in a tenth region where the rainfall was 19 cm.

(f)     Determine the angle between the regression line of y on x and that of x on y. Give your answer to the nearest degree.

## Markscheme

(a)     $${H_0}:\rho = 0$$     A1

$${H_1}:\rho > 0$$     A1

[2 marks]

(b)     0.853     A2

Note:     Accept any answer that rounds to 0.85.

[2 marks]

(c)     p-value = 0.00173 (1-tailed)     A1

Note:     Accept any answer that rounds to 0.0017.

Accept any answer that rounds to 0.0035 obtained from 2-tailed test.

strong evidence to reject the hypothesis that there is no correlation between rainfall and yield or to accept the hypothesis that there is correlation between rainfall and yield     R1

Note:     Follow through the p-value for the conclusion.

[2 marks]
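The p-value in part (c) would normally be read from a calculator. As a rough cross-check, the sketch below converts the rounded value r = 0.853 with n = 9 into the statistic $$t = r\sqrt {\frac{{n - 2}}{{1 - {r^2}}}}$$ and integrates the Student t density numerically. The helper names `t_pdf` and `upper_tail` are illustrative, and because r is rounded the result only agrees with the markscheme to about two significant figures.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=20000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the probability lies above 0; subtract the part between 0 and t.
    return 0.5 - total * h / 3

r, n = 0.853, 9                      # rounded r from part (b), sample size
t_stat = r * math.sqrt((n - 2) / (1 - r * r))
p_one_tailed = upper_tail(t_stat, n - 2)
print(t_stat, p_one_tailed, 2 * p_one_tailed)
```

Doubling the one-tailed value gives the two-tailed figure that the markscheme also accepts.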

(d)     $$y = 1.78x + 40.5$$     A1A1

Note:     Accept numerical coefficients that round to 1.8 and 41.

[2 marks]

(e)     $$y = 1.77 \ldots (19) + 40.5 \ldots$$     M1

74.3     A1

Note:     Accept any answer that rounds to 74 or 75.

[2 marks]

(f)     the gradient of the regression line y on x is 1.78 or equivalent     A1

the regression line of x on y is $$x = 0.409y - 12.2$$     (A1)

the gradient of the regression line x on y is $$\frac{1}{{0.409}}{\text{ }}( = 2.44)$$     (M1)A1

calculate $$\arctan (2.44) - \arctan (1.78)$$     (M1)

angle between regression lines is 7 degrees     A1

Note:     Accept any answer which rounds to ±7 degrees.

[6 marks]

Total [16 marks]
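The arithmetic in part (f) can be checked with a short sketch that uses only the rounded coefficients quoted in the markscheme. The variable names are illustrative. The final line relies on the standard fact that the product of the two regression coefficients equals $${r^2}$$, which ties the two lines back to the answer in part (b).

```python
import math

b_yx = 1.78    # gradient of the y-on-x line, from part (d)
d_xy = 0.409   # coefficient of y in the x-on-y line x = 0.409y - 12.2

# Rewrite the x-on-y line with y as the subject to get its gradient
# in the same (x, y) axes as the y-on-x line.
gradient_x_on_y = 1 / d_xy

angle = math.degrees(math.atan(gradient_x_on_y) - math.atan(b_yx))
r_check = math.sqrt(b_yx * d_xy)   # should be close to r from part (b)

print(round(angle), r_check)
```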

## Question

The random variables $$U,{\text{ }}V$$ follow a bivariate normal distribution with product moment correlation coefficient $$\rho$$.

A random sample of 12 observations on U, V is obtained to determine whether there is a correlation between U and V. The sample product moment correlation coefficient is denoted by r. A test to determine whether or not U, V are independent is carried out at the 1% level of significance.

State suitable hypotheses to investigate whether or not $$U$$, $$V$$ are independent.

[2]
a.

Find the least value of $$|r|$$ for which the test concludes that $$\rho \ne 0$$.

[6]
b.

## Markscheme

$${{\text{H}}_0}:\rho = 0;{\text{ }}{{\text{H}}_1}:\rho \ne 0$$     A1A1

[2 marks]

a.

$$\nu = 10$$     (A1)

$${t_{0.005}} = 3.16927 \ldots$$     (M1)(A1)

we reject $${{\text{H}}_0}:\rho = 0$$ if $$\left| t \right| > 3.16927 \ldots$$     (R1)

attempting to solve $$\left| r \right|\sqrt {\frac{{10}}{{1 - {r^2}}}} > 3.16927 \ldots$$ for $$\left| r \right|$$     M1

Note:     Allow = instead of >.

(least value of $$\left| r \right|$$ is) 0.708 (3 sf)     A1

Note:     Award A1M1A0R1M1A0 to candidates who use a one-tailed test. Award A0M1A0R1M1A0 to candidates who use an incorrect number of degrees of freedom or both a one-tailed test and incorrect degrees of freedom.

Note: Possible errors are

10 DF 1-tail, $$t = 2.763 \ldots$$, least value $$=$$ 0.658

11 DF 2-tail, $$t = 3.105 \ldots$$, least value $$=$$ 0.684

11 DF 1-tail, $$t = 2.718 \ldots$$, least value $$=$$ 0.634.

[6 marks]

b.
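Squaring $$\left| r \right|\sqrt {\frac{\nu }{{1 - {r^2}}}} = t$$ and rearranging gives $${r^2} = \frac{{{t^2}}}{{{t^2} + \nu }}$$, so the boundary value of $$\left| r \right|$$ follows directly from the critical value. The sketch below applies this with the critical value truncated to 3.16927 as quoted in the markscheme. The function name `least_abs_r` is illustrative.

```python
import math

def least_abs_r(t_crit, df):
    """Boundary |r| solving |r| * sqrt(df / (1 - r^2)) = t_crit."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Two-tailed 1% test with 12 observations, so df = 10.
boundary = least_abs_r(3.16927, 10)
print(round(boundary, 3))
```

The same function can be applied to the critical values listed under the possible errors, bearing in mind that those values are quoted in truncated form.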


## Question

The random variables X, Y follow a bivariate normal distribution with product moment correlation coefficient ρ.

A random sample of 11 observations on X, Y was obtained and the value of the sample product moment correlation coefficient, r, was calculated to be −0.708.

The covariance of the random variables U, V is defined by

Cov(U, V) = E((U − E(U))(V − E(V))).

State suitable hypotheses to investigate whether or not a negative linear association exists between X and Y.

[1]
a.

Determine the p-value.

[3]
b.i.

State your conclusion at the 1 % significance level.

[1]
b.ii.

Show that Cov(U, V) = E(UV) − E(U)E(V).

[3]
c.i.

Hence show that if U, V are independent random variables then the population product moment correlation coefficient, ρ, is zero.

[3]
c.ii.

## Markscheme

H0 : ρ = 0; H1 : ρ < 0       A1

[1 mark]

a.

$$t = - 0.708\sqrt {\frac{{11 - 2}}{{1 - {{\left( { - 0.708} \right)}^2}}}} \,\, = \,\,\left( { - 3.0075 \ldots } \right)$$       (M1)

degrees of freedom = 9        (A1)

P(T < −3.0075…) = 0.00739       A1

Note: Accept any answer that rounds to 0.0074.

[3 marks]

b.i.

reject H0 or equivalent statement       R1

Note: Apply follow through on the candidate’s p-value.

[1 mark]

b.ii.

Cov(U, V) = E((U − E(U))(V − E(V)))

= E(UV − E(U)V − E(V)U + E(U)E(V))       M1

= E(UV) − E(E(U)V) − E(E(V)U) + E(E(U)E(V))       (A1)

= E(UV) − E(U)E(V) − E(V)E(U) + E(U)E(V)       A1

Cov(U, V) = E(UV) − E(U)E(V)       AG

[3 marks]

c.i.
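The identity in part c.i can be sanity-checked numerically. The sketch below evaluates both sides exactly on a small joint distribution. The probabilities are hypothetical, chosen only for illustration, and a numerical check of this kind does not replace the algebraic proof.

```python
from fractions import Fraction as F

# Hypothetical joint pmf on (u, v) pairs, chosen only for illustration.
pmf = {(0, 1): F(1, 4), (1, 1): F(1, 4), (1, 3): F(1, 8), (2, 3): F(3, 8)}

def expect(g):
    """E(g(U, V)) under the joint pmf."""
    return sum(p * g(u, v) for (u, v), p in pmf.items())

e_u = expect(lambda u, v: u)
e_v = expect(lambda u, v: v)

cov_definition = expect(lambda u, v: (u - e_u) * (v - e_v))  # E((U - E(U))(V - E(V)))
cov_shortcut = expect(lambda u, v: u * v) - e_u * e_v        # E(UV) - E(U)E(V)

print(cov_definition, cov_shortcut)
```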

E(UV) = E(U)E(V) (independent random variables)       R1

⇒Cov(U, V) = E(U)E(V) − E(U)E(V) = 0      A1

hence, ρ = $$\frac{{{\text{Cov}}\left( {U,\,V} \right)}}{{\sqrt {{\text{Var}}\left( U \right)\,{\text{Var}}\left( V \right)} }} = 0$$     A1AG

Note: Accept the statement that Cov(U,V) is the numerator of the formula for ρ.

Note: Only award the first A1 if the R1 is awarded.

[3 marks]

c.ii.
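The argument in part c.ii can be illustrated by building a joint distribution as the product of two marginals, which is what independence means for discrete variables, and confirming that the covariance is exactly zero. The marginal distributions below are hypothetical and serve only as an example.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical marginal distributions, chosen only for illustration.
p_u = {0: F(1, 2), 1: F(1, 3), 4: F(1, 6)}
p_v = {-1: F(1, 4), 2: F(3, 4)}

# Independence: every joint probability is the product of the marginals.
joint = {(u, v): p_u[u] * p_v[v] for u, v in product(p_u, p_v)}

e_u = sum(u * p for u, p in p_u.items())
e_v = sum(v * p for v, p in p_v.items())
e_uv = sum(u * v * p for (u, v), p in joint.items())

cov = e_uv - e_u * e_v   # numerator of rho
print(e_uv, e_u * e_v, cov)
```

The converse does not hold in general: zero covariance alone does not imply independence, although it does for a bivariate normal distribution.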


## Question

The students in a class take an examination in Applied Mathematics which consists of two papers. Paper 1 is in Mechanics and Paper 2 is in Statistics. The marks obtained by the students in Paper 1 and Paper 2 are denoted by $$(x,{\text{ }}y)$$ respectively and you may assume that the values of $$(x,{\text{ }}y)$$ form a random sample from a bivariate normal distribution with correlation coefficient $$\rho$$ . The teacher wishes to determine whether or not there is a positive association between marks in Mechanics and marks in Statistics.

State suitable hypotheses.

[1]
a.

The marks obtained by the 12 students who sat both papers are given in the following table.

(i)     Determine the product moment correlation coefficient for these data and state its p-value.

(ii)     Interpret your p-value in the context of the problem.

[5]
b.

George obtained a mark of 63 on Paper 1 but was unable to sit Paper 2 because of illness. Predict the mark that he would have obtained on Paper 2.

[4]
c.

Another class of 16 students sat examinations in Physics and Chemistry and the product moment correlation coefficient between the marks in these two subjects was calculated to be 0.524. Using a 1 % significance level, determine whether or not this value suggests a positive association between marks in Physics and marks in Chemistry.

[5]
d.

## Markscheme

$${{\text{H}}_0}:\rho = 0;{\text{ }}{{\text{H}}_1}:\rho > 0$$     A1

[1 mark]

a.

(i)     correlation coefficient = 0.905     A2

p-value $$= 2.61 \times {10^{ - 5}}$$     A2

(ii)     very strong evidence to indicate a positive association between marks in Mechanics and marks in Statistics     R1

[5 marks]

b.

the regression line of y on x is $$y = 8.71 + 0.789x$$     (M1)A1

George’s estimated mark on Paper 2 $$= 8.71 + 0.789 \times 63$$     (M1)

= 58     A1

[4 marks]

c.

$$t = r\sqrt {\frac{{n - 2}}{{1 - {r^2}}}} = 2.3019 \ldots$$     M1A1

degrees of freedom = 14     (A1)

p-value $$= 0.0186 \ldots$$     A1

at the 1 % significance level, this does not indicate a positive association between the marks in Physics and Chemistry     R1

[5 marks]

d.

