IB Mathematics SL 4.4 Linear correlation of bivariate data AI HL Paper 1- Exam Style Questions- New Syllabus

Question

A research group is utilizing a mathematical model to estimate the subjective well-being of various nations. A metric \(x\) is derived from measurable factors such as healthcare accessibility and social safety nets, with the assumption that higher \(x\) values correspond to increased happiness.

To evaluate the accuracy of this model, a study is conducted across six nations: \(A\), \(B\), \(C\), \(D\), \(E\), and \(F\). In these locations, happiness is measured through direct surveys, resulting in a score \(y\) out of \(10\), where higher scores indicate greater well-being.

To choose the participating nations, the global population of countries is categorized into three distinct tiers based on economic wealth, and two countries are selected at random from each tier.

(a) Identify the specific sampling method used in this study.

The data collected from the surveys and the corresponding model values are presented in the table below.

Country	A	B	C	D	E	F
Value from the model (\(x\))	12.3	15.2	14.1	18.5	20.1	19.2
Happiness score (\(y\))	5.2	7.3	6.2	6.9	8.0	7.2

The researchers consider the model a reliable predictor if the Pearson’s product-moment correlation coefficient, \(r\), exceeds \(0.8\).

(b) (i) Calculate the value of \(r\).
(ii) Determine whether the model is a valid predictor based on the researchers’ criteria.

(c) Find the equation of the regression line of \(y\) on \(x\), in the form \(y = ax + b\).

Consider a country with a model value of \(x=17.2\).

(d) Use your regression equation to estimate the happiness score for this country.

Most-appropriate topic codes:

• SL 4.1: Concepts of population, sample, and sampling techniques — part (a)
• SL 4.4: Pearson’s product-moment correlation coefficient (\(r\)) — part (b)
• SL 4.10: Regression line and making predictions — parts (c) and (d)

Answer/Explanation

Detailed solution

(a)
The population of countries was divided into sub-groups (tiers) based on wealth, and a random sample of two countries was drawn from each tier. This is stratified sampling.
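As an illustration of the procedure in part (a), the following is a minimal Python sketch of drawing two items at random from each tier. The tier names and the placeholder country labels are hypothetical and are not taken from the study; the function name `stratified_sample` is also an assumption made for this example.

```python
import random

# Hypothetical strata: these placeholder labels are NOT from the study.
tiers = {
    "low": ["L1", "L2", "L3", "L4"],
    "middle": ["M1", "M2", "M3", "M4"],
    "high": ["H1", "H2", "H3", "H4"],
}

def stratified_sample(strata, per_stratum, seed=None):
    """Draw `per_stratum` members at random, without replacement, from each stratum."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

sample = stratified_sample(tiers, per_stratum=2, seed=0)
print(sample)
```

Every tier contributes the same number of countries, which is what distinguishes this from a simple random sample of six countries drawn from the whole population.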

(b)
(i) Entering the data into a graphic display calculator (GDC) and running a linear regression gives:
\(r = 0.853899\dots\)
\(r = 0.854\) (to 3 s.f.).
(ii) Since the researchers’ requirement is \(r > 0.8\) and \(0.854 > 0.8\), the model is a valid predictor.
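The GDC value can be checked by hand from the definition \(r = \dfrac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}\). The sketch below is one way to do this in plain Python using the data from the table; the function name `pearson_r` is an assumption made for this example, not part of the original solution.

```python
from math import sqrt

# Data from the table in the question
x = [12.3, 15.2, 14.1, 18.5, 20.1, 19.2]  # value from the model
y = [5.2, 7.3, 6.2, 6.9, 8.0, 7.2]        # happiness score

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    s_xx = sum((a - mean_x) ** 2 for a in xs)
    s_yy = sum((b - mean_y) ** 2 for b in ys)
    return s_xy / sqrt(s_xx * s_yy)

r = pearson_r(x, y)
print(round(r, 3))  # 0.854
print(r > 0.8)      # True, so the researchers' criterion is met
```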

(c)
Using the GDC output for the regression line \(y = ax + b\):
\(a = 0.265851\dots\) and \(b = 2.39573\dots\)
The regression equation is \(y = 0.266x + 2.40\) (coefficients to 3 s.f.).
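The coefficients can also be reproduced from the least-squares formulas \(a = S_{xy}/S_{xx}\) and \(b = \bar{y} - a\bar{x}\). The following is a minimal sketch in plain Python using the table data; the function name `least_squares_line` is an assumption made for this example.

```python
# Data from the table in the question
x = [12.3, 15.2, 14.1, 18.5, 20.1, 19.2]  # value from the model
y = [5.2, 7.3, 6.2, 6.9, 8.0, 7.2]        # happiness score

def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares regression line y = a*x + b of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((p - mean_x) * (q - mean_y) for p, q in zip(xs, ys))
    s_xx = sum((p - mean_x) ** 2 for p in xs)
    a = s_xy / s_xx          # gradient
    b = mean_y - a * mean_x  # intercept: the line passes through (mean_x, mean_y)
    return a, b

a, b = least_squares_line(x, y)
print(f"y = {a:.3f}x + {b:.2f}")  # y = 0.266x + 2.40
```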

(d)
Substitute \(x = 17.2\) into the equation:
\(y = 0.265851\dots(17.2) + 2.39573\dots = 6.96837\dots\)
Estimated happiness score \(\approx 6.97\) (to 3 s.f.).
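Note that \(x = 17.2\) lies inside the range of the sampled model values (\(12.3\) to \(20.1\)), so this estimate is an interpolation. The short Python sketch below illustrates why the substitution uses the unrounded GDC coefficients: with the values quoted in part (c) the estimate rounds to \(6.97\), whereas the 3 s.f. coefficients give a slightly different result. The variable names are chosen for this example only.

```python
x_new = 17.2

# GDC coefficients as quoted in part (c), truncated to the digits shown there
a_full, b_full = 0.265851, 2.39573
# The same coefficients rounded to 3 significant figures
a_3sf, b_3sf = 0.266, 2.40

y_full = a_full * x_new + b_full
y_rounded = a_3sf * x_new + b_3sf

print(f"{y_full:.2f}")     # 6.97
print(f"{y_rounded:.2f}")  # 6.98
```

Rounding the coefficients before substituting shifts the final answer in the third significant figure, so it is safer to keep the stored GDC values until the last step.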
