# IBDP Maths analysis and approaches Topic: SL 4.1-Concepts of population, sample, random sample HL Paper 1

## Question

At a skiing competition the mean time of the first three skiers is 34.1 seconds. The time for the fourth skier is then recorded and the mean time of the first four skiers is 35.0 seconds. Find the time achieved by the fourth skier.

## Markscheme

total time of first 3 skiers $$= 34.1 \times 3 = 102.3$$     (M1)A1

total time of first 4 skiers $$= 35.0 \times 4 = 140.0$$     A1

time taken by fourth skier $$= 140.0 - 102.3 = 37.7{\text{ (seconds)}}$$     A1 [4 marks]

## Question

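The discrete random variable $$X$$ has the following probability distribution, where $$p$$ is a constant.

| $$x$$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $$P(X = x)$$ | $$p$$ | $$0.5 - p$$ | $$0.25$$ | $$0.125$$ | $$p^3$$ |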

a. Find the value of p.[2]

b.i. Find μ, the expected value of X.[2]

b.ii. Find P(X > μ). [2]

## Markscheme

a. equating sum of probabilities to 1 ($$p + 0.5 - p + 0.25 + 0.125 + p^3 = 1$$)       M1

$$p^3 = 0.125 = \frac{1}{8}$$

p= 0.5      A1

[2 marks]

b.i

μ = 0 × 0.5 + 1 × 0 + 2 × 0.25 + 3 × 0.125 + 4 × 0.125       M1

= 1.375 $$\left( { = \frac{{11}}{8}} \right)$$     A1

[2 marks]

b.ii.

P(X > μ) = P(X = 2) + P(X = 3) + P(X = 4)      (M1)

= 0.5       A1

Note: Do not award follow through A marks in (b)(i) from an incorrect value of p.

Note: Award M marks in both (b)(i) and (b)(ii) provided no negative probabilities, and provided a numerical value for μ has been found.

[2 marks]
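
As a quick check of parts (a) to (b)(ii), the sketch below redoes the arithmetic with exact fractions. It assumes the distribution implied by the markscheme's working, namely $$P(X=0)=p$$, $$P(X=1)=0.5-p$$, $$P(X=2)=0.25$$, $$P(X=3)=0.125$$ and $$P(X=4)=p^3$$; the variable names are illustrative only.

```python
from fractions import Fraction

# From the markscheme: p + (0.5 - p) + 0.25 + 0.125 + p**3 = 1, so p**3 = 1/8.
p = Fraction(1, 2)

# Distribution assumed from the markscheme's working (x -> P(X = x)).
dist = {
    0: p,
    1: Fraction(1, 2) - p,
    2: Fraction(1, 4),
    3: Fraction(1, 8),
    4: p**3,
}
total = sum(dist.values())  # should be exactly 1

# Expected value, then the probability of landing strictly above it.
mu = sum(x * prob for x, prob in dist.items())
p_above = sum(prob for x, prob in dist.items() if x > mu)

print(total, mu, p_above)
```

Exact fractions avoid any rounding doubt when comparing $$x$$ with $$\mu$$.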

### Question

Consider the data $$x_1,x_2,x_3, \ldots, x_n$$, with mean $$\overline{x}$$, and standard deviation $$s$$.

1. If each number is increased by $$k$$,
   1. show that the new mean is $$\overline{x}+k$$ (i.e. it is also increased by $$k$$)
   2. show that the new standard deviation is $$s$$ (i.e. it remains the same)
2. If each number is multiplied by $$k$$,
   1. show that the new mean is $$k\overline{x}$$ (i.e. it is also multiplied by $$k$$)
   2. show that the new standard deviation is $$ks$$ (i.e. it is also multiplied by $$k$$)
   3. write down the relation between the original and the new variance.

Ans:

1. $$\overline{x}_{new}=\frac{\sum_{i=1}^{n}(x_i+k)}{n}=\frac{\sum_{i=1}^{n}x_i+\sum_{i=1}^{n}k}{n}=\frac{\sum_{i=1}^{n}x_i}{n}+\frac{\sum_{i=1}^{n}k}{n}=\overline{x}+\frac{kn}{n}=\overline{x}+k$$
$$s_{new}=\sqrt{\frac{\sum_{i=1}^{n} ((x_i+k)-(\overline{x}+k))^2 }{n}}=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\overline{x})^2}{n}}=s$$
2. $$\overline{x}_{new}=\frac{\sum_{i=1}^{n}(kx_i)}{n}=\frac{k\sum_{i=1}^{n}(x_i)}{n}=k\frac{\sum_{i=1}^{n}(x_i)}{n}=k\overline{x}$$
$$s_{new}=\sqrt{\frac{\sum_{i=1}^{n}(kx_i-k\overline{x})^2}{n}}=\sqrt{\frac{k^2\sum_{i=1}^{n}(x_i-\overline{x})^2}{n}}=k\sqrt{\frac{\sum_{i=1}^{n}(x_i-\overline{x})^2}{n}}=ks$$
$$s_{new}=ks$$⇒$$s^2_{new}=k^2s^2$$
so the original variance is multiplied by $$k^2$$.
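
These identities can be sanity-checked numerically. In the sketch below the data list and the constant `k` are arbitrary illustrative values, not taken from the question, and `statistics.pstdev` is used because it divides by $$n$$, matching the formula for $$s$$ used above.

```python
from math import isclose
from statistics import fmean, pstdev

data = [2.0, 3.0, 7.0, 8.0]  # arbitrary illustrative data
k = 3.0                      # arbitrary positive constant

shifted = [x + k for x in data]
scaled = [k * x for x in data]

mean, sd = fmean(data), pstdev(data)

# Adding k shifts the mean by k and leaves the standard deviation unchanged.
shift_ok = isclose(fmean(shifted), mean + k) and isclose(pstdev(shifted), sd)

# Multiplying by k scales both the mean and the standard deviation by k.
scale_ok = isclose(fmean(scaled), k * mean) and isclose(pstdev(scaled), k * sd)

# The variance is therefore multiplied by k squared.
var_ok = isclose(pstdev(scaled) ** 2, k**2 * sd**2)

print(shift_ok, scale_ok, var_ok)
```

For a negative $$k$$ the standard deviation is multiplied by $$|k|$$, since $$\sqrt{k^2}=|k|$$; the variance relation is unaffected.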

### Question

Consider the following data

| $$x$$ | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| $$y$$ | 2 | 3 | 7 | 8 |

1. Find the mean and the variance for the values of $$x$$.
2. Find the mean and the variance for the values of $$y$$.
3. Find the correlation coefficient $$r$$.
4. Describe the relation between $$x$$ and $$y$$.
5. Find the equation $$y = ax+b$$ of the regression line for $$y$$ on $$x$$.
6. Find the equation $$x = cy+d$$ of the regression line for $$x$$ on $$y$$.
7. Find the inverse of the function in part 5. Is it the function in part 6?

Ans:

1. $$\mu_x = 2.5, \sigma_x^2= 1.11803^2= 1.25$$
2. $$\mu_y = 5, \sigma_y^2=2.54951^2= 6.5$$
3. 0.965
4. strong positive
5. $$y = 2.2x - 0.5$$
6. $$x = 0.423y + 0.385$$
7. $$y = 2.2x - 0.5$$ ⇔ $$y + 0.5 = 2.2x$$ ⇔ $$x = 0.455 y + 0.227$$. They are different.
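
The values quoted above would normally come from a GDC. As a sketch of what the calculator does, the following computes $$r$$ and both regression lines from the sums of squares about the means; the helper names (`sxx`, `syy`, `sxy`) are illustrative.

```python
from math import sqrt

xs = [1, 2, 3, 4]
ys = [2, 3, 7, 8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

r = sxy / sqrt(sxx * syy)

# Regression line of y on x: y = a*x + b
a = sxy / sxx
b = mean_y - a * mean_x

# Regression line of x on y: x = c*y + d
c = sxy / syy
d = mean_x - c * mean_y

print(round(r, 3), round(a, 3), round(b, 3), round(c, 3), round(d, 3))
```

Note that $$ac = r^2$$, so the line of $$x$$ on $$y$$ is the inverse of the line of $$y$$ on $$x$$ only when $$r=\pm 1$$.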

### Question

A fair six-sided die, with sides numbered 1, 1, 2, 3, 4, 5 is thrown. Find the mean and variance of the score.

Extra question
Find the new mean and the new variance if

1. Each number is increased by 3
2. Each number is multiplied by 3
3. Each number is increased by $$a$$, where $$a$$ is a positive integer
4. Each number is multiplied by $$a$$, where $$a$$ is a positive integer

Ans:

$$\mu=\frac{1}{6}(1+1+2+3+4+5)=\frac{8}{3}(=2.67)$$
$$\sigma^2=\frac{1}{6}(1+1+4+9+16+25)-\frac{64}{9}=\frac{20}{9}(=2.22)$$

Extra questions

1. $$\mu = 3+2.67=5.67$$ (increases by 3), $$\sigma^2 = 2.22$$ (unchanged)
2. $$\mu = 3\left(\frac{8}{3}\right)=8$$ (multiplied by 3), $$\sigma^2 = 9\left(\frac{20}{9}\right)=20$$ (multiplied by $$3^2$$)
3. $$\mu = a+2.67$$ (increases by $$a$$), $$\sigma^2 = 2.22$$ (unchanged)
4. $$\mu = 2.67a$$ (multiplied by $$a$$), $$\sigma^2 = 2.22a^2$$ (multiplied by $$a^2$$)
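
The mean and variance of the score, and the effect of the transformations in the extra question, can be checked directly from the six faces. This is a minimal sketch using exact fractions; the helper `transformed` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction

faces = [1, 1, 2, 3, 4, 5]  # each face is equally likely
n = len(faces)

mean = Fraction(sum(faces), n)
variance = Fraction(sum(f * f for f in faces), n) - mean**2

def transformed(scale, shift):
    """Mean and variance of scale*X + shift, computed directly from the faces."""
    new = [scale * f + shift for f in faces]
    m = Fraction(sum(new), n)
    v = Fraction(sum(x * x for x in new), n) - m**2
    return m, v

print(mean, variance)      # mean and variance of the score
print(transformed(1, 3))   # each number increased by 3
print(transformed(3, 0))   # each number multiplied by 3
```

Here $$\frac{20}{9}\approx 2.22$$ is the variance of the score: adding a constant leaves it unchanged, while multiplying by 3 multiplies it by 9.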

### Question

Consider the data set $$\{k - 2, k, k + 1, k + 4\}$$, where $$k \in \mathbb{R}$$.

1. Find the mean of this data set in terms of $$k$$.
Each number in the above data set is now decreased by 3.
2. Find the mean of this new data set in terms of $$k$$.

Ans:

1. Use of $$\overline{x}=\frac{\sum_{i=1}^{4}x_i}{n}$$
$$\overline{x}=\frac{(k-2)+k+(k+1)+(k+4)}{4}$$
$$\overline{x}=\frac{4k+3}{4}\left ( =k+\frac{3}{4} \right )$$
2. Either attempting to find the new mean or subtracting 3 from their $$\overline{x}$$
$$\overline{x}=\frac{4k+3}{4}-3\left ( =\frac{4k-9}{4},k-\frac{9}{4} \right )$$

### Question

Consider the six numbers $$2, 3, 6, 9, a$$ and $$b$$. The mean of the numbers is 6 and the variance is 10. Find the value of $$a$$ and of $$b$$, if $$a < b$$.

Ans:

$$\overline{x}=\frac{2+3+6+9+a+b}{6}$$
$$=\frac{20+a+b}{6}=6$$
⇒$$a+b=16$$
variance $$=\frac{\sum_{i=1}^{6}(x_i-6)^2}{6}$$
$$=\frac{4^2+3^2+0^2+3^2+(a-6)^2+(b-6)^2}{6}=10$$
⇒$$(a-6)^2+(b-6)^2=26$$
⇒$$(a-6)^2+(10-a)^2=26$$
Therefore, $$a=5,b=11$$
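
The last step solves a quadratic: substituting $$b=16-a$$ gives $$a^2-16a+55=0$$. The sketch below solves it with the quadratic formula and then verifies the mean and variance of the completed data set; the coefficient names `A`, `B`, `C` are illustrative.

```python
from math import sqrt

# (a - 6)^2 + (10 - a)^2 = 26 expands to 2a^2 - 32a + 110 = 0,
# i.e. a^2 - 16a + 55 = 0.
A, B, C = 1, -16, 55
disc = B * B - 4 * A * C
roots = sorted([(-B - sqrt(disc)) / (2 * A), (-B + sqrt(disc)) / (2 * A)])
a, b = roots  # the roots sum to 16, so the smaller one is a and the larger is b

# Verify against the conditions given in the question.
data = [2, 3, 6, 9, a, b]
mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)

print(a, b, mean, variance)
```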

### Question

A sample of discrete data is drawn from a population and given as $$66, 72, 65, 70, 69, 73, 65, 71, 75$$.
Find

1. the interquartile range;
2. the mean of the population;
3. the variance of the population.

Ans:

1. EITHER
Interquartile Range $$72.5-65.5=7$$
OR
$$72-66=6$$
2. $$\mu=\frac{626}{9}=69.6$$
3. $$\sigma^2=(3.402)^2=11.6$$
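
The two accepted interquartile ranges come from different quartile conventions. As a sketch, Python's `statistics.quantiles` happens to reproduce both for this data set through its `"exclusive"` and `"inclusive"` methods; this agreement is specific to this data set and is not guaranteed in general.

```python
from statistics import fmean, pvariance, quantiles

data = [66, 72, 65, 70, 69, 73, 65, 71, 75]

# Quartiles under the two interpolation conventions offered by the library.
q1_ex, _, q3_ex = quantiles(data, n=4, method="exclusive")
q1_in, _, q3_in = quantiles(data, n=4, method="inclusive")

iqr_exclusive = q3_ex - q1_ex
iqr_inclusive = q3_in - q1_in

mean = fmean(data)
variance = pvariance(data)  # divides by n, as in the answer above

print(iqr_exclusive, iqr_inclusive, round(mean, 1), round(variance, 1))
```

If an unbiased estimate of the population variance were wanted instead, `statistics.variance` (divisor $$n-1$$) gives approximately 13.0 for these nine values.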

### Question

A random sample drawn from a large population contains the following data $$6.2, 7.8, 12.1, 9.7, 5.2, 14.8, 16.2, 3.7$$ .
Calculate

1. the mean;
2. the variance.
Extra question
Find the median, the interquartile range, any outliers.

Ans:

Either by GDC or by the formulas
$$\mu = 9.46$$
$$\sigma^2 = (4.267)^2 = 18.2$$
Extra questions

1. The median is $$8.75$$
2. The interquartile range is $$13.45 - 5.7 = 7.75$$
3. Outliers = values
less than $$5.7-(1.5)(7.75) = -5.925$$ or
greater than $$13.45 + (1.5)(7.75) = 25.075$$
Hence, there are no outliers.
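
The outlier check can be sketched in a few lines. The code below assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data, which is the convention that gives $$Q_1=5.7$$ and $$Q_3=13.45$$ here; other conventions give slightly different quartiles.

```python
from statistics import median

data = [6.2, 7.8, 12.1, 9.7, 5.2, 14.8, 16.2, 3.7]
ordered = sorted(data)
half = len(ordered) // 2

# Quartiles as medians of the lower and upper halves (assumed convention;
# with an odd number of values the middle value is left out of both halves).
q1 = median(ordered[:half])
q3 = median(ordered[-half:])
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(median(data), q1, q3, iqr, lower_fence, upper_fence, outliers)
```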

### Question

A teacher drives to school. She records the time taken on each of 20 days. She finds that

$$\sum_{i=1}^{20}x_i=626$$ and $$\sum_{i=1}^{20}x_i^2=19780.8$$, where $$x_i$$ denotes the time, in minutes, taken on the $$i^{th}$$ day.
For this period, calculate

1. the mean time taken to drive to school;
2. the variance of the time taken to drive to school.
Extra Question
Find the sum $$\sum_{i=1}^{20}(x_i-\mu)^2$$.

Ans:

1. $$\overline{x}=\frac{626}{20}=31.3$$
2. $$\frac{\sum x^2}{n}-\overline{x}^2=\frac{19780.8}{20}-31.3^2=9.35$$
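
Both answers, and the sum asked for in the extra question, follow from the two given totals alone. The sketch below uses the identity $$\sum (x_i-\mu)^2=\sum x_i^2-n\mu^2$$, which is $$n$$ times the variance; variable names are illustrative.

```python
n = 20
sum_x = 626
sum_x2 = 19780.8

mean = sum_x / n
variance = sum_x2 / n - mean**2

# Extra question: sum of squared deviations from the mean,
# equal to n * variance.
sum_sq_dev = sum_x2 - n * mean**2

print(round(mean, 2), round(variance, 2), round(sum_sq_dev, 1))
```

On these figures the extra question's sum works out to $$20\times 9.35=187$$.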

### Question

Ten numbers have mean 9 and standard deviation 2. Find the sum of their squares.

Ans:

$$\sigma^2=\frac{\sum_{i=1}^{10}x_i^2}{10}-\mu^2$$⇒$$2^2=\frac{\sum_{i=1}^{10}x_i^2}{10}-9^2$$⇒$$\frac{\sum_{i=1}^{10}x_i^2}{10}=85$$⇒$$\sum_{i=1}^{10}x_i^2=850$$

### Question

Consider the 10 data items $$x_1,x_2,\ldots,x_{10}$$. Given that $$\sum_{i=1}^{10}x_i^2=1341$$ and the standard deviation is 6.9, find the value of $$\overline{x}$$.

Ans:

$$6.9^2=47.61$$
$$47.61=\frac{1341}{10}-\overline{x}^2$$
$$\overline{x}^2=86.49$$
$$\overline{x}=\pm 9.3$$

### Question

Twenty candidates sat an examination in French. The sum of their marks was 826 and the sum of the squares of their marks was 34132. Two candidates sat the examination late and their marks were $$a$$ and $$b$$. The new mean and variance were calculated, giving the following results:
mean = 42 and variance = 32.
Find a set of possible values of $$a$$ and $$b$$.

Ans:

$$\frac{a+b+826}{22}=42$$⇒$$a+b=98$$
$$\frac{a^2+b^2+34132}{22}-42^2=32$$
⇒$$a^2+b^2=5380$$
Solving the system of equations
$$a=32,b=66$$ (or vice versa)
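
The system can also be solved by a short search. The sketch below assumes the marks are non-negative integers, which the question does not state explicitly, and lists every pair with $$a \le b$$ that satisfies both equations.

```python
# Totals implied by the new mean (42) and variance (32) over 22 candidates.
n_new = 22
target_sum = 42 * n_new - 826                 # a + b
target_sum_sq = (32 + 42**2) * n_new - 34132  # a^2 + b^2

# Search integer marks only (an assumption), with a <= b.
solutions = [
    (a, target_sum - a)
    for a in range(0, target_sum // 2 + 1)
    if a**2 + (target_sum - a) ** 2 == target_sum_sq
]

print(target_sum, target_sum_sq, solutions)
```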

### Question

[use GDC]
A machine produces packets of sugar. The weights in grams of thirty packets chosen at random are shown below.

| Weight (g) | 29.6 | 29.7 | 29.8 | 29.9 | 30 | 30.1 | 30.2 | 30.3 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 2 | 3 | 4 | 5 | 7 | 5 | 3 | 1 |

Find

1. the mean of this sample.
2. the variance of this sample.

Ans:

$$\mu=29.9$$
$$\sigma^2=(0.1802)^2=0.0325$$
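
For readers without a GDC to hand, the following sketch computes the mean and variance straight from the frequency table, weighting each weight by its frequency and dividing by $$n=30$$.

```python
weights = [29.6, 29.7, 29.8, 29.9, 30.0, 30.1, 30.2, 30.3]
freqs = [2, 3, 4, 5, 7, 5, 3, 1]

n = sum(freqs)

# Frequency-weighted mean and variance (divisor n).
mean = sum(w * f for w, f in zip(weights, freqs)) / n
variance = sum(f * (w - mean) ** 2 for w, f in zip(weights, freqs)) / n
sd = variance ** 0.5

print(n, round(mean, 1), round(sd, 4), round(variance, 4))
```

The standard deviation is about 0.1802, so its square, the variance, is about 0.0325.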

### Question

In a sample of 50 boxes of light bulbs, the number of defective light bulbs per box is shown below.

| Number of defective light bulbs per box | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Number of boxes | 7 | 3 | 15 | 11 | 6 | 5 | 3 |

1. Calculate the median number of defective light bulbs per box.
2. Calculate the mean number of defective light bulbs per box.

Ans:

1. Recognising how to find the median ($$25.5^{\text{th}}$$ item)
median = $$\frac{2+3}{2}=2.5$$
2. $$\overline{x}=\frac{(0\times 7)+(1\times 3)+(2\times 15)+(3\times 11)+(4\times 6)+(5\times 5)+(6\times 3)}{50}$$
$$\overline{x}=\frac{133}{50}=2.66$$
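
One way to see why the median falls between 2 and 3 is to expand the table into the 50 individual values. A minimal sketch, with illustrative variable names:

```python
from statistics import fmean, median

defects = [0, 1, 2, 3, 4, 5, 6]
boxes = [7, 3, 15, 11, 6, 5, 3]

# Expand the frequency table into one value per box (already in order).
values = [d for d, f in zip(defects, boxes) for _ in range(f)]

# With 50 values the median is the average of the 25th and 26th items.
med = median(values)
mean = fmean(values)

print(len(values), values[24], values[25], med, mean)
```

The 25th value is the last 2 and the 26th is the first 3, which is why the median is 2.5.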

### Question

A sample of 70 batteries was tested to see how long they last. The results were:

| Time (hours) | Number of batteries (frequency) |
|---|---|
| $$0 ≤ t ≤ 10$$ | $$2$$ |
| $$10 ≤ t ≤ 20$$ | $$4$$ |
| $$20 ≤ t ≤ 30$$ | $$8$$ |
| $$30 ≤ t ≤ 40$$ | $$9$$ |
| $$40 ≤ t ≤ 50$$ | $$12$$ |
| $$50 ≤ t ≤ 60$$ | $$13$$ |
| $$60 ≤ t ≤ 70$$ | $$8$$ |
| $$70 ≤ t ≤ 80$$ | $$7$$ |
| $$80 ≤ t ≤ 90$$ | $$6$$ |
| $$90 ≤ t ≤ 100$$ | $$1$$ |
| Total | $$70$$ |

Find

1. the mean;
2. the standard deviation.

Ans:

1. mean = $$49.9$$
2. standard deviation = $$21.4$$ hours.

### Question

In a rental property business, the profits in Euros per year for 50 properties are shown in the following cumulative table.

| Profit $$(x)$$ | Number of properties with profit less than $$(x)$$ |
|---|---|
| $$-10000$$ | $$0$$ |
| $$-5000$$ | $$3$$ |
| $$0$$ | $$7$$ |
| $$5000$$ | $$22$$ |
| $$10000$$ | $$39$$ |
| $$15000$$ | $$44$$ |
| $$20000$$ | $$50$$ |

For this population of 50 properties, calculate an estimate for the standard deviation of the profit.

Ans: For re-arranging the information (e.g. the following table)

| Profit | -7500 | -2500 | 2500 | 7500 | 12500 | 17500 |
|---|---|---|---|---|---|---|
| Frequency | 3 | 4 | 15 | 17 | 5 | 6 |

$$\sigma=6422.61\ldots$$
$$\sigma \approx 6420$$
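
The re-arrangement step can be sketched as follows: class frequencies are the differences of consecutive cumulative counts, and each class is represented by its midpoint. Variable names are illustrative.

```python
from math import sqrt

bounds = [-10000, -5000, 0, 5000, 10000, 15000, 20000]
cumulative = [0, 3, 7, 22, 39, 44, 50]  # properties with profit below each bound

# Differences of consecutive cumulative counts give the class frequencies;
# midpoints of consecutive bounds stand in for the values in each class.
freqs = [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]
mids = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]

n = sum(freqs)
mean = sum(m * f for m, f in zip(mids, freqs)) / n
sd = sqrt(sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n)

print(freqs, mids, mean, round(sd, 2))
```

Because the midpoints stand in for the unknown individual profits, the result is only an estimate of the standard deviation.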

### Question

The 80 applicants for a Sports Science course were required to run 800 metres and their times were recorded. The results were used to produce the following cumulative frequency graph

Estimate

1. the median;
2. the interquartile range.

Ans:

1. Median = $$135$$
2. $$Q_1 = 130, Q_3 = 141$$, interquartile range $$= 141 - 130 = 11$$

### Question

The cumulative frequency curve below indicates the amount of time 250 students spend eating lunch.

1. Estimate the number of students who spend between 20 and 40 minutes eating lunch.
2. If 20 % of the students spend more than $$x$$ minutes eating lunch, estimate the value of $$x$$.

Extra questions

1. estimate the 40th percentile
2. estimate the 80th percentile

Ans:

1. 28 spent less than 20 minutes
184 spent less than 40 minutes
156 spent between 20 and 40 minutes
2. 80 % spent less than $$x$$ minutes
80 % of 250 = 200
$$x$$ = 44 minutes
Extra questions
1. 40th percentile = 30 minutes
2. 80th percentile = 44 minutes

### Question

A recruitment company tests the aptitude of 100 applicants applying for jobs in engineering.
Each applicant does a puzzle and the time taken, $$t$$, is recorded. The cumulative frequency curve for these data is shown below.

Using the cumulative frequency curve,

1. write down the value of the median;
2. determine the interquartile range;
3. complete the frequency table below.

| Time to complete puzzle in seconds | Number of applicants |
|---|---|
| $$20 < t ≤ 30$$ | |
| $$30 < t ≤ 35$$ | |
| $$35 < t ≤ 40$$ | |
| $$40 < t ≤ 45$$ | |
| $$45 < t ≤ 50$$ | |
| $$50 < t ≤ 60$$ | |
| $$60 < t ≤ 80$$ | |

Ans:

1. Median = 50 (allow 49 or 51)
2. Interquartile range = 60 – 40 = 20 (allow 59, 61, 39, 41 and corresponding difference)
3. The completed table:

| Time to complete puzzle in seconds | Number of applicants |
|---|---|
| $$20 < t ≤ 30$$ | 10 |
| $$30 < t ≤ 35$$ | 6 |
| $$35 < t ≤ 40$$ | 9 |
| $$40 < t ≤ 45$$ | 11 |
| $$45 < t ≤ 50$$ | 14 |
| $$50 < t ≤ 60$$ | 25 |
| $$60 < t ≤ 80$$ | 25 |

Notes: Allow $$\pm1$$ on each entry provided total adds up to $$100$$.

### Question

The following is the cumulative frequency diagram for the heights of 30 plants given in centimetres.

1. Use the diagram to estimate the median height.
2. Complete the following frequency table.

| Height ($$h$$) | Frequency |
|---|---|
| $$0\le h <5$$ | 4 |
| $$5\le h <10$$ | 9 |
| $$10\le h <15$$ | |
| $$15\le h <20$$ | |
| $$20\le h <25$$ | |

3. Hence estimate the mean height.

Ans:

1. Median = 11 (accept any values in range 11 to 11.5 inclusive)
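2. The completed table:

| Height ($$h$$) | Frequency |
|---|---|
| $$0\le h <5$$ | 4 |
| $$5\le h <10$$ | 9 |
| $$10\le h <15$$ | 8 |
| $$15\le h <20$$ | 5 |
| $$20\le h <25$$ | 4 |

3. Using mid values in calculation of the mean
$$\text{mean}=\frac{(2.5\times 4)+(7.5\times 9)+(12.5\times 8)+(17.5\times 5)+(22.5\times 4)}{30}=\frac{355}{30}=11.83\ (=11.8)$$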

### Question

The box-and-whisker plots shown represent the heights of female students and the heights of male students at a certain school.

1. What percentage of female students are shorter than any male students?
2. What percentage of male students are shorter than some female students?