IB Mathematics AHL 4.12 Design of valid data collection methods AI HL Paper 2- Exam Style Questions- New Syllabus

Question

Goran is studying the frequency of sightings of a specific bird species over the 50 weeks following September 1st. He has gathered data, presented in the table below.

| Number of sightings | 0 | 1 | 2 | 3 | 4 | 5 | More than 5 |
|---|---|---|---|---|---|---|---|
| Number of weeks | 8 | 16 | 13 | 8 | 3 | 2 | 0 |

The mean number of sightings per week, derived from this sample, is 1.76.

(a) Determine the unbiased estimate of the population variance for the number of sightings per week.

(b) Goran hypothesizes that the data conforms to a Poisson distribution. Explain how your result from part (a) supports this hypothesis.

(c) Goran conducts a significance test at the 5% level to verify his hypothesis. His null hypothesis is that \( X \sim \text{Po}(m) \), where \( X \) represents the number of sightings per week. He estimates \( m \) as the sample mean, 1.76, and calculates the expected frequencies for the 50 weeks following September 1st. These are shown to two decimal places in the table below.

| Number of sightings | 0 | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|---|
| Expected frequencies | 8.60 | 15.14 | 13.32 | 7.82 | \( j \) | \( k \) |

(c)(i) Compute the value of \( j \).

(c)(ii) Compute the value of \( k \).

(d) Provide a reason why Goran should combine categories for his significance test.

(e) Specify the degrees of freedom for the test.

(f) Calculate the \( p \)-value for the test.

(g) State the conclusion of the test and justify your decision.

Markscheme

(a) Unbiased estimate of variance

The unbiased sample variance is given by

\( s^2 = \frac{1}{n-1} \sum f_i (x_i - \bar{x})^2 \)

where \( n = 50 \) (total weeks), \( \bar{x} = 1.76 \) (sample mean), \( x_i \) is the number of sightings, and \( f_i \) is the frequency of each sighting. Let’s calculate step-by-step:

  • For \( x = 0 \): \( (0 - 1.76)^2 \times 8 = 3.0976 \times 8 = 24.7808 \)
  • For \( x = 1 \): \( (1 - 1.76)^2 \times 16 = 0.5776 \times 16 = 9.2416 \)
  • For \( x = 2 \): \( (2 - 1.76)^2 \times 13 = 0.0576 \times 13 = 0.7488 \)
  • For \( x = 3 \): \( (3 - 1.76)^2 \times 8 = 1.5376 \times 8 = 12.3008 \)
  • For \( x = 4 \): \( (4 - 1.76)^2 \times 3 = 5.0176 \times 3 = 15.0528 \)
  • For \( x = 5 \): \( (5 - 1.76)^2 \times 2 = 10.4976 \times 2 = 20.9952 \)
  • For \( x > 5 \): \( 0 \) (frequency = 0)

Summing these terms: \( \sum f_i (x_i - \bar{x})^2 = 24.7808 + 9.2416 + 0.7488 + 12.3008 + 15.0528 + 20.9952 = 83.1200 \)

Then, \( s^2 = \frac{83.1200}{50 - 1} = \frac{83.1200}{49} \approx 1.6963 \)

Rounded to 3 significant figures: \( s^2 \approx 1.70 \quad (s \approx 1.302) \).

Note: Dividing by \( n \) instead of \( n - 1 \) gives the biased sample variance \( \frac{83.1200}{50} \approx 1.662 \), which is not the unbiased estimate asked for here.
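
As a cross-check on the arithmetic above, here is a minimal Python sketch that recomputes both divisors from the frequency table. The variable names (`observed`, `ss`, `unbiased_var`, `biased_var`) are illustrative choices, not part of the markscheme.

```python
# Observed frequency table from the question: sightings -> number of weeks.
observed = {0: 8, 1: 16, 2: 13, 3: 8, 4: 3, 5: 2}

n = sum(observed.values())                                   # total number of weeks
mean = sum(x * f for x, f in observed.items()) / n           # sample mean
ss = sum(f * (x - mean) ** 2 for x, f in observed.items())   # sum of f * (x - mean)^2

unbiased_var = ss / (n - 1)   # divisor n - 1 (unbiased estimate)
biased_var = ss / n           # divisor n (for comparison only)

print(n, round(mean, 2), round(unbiased_var, 4), round(biased_var, 4))
```

On a graphic display calculator the same value is usually read off as the square of the sample standard deviation from one-variable statistics.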

[3 marks]

(b) Comparing mean and variance

The mean number of sightings per week is given as \( 1.76 \), and the variance from part (a) is approximately \( 1.70 \).

For a Poisson distribution, the mean and variance are theoretically equal. The close values of \( 1.76 \) (mean) and \( 1.70 \) (variance) suggest that the data aligns with the characteristics of a Poisson distribution, supporting Goran’s hypothesis.

[1 mark]

(c)(i) Computing \( j \)

The expected frequency for \( X = 4 \) is calculated using the Poisson probability formula:

\( P(X = 4) = \frac{e^{-1.76} \cdot (1.76)^4}{4!} \)

Step-by-step:

  • \( e^{-1.76} \approx 0.172045 \)
  • \( (1.76)^4 = 1.76 \cdot 1.76 \cdot 1.76 \cdot 1.76 \approx 9.595126 \)
  • \( 4! = 24 \)
  • \( P(X = 4) = \frac{0.172045 \cdot 9.595126}{24} \approx \frac{1.650793}{24} \approx 0.068783 \)

Expected frequency \( j = 50 \times 0.068783 \approx 3.43915 \), which rounds to 3.44.
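
The Poisson calculation can be checked with a short Python sketch using only the standard library. The helper name `poisson_pmf` is a hypothetical choice for illustration; the values \( m = 1.76 \) and 50 weeks come from the question.

```python
import math

def poisson_pmf(k: int, m: float) -> float:
    """P(X = k) for X ~ Po(m)."""
    return math.exp(-m) * m ** k / math.factorial(k)

m = 1.76      # estimated mean from the question
weeks = 50    # total number of weeks observed

# Expected frequencies for 0, 1, 2, 3 and 4 sightings.
expected = [weeks * poisson_pmf(k, m) for k in range(5)]
j = expected[4]

print([round(e, 2) for e in expected])
```

The first four entries should agree with the expected frequencies given in the question's table, which is a useful sanity check before reading off \( j \).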

[2 marks]

(c)(ii) Computing \( k \)

The expected frequency for \( X \geq 5 \) is derived by subtracting the sum of expected frequencies for \( X = 0 \) to \( X = 4 \) from the total number of weeks (50):

\( k = 50 - (8.60 + 15.14 + 13.32 + 7.82 + j) \)

Using \( j \approx 3.44 \):

  • \( 8.60 + 15.14 + 13.32 + 7.82 + 3.44 = 48.32 \)
  • \( k = 50 - 48.32 = 1.68 \)

Thus, \( k \approx 1.68 \).
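
As a hedged check, the tail frequency can also be computed from unrounded probabilities as \( 50 \times P(X \geq 5) = 50 \times (1 - P(X \leq 4)) \). The sketch below compares that with the subtraction of two-decimal table values; `k_exact` and `k_rounded` are illustrative names.

```python
import math

m, weeks = 1.76, 50

# P(X <= 4) as a sum of Poisson probabilities for x = 0, 1, 2, 3, 4.
p_le_4 = sum(math.exp(-m) * m ** x / math.factorial(x) for x in range(5))

k_exact = weeks * (1 - p_le_4)                            # from unrounded probabilities
k_rounded = 50 - (8.60 + 15.14 + 13.32 + 7.82 + 3.44)     # from two-decimal table values

print(round(k_exact, 2), round(k_rounded, 2))
```

Both routes agree to two decimal places here, so rounding the earlier expected frequencies does not change the stated value of \( k \).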

[3 marks]

(d) Reason for combining categories

In a chi-squared test, each expected frequency should be at least 5. Since the expected values for \( X = 4 \) (\( j \approx 3.44 \)) and \( X \geq 5 \) (\( k \approx 1.68 \)) are less than 5, these categories need to be combined to maintain the validity of the test.
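
One way to picture the combining step is the sketch below, which merges categories from the upper tail until the last expected frequency reaches 5. The function `merge_small_tail` is a hypothetical helper written for this example, and it assumes only the tail categories are too small, as is the case here.

```python
def merge_small_tail(observed, expected, minimum=5.0):
    """Merge the last category into its neighbour until the final
    expected frequency is at least `minimum` (tail-only merging)."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_e = exp.pop()      # remove the last expected frequency...
        last_o = obs.pop()      # ...and the matching observed frequency
        exp[-1] += last_e       # add both to the new last category
        obs[-1] += last_o
    return obs, exp

observed = [8, 16, 13, 8, 3, 2]                      # 0, 1, 2, 3, 4, 5 or more
expected = [8.60, 15.14, 13.32, 7.82, 3.44, 1.68]    # two-decimal table values

obs_merged, exp_merged = merge_small_tail(observed, expected)
print(obs_merged, [round(e, 2) for e in exp_merged])
```

After merging, the final category is "4 or more", with observed frequency 5 and expected frequency 3.44 + 1.68 = 5.12.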

[1 mark]

(e) Degrees of freedom

The degrees of freedom are calculated as:

\( df = \text{(number of categories)} - 1 - \text{(number of estimated parameters)} \)

After combining categories, there are 5 categories (0, 1, 2, 3, 4 or more), minus 1 for the constraint that frequencies sum to \( n \), minus 1 for the estimated mean parameter \( m \):

\( df = 5 - 1 - 1 = 3 \).

[1 mark]

(f) \( p \)-value of the test

The chi-squared statistic is computed as:

\( \chi^2 = \sum \frac{(O – E)^2}{E} \).

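With the last two categories combined, the observed frequencies are 8, 16, 13, 8, 5 and the expected frequencies are 8.60, 15.14, 13.32, 7.82, 5.12. Carrying out the calculation gives \( \chi^2 \approx 0.105 \).

With 3 degrees of freedom, the \( p \)-value is therefore very large: \( p \approx 0.991 \).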
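
The test statistic and \( p \)-value can be reproduced with the standard-library sketch below. It assumes the combined categories (0, 1, 2, 3, 4 or more) and uses a closed-form upper-tail formula that holds only for 3 degrees of freedom; `chi2_sf_df3` is a hypothetical helper name. In an exam, the calculator's built-in goodness-of-fit test would normally be used instead.

```python
import math

# Combined categories: 0, 1, 2, 3, 4 or more sightings.
observed = [8, 16, 13, 8, 5]
expected = [8.60, 15.14, 13.32, 7.82, 5.12]

# Goodness-of-fit statistic: sum of (O - E)^2 / E.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Categories, minus 1 for the total, minus 1 for the estimated mean.
df = len(observed) - 1 - 1

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(chi-squared > x), valid for 3 df only."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(chi2)
print(df, round(chi2, 3), round(p_value, 3))
```

The closed form is chosen only to keep the example free of third-party dependencies; a statistics library's chi-squared survival function would serve the same purpose for general degrees of freedom.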

[2 marks]

(g) Conclusion

Since \( p \approx 0.991 > 0.05 \), the result is not significant at the 5% level.

We do not reject \( H_0 \). There is insufficient evidence to reject the hypothesis that the weekly sightings follow a Poisson distribution.

[2 marks]

[Total: 15 marks]
