Question 1
(i) Positive:
(ii) Linear:
(iii) Strong:
The data collected from the wolves were used to create the least-squares equation \(\hat{y}=-16.46+35.02x.\)
Most-appropriate topic codes (CED):
• TOPIC 2.8: Least Squares Regression
• TOPIC 2.7: Residuals
▶️ Answer/Explanation
(a)
(i) Positive: This means that as the length of a wolf increases, its weight also tends to increase.
(ii) Linear: This means that the data points on the scatterplot tend to follow a straight-line pattern. For a constant increase in length, there is a roughly constant increase in weight.
(iii) Strong: This means that the data points are tightly clustered around the linear trend, indicating that the linear model is a good fit for the data.
(b)
The slope of \(35.02\) means that for each additional meter in a wolf’s length, the predicted weight of the wolf increases by approximately \(35.02\) kilograms.
(c)
First, calculate the predicted weight for the wolf using the regression equation.
Predicted weight (\(\hat{y}\)) \( = -16.46 + 35.02(1.4) = -16.46 + 49.028 = 32.568\) kg.
Next, use the formula for a residual: Residual = Actual weight − Predicted weight.
\(-9.67 = \text{Actual weight} - 32.568\)
Actual weight \( = 32.568 - 9.67 = 22.898\) kg.
\(\boxed{\text{Actual weight} \approx 22.9 \text{ kg}}\)
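The arithmetic in part (c) can be double-checked with a short script. This is only a sketch: the function names `predict_weight` and `actual_from_residual` are illustrative and do not come from the problem; the coefficients are those of the regression equation above.

```python
# Coefficients of the least-squares line y-hat = -16.46 + 35.02x (from the problem).
INTERCEPT = -16.46
SLOPE = 35.02


def predict_weight(length_m: float) -> float:
    """Predicted weight (kg) for a wolf of the given length (m)."""
    return INTERCEPT + SLOPE * length_m


def actual_from_residual(length_m: float, residual: float) -> float:
    """Solve residual = actual - predicted for the actual weight."""
    return predict_weight(length_m) + residual


predicted = predict_weight(1.4)
actual = actual_from_residual(1.4, -9.67)
print(round(predicted, 3), round(actual, 3))
```

Because the residual is negative, the actual weight comes out below the predicted weight, as expected.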
Question 2
Most-appropriate topic codes (CED):
• TOPIC 6.3: Justifying a Claim Based on a Confidence Interval for a Population Proportion — part (b)
▶️ Answer/Explanation
(a)
State: We want to construct a \(95\%\) confidence interval for \(p\), the true proportion of all customers who ask for a water cup and fill it with a soft drink.
Plan: The appropriate procedure is a one-sample z-interval for a population proportion. We must check the conditions:
1. Random: The problem states that the manager selected a random sample of \(80\) customers. This condition is met.
2. Large Counts (for Normality): We must check if the number of successes and failures are both at least \(10\).
- Number of successes = \(n\hat{p} = 23\). This is \( \ge 10 \).
- Number of failures = \(n(1-\hat{p}) = 80 - 23 = 57\). This is \( \ge 10 \).
Since both conditions are met, we can proceed with constructing the interval.
Do: The sample proportion is \(\hat{p} = \frac{23}{80} = 0.2875\). For a \(95\%\) confidence level, the critical value is \(z^* = 1.96\).
The confidence interval is calculated as:
\[ \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
\[ 0.2875 \pm 1.96 \sqrt{\frac{0.2875(1-0.2875)}{80}} \]
\[ 0.2875 \pm 1.96 \sqrt{\frac{(0.2875)(0.7125)}{80}} \]
\[ 0.2875 \pm 1.96(0.0506) \]
\[ 0.2875 \pm 0.0992 \]
The interval is \((0.1883, 0.3867)\).
Conclude: We are \(95\%\) confident that the interval from \(0.1883\) to \(0.3867\) captures the true proportion of all customers who, having asked for a water cup, will fill the cup with a soft drink from the beverage fountain.
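As a check on the "Plan" and "Do" steps, the interval can be recomputed in a few lines. This is a minimal sketch using the counts from the problem and the rounded critical value \(z^* = 1.96\); the variable names are illustrative.

```python
import math

successes, n = 23, 80   # sample counts from the problem
z_star = 1.96           # rounded critical value for 95% confidence

p_hat = successes / n

# Large Counts check: successes and failures should both be at least 10.
large_counts_ok = successes >= 10 and (n - successes) >= 10

standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * standard_error
ci_low, ci_high = p_hat - margin, p_hat + margin

print(large_counts_ok, round(ci_low, 4), round(ci_high, 4))
```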
(b)
To create an interval estimate for the total cost, we multiply the endpoints of the confidence interval for the proportion by the total number of customers who ask for a water cup (\(3,000\)) and the cost per customer (\(\$0.25\)).
Lower bound of cost estimate:
\[ 0.1883 \times 3,000 \times \$0.25 = \$141.225 \]
Upper bound of cost estimate:
\[ 0.3867 \times 3,000 \times \$0.25 = \$290.025 \]
Thus, a \(95\%\) confidence interval for the cost to the restaurant in June is from \(\$141.23\) to \(\$290.03\).
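The scaling step in part (b) is a linear transformation of the interval endpoints, which a short sketch makes explicit. The helper name `to_cost` is hypothetical; the inputs are the rounded endpoints from part (a) and the figures given in the problem.

```python
ci_low, ci_high = 0.1883, 0.3867   # rounded interval for p from part (a)
customers = 3000                   # customers who ask for a water cup in June
cost_per_customer = 0.25           # dollars lost per customer who takes a soft drink


def to_cost(proportion: float) -> float:
    """Convert a proportion of customers into a total dollar cost."""
    return proportion * customers * cost_per_customer


cost_low, cost_high = to_cost(ci_low), to_cost(ci_high)
print(round(cost_low, 3), round(cost_high, 3))
```

Multiplying both endpoints by the same positive constant preserves the confidence level, which is why the cost interval is still a 95% interval.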
Question 3
Most-appropriate topic codes (CED):
• TOPIC 4.5: Conditional Probability — part (c)
• TOPIC 4.7: Introduction to Random Variables and Probability Distributions — part (a)
▶️ Answer/Explanation
(a)
Let \( X \) represent the diameter of a melon from Distributor J.
\( X \sim N(133, 5) \)
We want \( P(X > 137) \).
\( z = \frac{137 - 133}{5} = \frac{4}{5} = 0.8 \)
\( P(z > 0.8) = 1 - P(z < 0.8) = 1 - 0.7881 = 0.2119 \)
The probability is \( \boxed{0.2119} \).
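The table lookup can be confirmed with Python's standard-library `statistics.NormalDist`. This is a sketch; the variable names are illustrative, and the mean of 133 mm and standard deviation of 5 mm are taken from the solution above.

```python
from statistics import NormalDist

# Diameter of a Distributor J melon: mean 133 mm, standard deviation 5 mm.
diameter_j = NormalDist(mu=133, sigma=5)

z = (137 - diameter_j.mean) / diameter_j.stdev
p_greater = 1 - diameter_j.cdf(137)   # P(X > 137)

print(z, round(p_greater, 4))
```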
(b)
Let \( G \) be the event that a melon has diameter greater than 137 mm.
From part (a): \( P(G \mid J) = 0.2119 \)
Given: \( P(G \mid K) = 0.8413 \)
\( P(J) = 0.7 \), \( P(K) = 0.3 \)
Using the law of total probability:
\( P(G) = P(G \mid J)P(J) + P(G \mid K)P(K) \)
\( P(G) = (0.2119)(0.7) + (0.8413)(0.3) = 0.14833 + 0.25239 = 0.40072 \)
The probability is \( \boxed{0.4007} \).
(c)
We want \( P(J \mid G) \).
Using Bayes’ theorem:
\( P(J \mid G) = \frac{P(G \mid J)P(J)}{P(G)} = \frac{(0.2119)(0.7)}{0.4007} = \frac{0.14833}{0.4007} \approx 0.3702 \)
The probability is \( \boxed{0.3702} \).
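Parts (b) and (c) chain the law of total probability into Bayes' theorem, which can be sketched as follows. The variable names are illustrative; the inputs are the rounded probabilities used in the solution, so the result agrees with the hand calculation to about three decimal places (roughly 0.370).

```python
p_j, p_k = 0.7, 0.3        # share of melons from Distributors J and K
p_g_given_j = 0.2119       # P(diameter > 137 mm | J), from part (a)
p_g_given_k = 0.8413       # P(diameter > 137 mm | K), given

# Law of total probability: weight each conditional probability by its source.
p_g = p_g_given_j * p_j + p_g_given_k * p_k

# Bayes' theorem: probability the melon came from J, given it is larger than 137 mm.
p_j_given_g = p_g_given_j * p_j / p_g

print(round(p_g, 4), round(p_j_given_g, 3))
```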
Question 4
Most-appropriate topic codes (CED):
• TOPIC 1.6: Describing the Distribution of a Quantitative Variable — part (b-i)
• TOPIC 2.3: Statistics for Two Categorical Variables — part (b-ii)
▶️ Answer/Explanation
(a)
For chemical Z, the median percentage is similar across all three sites, approximately \( 7\% \) for each site. However, the variability differs substantially among the sites. Site II has the smallest range (approximately \( 6\% \) to \( 8\% \)), Site I has a moderate range (approximately \( 4\% \) to \( 10\% \)), and Site III has the largest range (approximately \( 3\% \) to \( 11\% \)).
(b)
(i) The piece most likely originated from \( \boxed{\text{Site III}} \).
Calculating the approximate sum ranges for each site:
• Site I: Minimum sum ≈ \( 6 + 11 + 4 = 21\% \), Maximum sum ≈ \( 8 + 15 + 10 = 33\% \)
• Site II: Minimum sum ≈ \( 5 + 1.9 + 6 = 12.9\% \), Maximum sum ≈ \( 7 + 4 + 8 = 19\% \)
• Site III: Minimum sum ≈ \( 5 + 6 + 3 = 14\% \), Maximum sum ≈ \( 7.5 + 8 + 11 = 26.5\% \)
Only Site III has a range (14% to 26.5%) that includes 20.5%.
Chemical Concentration Ranges by Site (approximate percentages)
| Chemical | Site I Min | Site I Max | Site II Min | Site II Max | Site III Min | Site III Max |
|---|---|---|---|---|---|---|
| X | \(6\) | \(8\) | \(5\) | \(7\) | \(5\) | \(7.5\) |
| Y | \(11\) | \(15\) | \(1.9\) | \(4\) | \(6\) | \(8\) |
| Z | \(4\) | \(10\) | \(6\) | \(8\) | \(3\) | \(11\) |
| Sum | \(21\) | \(33\) | \(12.9\) | \(19\) | \(14\) | \(26.5\) |
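The site comparison in part (b-i) amounts to checking which site's sum range contains 20.5%. A minimal sketch, assuming the approximate minimum and maximum values tabulated above (they are read from the boxplots, so they are estimates); the names `ranges` and `sum_range` are illustrative.

```python
# Approximate (min, max) percentages per chemical, as tabulated above.
ranges = {
    "Site I":   {"X": (6, 8),   "Y": (11, 15), "Z": (4, 10)},
    "Site II":  {"X": (5, 7),   "Y": (1.9, 4), "Z": (6, 8)},
    "Site III": {"X": (5, 7.5), "Y": (6, 8),   "Z": (3, 11)},
}
observed_sum = 20.5  # sum of the three percentages for the piece in question


def sum_range(chemicals):
    """Smallest and largest possible sums of the three chemical percentages."""
    low = sum(lo for lo, _ in chemicals.values())
    high = sum(hi for _, hi in chemicals.values())
    return low, high


candidates = []
for site, chemicals in ranges.items():
    low, high = sum_range(chemicals)
    if low <= observed_sum <= high:
        candidates.append(site)

print(candidates)
```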
(ii) \( \boxed{\text{Chemical Y}} \) would be most useful for identifying the site.
Chemical Y shows the clearest distinction among the three sites with no overlap in the boxplots. Site I has high percentages (approximately 11% to 15%), Site II has low percentages (approximately 1.9% to 4%), and Site III has moderate percentages (approximately 6% to 8%). In contrast, chemicals X and Z show substantial overlap among the sites, making them less useful for distinguishing the origin.
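The "no overlap" argument can be tested mechanically by comparing every pair of sites for each chemical. This sketch uses the approximate minimum and maximum values from part (b-i); the helper `overlaps` is a hypothetical name, and the endpoints are estimates read from the boxplots.

```python
from itertools import combinations

# Approximate (min, max) percentages per site, grouped by chemical.
ranges = {
    "X": {"Site I": (6, 8),   "Site II": (5, 7),   "Site III": (5, 7.5)},
    "Y": {"Site I": (11, 15), "Site II": (1.9, 4), "Site III": (6, 8)},
    "Z": {"Site I": (4, 10),  "Site II": (6, 8),   "Site III": (3, 11)},
}


def overlaps(a, b):
    """True if the closed intervals a and b share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# A chemical separates the sites if no two site ranges overlap.
separating = [
    chemical
    for chemical, sites in ranges.items()
    if not any(overlaps(a, b) for a, b in combinations(sites.values(), 2))
]

print(separating)
```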
Question 5
| Age-Group (years) | 20 to 29 | 30 to 39 | 40 to 49 | 50 to 59 | Total |
|---|---|---|---|---|---|
| Women | 46 | 40 | 21 | 12 | 119 |
| Men | 53 | 23 | 9 | 3 | 88 |
| Total | 99 | 63 | 30 | 15 | 207 |
Most-appropriate topic codes (CED):
• TOPIC 8.5: Setting Up a Chi-Square Test for Homogeneity or Independence — hypotheses and conditions
• TOPIC 8.6: Carrying Out a Chi-Square Test for Homogeneity or Independence — test statistic and conclusion
▶️ Answer/Explanation
We will conduct a chi-square test for independence using the four-step process.
State:
\( H_0 \): There is no association between age-group and gender in the diagnosis of schizophrenia.
\( H_a \): There is an association between age-group and gender in the diagnosis of schizophrenia.
Plan:
We will use a chi-square test for independence.
Conditions:
1. Random sample: Satisfied – stated in the problem
2. Expected counts: All expected counts should be at least 5
| Age-Group | 20-29 | 30-39 | 40-49 | 50-59 |
|---|---|---|---|---|
| Women (Expected) | 56.91 | 36.22 | 17.25 | 8.62 |
| Men (Expected) | 42.09 | 26.78 | 12.75 | 6.38 |
All expected counts are greater than 5, so conditions are satisfied.
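Each expected count is (row total × column total) / grand total. The sketch below recomputes the table from the observed counts in the problem; the variable names are illustrative.

```python
# Observed counts by age-group: 20-29, 30-39, 40-49, 50-59.
observed = {
    "Women": [46, 40, 21, 12],
    "Men":   [53, 23, 9, 3],
}

row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(column) for column in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total.
expected = {
    group: [row_totals[group] * col / grand_total for col in col_totals]
    for group in observed
}
all_at_least_5 = all(e >= 5 for row in expected.values() for e in row)

for group, row in expected.items():
    print(group, [round(e, 2) for e in row])
print(all_at_least_5)
```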
Do:
Test statistic: \( \chi^2 = \sum \frac{(O - E)^2}{E} \)
Calculations (each contribution is computed from the unrounded expected count, so it may differ in the last digit from a calculation with the rounded values shown):
Women 20-29: \( \frac{(46 - 56.91)^2}{56.91} = 2.093 \)
Women 30-39: \( \frac{(40 - 36.22)^2}{36.22} = 0.395 \)
Women 40-49: \( \frac{(21 - 17.25)^2}{17.25} = 0.817 \)
Women 50-59: \( \frac{(12 - 8.62)^2}{8.62} = 1.322 \)
Men 20-29: \( \frac{(53 - 42.09)^2}{42.09} = 2.830 \)
Men 30-39: \( \frac{(23 - 26.78)^2}{26.78} = 0.534 \)
Men 40-49: \( \frac{(9 - 12.75)^2}{12.75} = 1.105 \)
Men 50-59: \( \frac{(3 - 6.38)^2}{6.38} = 1.788 \)
Total: \( \chi^2 = 2.093 + 0.395 + 0.817 + 1.322 + 2.830 + 0.534 + 1.105 + 1.788 = 10.884 \)
Degrees of freedom: \( (4 - 1) \times (2 - 1) = 3 \)
p-value: \( P(\chi^2 \geq 10.884) = 0.012 \)
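The test statistic and p-value can be reproduced without a statistics package. This is a sketch under one stated assumption: the helper `chi2_sf_df3` (a hypothetical name) uses the closed-form upper-tail probability that holds only for a chi-square distribution with exactly 3 degrees of freedom, \( P(\chi^2 \ge x) = \operatorname{erfc}\!\left(\sqrt{x/2}\right) + \sqrt{2x/\pi}\, e^{-x/2} \). For other degrees of freedom a library routine would be needed.

```python
import math

# Observed counts: rows are Women, Men; columns are the four age-groups.
observed = [[46, 40, 21, 12], [53, 23, 9, 3]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (O - E)^2 / E over all cells, with E = row total * column total / n.
chi_sq = sum(
    (obs - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(observed, row_totals)
    for obs, c in zip(row, col_totals)
)
df = (len(observed) - 1) * (len(col_totals) - 1)


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability for a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(chi_sq)
print(round(chi_sq, 3), df, round(p_value, 3))
```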
Conclude:
Since the p-value (0.012) is less than \( \alpha = 0.05 \), we reject the null hypothesis. There is convincing statistical evidence of an association between age-group and gender in the diagnosis of schizophrenia.
Question 6
| Arrangement | Treatment | Control |
|---|---|---|
| A | Man 1, Man 2 | Woman 1, Woman 2 |
| B | Man 1, Woman 1 | Man 2, Woman 2 |
| C | Man 1, Woman 2 | Man 2, Woman 1 |
| D | Woman 1, Woman 2 | Man 1, Man 2 |
| E | Man 2, Woman 1 | Man 1, Woman 2 |
| F | Man 2, Woman 2 | Man 1, Woman 1 |

| Arrangement | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Probability | | | | | | |
Most-appropriate topic codes (CED):
• TOPIC 4.3: Introduction to Probability — parts (a), (b)
• TOPIC 3.6: Selecting an Experimental Design — part (c)
▶️ Answer/Explanation
(a) Sequential Coin Flip Method
(i) The probabilities for each arrangement:
Arrangements and Probabilities for Coin Outcomes
| Arrangement | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Coin outcomes | TT | THT | THH | HH | HTT | HTH |
| Calculation | \(\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\!\left(\tfrac{1}{2}\right)\) |
| Probability | \(\tfrac{1}{4}\) | \(\tfrac{1}{8}\) | \(\tfrac{1}{8}\) | \(\tfrac{1}{4}\) | \(\tfrac{1}{8}\) | \(\tfrac{1}{8}\) |
Justification: Arrangements A and D occur when the first two people get the same assignment (TT or HH), with probability \( \left(\frac{1}{2}\right)^2 = \frac{1}{4} \) each. Arrangements B, C, E, and F occur when the first two people get different assignments, requiring a third flip to determine the final assignment, with probability \( \frac{1}{8} \) each.
(ii) Man 1 and Man 2 are assigned to the same group in Arrangements A and D.
Probability = \( P(A) + P(D) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \).
\( \boxed{\frac{1}{2}} \)
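The coin-method probabilities can be verified by exact enumeration. The sketch below rests on assumptions inferred from the solution above, not on the original question text: people are assigned in the order Man 1, Man 2, Woman 1, Woman 2; each flip sends the current person to treatment or control with probability \( \frac{1}{2} \); and once a group has two people, everyone remaining goes to the other group. The function name `coin_method` is illustrative.

```python
from fractions import Fraction

PEOPLE = ["Man 1", "Man 2", "Woman 1", "Woman 2"]  # assumed assignment order
HALF = Fraction(1, 2)


def coin_method(index=0, treatment=(), control=(), prob=Fraction(1)):
    """Yield (treatment_group, probability) for every way the flips can play out."""
    if len(treatment) == 2 or len(control) == 2:
        # One group is full: everyone left joins the other group, no flip needed.
        rest = tuple(PEOPLE[index:])
        final_treatment = treatment if len(treatment) == 2 else treatment + rest
        yield final_treatment, prob
        return
    person = PEOPLE[index]
    yield from coin_method(index + 1, treatment + (person,), control, prob * HALF)
    yield from coin_method(index + 1, treatment, control + (person,), prob * HALF)


distribution = {}
for group, prob in coin_method():
    distribution[group] = distribution.get(group, 0) + prob

# Men share a group when both are in treatment (A) or both are in control (D).
p_men_together = (distribution[("Man 1", "Man 2")]
                  + distribution[("Woman 1", "Woman 2")])

for group, prob in distribution.items():
    print(group, prob)
print(p_men_together)
```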
(b) Chip Method
(i) The probabilities for each arrangement:
Arrangements and Probabilities for Chip Outcomes
| Arrangement | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Chip outcomes | TT | TCT | TCC | CC | CTT | CTC |
| Calculation | \(\left(\tfrac{2}{4}\right)\!\left(\tfrac{1}{3}\right)\) | \(\left(\tfrac{2}{4}\right)\!\left(\tfrac{2}{3}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{2}{4}\right)\!\left(\tfrac{2}{3}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{2}{4}\right)\!\left(\tfrac{1}{3}\right)\) | \(\left(\tfrac{2}{4}\right)\!\left(\tfrac{2}{3}\right)\!\left(\tfrac{1}{2}\right)\) | \(\left(\tfrac{2}{4}\right)\!\left(\tfrac{2}{3}\right)\!\left(\tfrac{1}{2}\right)\) |
| Probability | \(\tfrac{1}{6}\) | \(\tfrac{1}{6}\) | \(\tfrac{1}{6}\) | \(\tfrac{1}{6}\) | \(\tfrac{1}{6}\) | \(\tfrac{1}{6}\) |
Justification: There are \( \binom{4}{2} = 6 \) equally likely ways to choose which two people get the treatment group, and each arrangement corresponds to one of these choices.
(ii) Man 1 and Man 2 are assigned to the same group in Arrangements A and D.
Probability = \( P(A) + P(D) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3} \).
\( \boxed{\frac{1}{3}} \)
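The chip-method result can be checked the same way by listing every order in which the chips might be drawn. This sketch assumes, as the solution above implies, four chips (two treatment, two control) drawn without replacement in the order Man 1, Man 2, Woman 1, Woman 2; the chip labels `T1`, `T2`, `C1`, `C2` are hypothetical and serve only to make the 24 draw orders distinguishable.

```python
from fractions import Fraction
from itertools import permutations

PEOPLE = ["Man 1", "Man 2", "Woman 1", "Woman 2"]  # assumed drawing order
CHIPS = ["T1", "T2", "C1", "C2"]                   # two treatment, two control chips

orderings = list(permutations(CHIPS))  # 24 equally likely draw orders
distribution = {}
for draw in orderings:
    # The people who drew a treatment chip form the treatment group.
    treatment = tuple(p for p, chip in zip(PEOPLE, draw) if chip.startswith("T"))
    distribution[treatment] = distribution.get(treatment, 0) + Fraction(1, len(orderings))

# Men share a group when both are in treatment (A) or both are in control (D).
p_men_together = (distribution[("Man 1", "Man 2")]
                  + distribution[("Woman 1", "Woman 2")])

for group, prob in distribution.items():
    print(group, prob)
print(p_men_together)
```

Each of the six treatment groups corresponds to 4 of the 24 draw orders, which is where the common value \( \frac{1}{6} \) comes from.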
(c)
The \( \boxed{\text{chip method}} \) should be used.
Justification: The chip method gives equal probability (\( \frac{1}{6} \)) to all six possible arrangements, so no arrangement is favored. The coin method gives higher probability (\( \frac{1}{4} \) each) to Arrangements A and D, in which the first two people assigned (Man 1 and Man 2) end up in the same group; the groups are then split entirely by gender with probability \( \frac{1}{2} \) rather than \( \frac{1}{3} \). If men and women tend to respond differently, such a split would confound the results, making it difficult to determine whether a difference between the groups is due to the treatment or to gender.
