Question 1
| Min | \( Q_1 \) | Median | \( Q_3 \) | Max | Mean | Std. Dev. |
|---|---|---|---|---|---|---|
| \( 2.10 \) | \( 4.39 \) | \( 5.43 \) | \( 6.12 \) | \( 13.45 \) | \( 5.54 \) | \( 1.64 \) |
(c) The researchers believe that streams with higher dissolved oxygen concentration are generally healthier for wildlife. Which streams are generally healthier for wildlife, those with water temperature colder than \( 8^\circ \mathrm{C} \) or those with water temperature warmer than \( 8^\circ \mathrm{C} \)? Using characteristics of the distribution of dissolved oxygen concentration for temperatures colder than \( 8^\circ \mathrm{C} \) and characteristics of the distribution of dissolved oxygen concentration for temperatures warmer than \( 8^\circ \mathrm{C} \), justify your answer.
Most-appropriate topic codes (CED):
• Topic 1.8 — Graphical Representations of Summary Statistics (five-number summary & boxplots) — part (b)
• Topic 1.9 — Comparing Distributions of a Quantitative Variable — part (c)
▶️ Answer/Explanation
(a)
From the histogram for streams colder than \( 8^\circ \mathrm{C} \):
• Shape: Unimodal and skewed left (longer tail toward lower values \( \approx 2\text{–}6 \,\mathrm{mg}/\ell \)).
• Center: Median between \( 11 \) and \( 12 \,\mathrm{mg}/\ell \) (tallest bars around \( 10\text{–}12 \)).
• Spread: Range \( \approx 14 - 2 = 12 \,\mathrm{mg}/\ell \). Quartiles appear near \( Q_1 \in (10,11) \) and \( Q_3 \in (12,13) \), so \( \mathrm{IQR} \approx 2 \,\mathrm{mg}/\ell \).
• Unusual features: Several potential low outliers in the \( 2\text{–}3 \), \( 4\text{–}5 \), and \( 5\text{–}6 \) bins because these lie below \( Q_1 - 1.5\,\mathrm{IQR} \approx 10 - 1.5(2) = 7 \,\mathrm{mg}/\ell \).
Therefore, the distribution is described as unimodal, skewed left, with median \( 11\text{–}12 \,\mathrm{mg}/\ell \), IQR \( \approx 2 \,\mathrm{mg}/\ell \), and possible low outliers.
(b)
Use the five-number summary to draw the boxplot for warmer than \( 8^\circ \mathrm{C} \):
• Minimum \( = 2.10 \), \( Q_1 = 4.39 \), Median \( = 5.43 \), \( Q_3 = 6.12 \), Maximum \( = 13.45 \).
• Compute \( \mathrm{IQR} = Q_3 - Q_1 = 6.12 - 4.39 = 1.73 \).
• Fences for outliers (not drawn, but for reference): lower \( Q_1 - 1.5\,\mathrm{IQR} = 4.39 - 1.5(1.73) = 1.795 \) (so \( 2.10 \) is not beyond the fence); upper \( Q_3 + 1.5\,\mathrm{IQR} = 6.12 + 1.5(1.73) = 8.715 \) (values above this would be flagged, but the instruction says not to indicate outliers).
• The box spans \( 4.39 \) to \( 6.12 \) with a median line at \( 5.43 \); whiskers extend to \( 2.10 \) (left) and \( 13.45 \) (right).
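The fence arithmetic can be checked with a short Python sketch. It uses only the five-number summary from the table; the variable names are illustrative.

```python
# Five-number summary for streams warmer than 8 °C (values from the table).
minimum, q1, median, q3, maximum = 2.10, 4.39, 5.43, 6.12, 13.45

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this would be flagged
upper_fence = q3 + 1.5 * iqr     # values above this would be flagged

print(f"IQR = {iqr:.2f}")
print(f"lower fence = {lower_fence:.3f}, upper fence = {upper_fence:.3f}")
print("minimum beyond lower fence:", minimum < lower_fence)
print("maximum beyond upper fence:", maximum > upper_fence)
```

Only the maximum lies beyond a fence, which is consistent with the long right whisker when outliers are not marked.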
(c)
If higher dissolved oxygen implies healthier streams, then colder streams are generally healthier because:
• Center comparison: Colder streams have a larger center (median between \( 11 \) and \( 12 \,\mathrm{mg}/\ell \)) than warmer streams (median \( 5.43 \,\mathrm{mg}/\ell \)).
• Shape: Colder distribution is skewed left (most values high with a few small ones), whereas the warmer distribution is right-skewed given the very long upper whisker to \( 13.45 \).
• Spread: Both show similar middle-spread (colder \( \mathrm{IQR} \approx 2 \,\mathrm{mg}/\ell \); warmer \( \mathrm{IQR} = 1.73 \,\mathrm{mg}/\ell \)), but this does not overturn the much higher center for colder streams.
Hence, using center (and acknowledging shape and spread), colder streams better satisfy the researchers’ criterion.
Question 2
• Experimental units
• Treatments
• Response variable
Most-appropriate topic codes (CED):
• TOPIC 3.6: Selecting an Experimental Design
• TOPIC 3.7: Inference and Experiments
▶️ Answer/Explanation
(a)
• Experimental units: The \(60\) individual driveways.
• Treatments: Concrete with fibers and concrete without fibers.
• Response variable: The severity of cracks recorded after one year on a scale of \(0\) to \(10\).
(b)
First, number each of the \(60\) driveways from \(1\) to \(60\). Then, use a random number generator to select \(30\) unique integers from \(1\) to \(60\). The driveways corresponding to these \(30\) numbers will be assigned to receive concrete with fibers. The remaining \(30\) driveways will receive concrete without fibers.
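The assignment procedure described above could be carried out, for example, with Python's standard `random` module. This is a minimal sketch: the seed value and variable names are assumptions chosen only to make the run reproducible.

```python
import random

driveways = list(range(1, 61))      # driveways labelled 1 to 60
rng = random.Random(2024)           # arbitrary seed (assumption), for reproducibility

# Draw 30 unique labels without replacement for the fiber treatment.
with_fibers = sorted(rng.sample(driveways, 30))
# Every remaining driveway receives concrete without fibers.
without_fibers = sorted(set(driveways) - set(with_fibers))

print("Concrete with fibers:   ", with_fibers)
print("Concrete without fibers:", without_fibers)
```

Sampling without replacement guarantees that the 30 selected numbers are unique, so no repeated draws need to be skipped.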
(c)
The benefit of randomly assigning the treatments is that it allows the developer to draw a cause-and-effect conclusion. Because the two types of concrete were randomly assigned to the driveways, the two treatment groups should be roughly balanced, at the start of the experiment, on other variables that could affect cracking (potential confounding variables). Therefore, if there is a statistically significant difference in the severity of cracks, the developer can conclude that the type of concrete (the treatment) caused the difference.
Question 3
| Cash prize, \( x \) | \( \$1 \) | \( \$5 \) | \( \$10 \) | \( \$20 \) | \( \$50 \) | \( \$100 \) |
|---|---|---|---|---|---|---|
| Probability of cash prize, \( P(X = x) \) | \( P(X = \$1) \) | \( 0.2 \) | \( 0.05 \) | \( 0.05 \) | \( 0.01 \) | \( 0.01 \) |
Most-appropriate topic codes (CED):
• TOPIC 4.8: Mean and Standard Deviation of Random Variables — parts (c), (d)
• TOPIC 4.5: Conditional Probability — part (b)
▶️ Answer/Explanation
(a)
(i) The sum of all probabilities must equal \( 1 \). Therefore:
\( P(X = \$1) = 1 - (0.2 + 0.05 + 0.05 + 0.01 + 0.01) = 1 - 0.32 = 0.68 \)
The proportion of bath fizzies that contain \( \$1 \) is \( \boxed{0.68} \).
(ii) The proportion of bath fizzies that contain at least \( \$10 \) is:
\( P(X \geq \$10) = P(X = \$10) + P(X = \$20) + P(X = \$50) + P(X = \$100) \)
\( P(X \geq \$10) = 0.05 + 0.05 + 0.01 + 0.01 = 0.12 \)
The proportion is \( \boxed{0.12} \).
(b) Using the conditional probability formula:
\( P(X = \$100 \mid X \geq \$10) = \frac{P(X = \$100 \text{ and } X \geq \$10)}{P(X \geq \$10)} = \frac{P(X = \$100)}{P(X \geq \$10)} \)
\( P(X = \$100 \mid X \geq \$10) = \frac{0.01}{0.12} = \frac{1}{12} \approx 0.0833 \)
The probability is \( \boxed{\frac{1}{12}} \) or approximately \( \boxed{0.0833} \).
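Parts (a) and (b) can be verified with exact fractions. The sketch below stores the table as a dictionary; the names `known` and `pmf` are illustrative.

```python
from fractions import Fraction

# Known probabilities from the table; P(X = 1) is the missing entry.
known = {5: Fraction(20, 100), 10: Fraction(5, 100), 20: Fraction(5, 100),
         50: Fraction(1, 100), 100: Fraction(1, 100)}

p_one = 1 - sum(known.values())                              # part (a)(i)
pmf = {1: p_one, **known}

p_at_least_10 = sum(p for x, p in pmf.items() if x >= 10)    # part (a)(ii)
p_100_given_at_least_10 = pmf[100] / p_at_least_10           # part (b)

print(float(p_one), float(p_at_least_10), p_100_given_at_least_10)
```

Using `Fraction` avoids floating-point rounding, so the conditional probability comes out as exactly 1/12.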
(c) The expected value is calculated as:
\( E(X) = \sum x \cdot P(X = x) \)
\( E(X) = 1(0.68) + 5(0.2) + 10(0.05) + 20(0.05) + 50(0.01) + 100(0.01) \)
\( E(X) = 0.68 + 1.00 + 0.50 + 1.00 + 0.50 + 1.00 = 4.68 \)
The expected value is \( \boxed{\$4.68} \).
Interpretation: If a very large number of bath fizzies are randomly selected, the average cash prize per bath fizzy will be approximately \( \$4.68 \).
(d) Convert the expected value from dollars to euros:
\( E(X)_{\text{euros}} = 4.68 \times 0.89 = 4.1652 \)
Rounded to two decimal places, the expected value in euros is \( \boxed{4.17} \) euros.
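The expected-value calculations in parts (c) and (d) can be reproduced in a few lines. As a sketch, the code also converts each prize to euros before averaging, to illustrate that \( E(aX) = aE(X) \); the variable names are illustrative.

```python
# Probability distribution of the cash prize, in dollars.
pmf = {1: 0.68, 5: 0.20, 10: 0.05, 20: 0.05, 50: 0.01, 100: 0.01}

expected_dollars = sum(x * p for x, p in pmf.items())

euros_per_dollar = 0.89                     # conversion rate used in part (d)
expected_euros = euros_per_dollar * expected_dollars

# Same result if every prize is converted to euros first.
expected_euros_direct = sum(euros_per_dollar * x * p for x, p in pmf.items())

print(round(expected_dollars, 2), round(expected_euros, 2))
```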
Question 4
| | \( n \) | Mean | Standard Deviation |
|---|---|---|---|
| Placebo | \( 19 \) | \( 5.421 \) | \( 2.987 \) |
| Omega-3 | \( 19 \) | \( 3.632 \) | \( 1.739 \) |
| Difference (placebo minus omega-3) | \( 19 \) | \( 1.789 \) | \( 2.485 \) |
Most-appropriate topic codes (CED):
• TOPIC 7.4: Setting Up a Test for a Population Mean — section 1
• TOPIC 7.7: Justifying a Claim About a Population Mean Based on a Confidence Interval — alternative approach
▶️ Answer/Explanation
State: We will conduct a paired t-test for a population mean difference.
Let \( \mu_d \) = true mean difference (placebo minus omega-3) in irritability scores for all patients with this medical condition.
\( H_0: \mu_d = 0 \)
\( H_a: \mu_d > 0 \)
\( \alpha = 0.05 \)
Plan: We verify the conditions:
• Random: Treatments were randomly assigned to weeks for each patient.
• 10% Condition: Not needed since this is an experiment.
• Normal/Large Sample: The sample size (\( n = 19 \)) is less than 30, but the boxplot of differences shows an approximately symmetric distribution with no outliers, so the sampling distribution of \( \bar{x}_d \) should be approximately normal.
Do: Test statistic:
\( t = \frac{\bar{x}_d - \mu_0}{s_d/\sqrt{n}} = \frac{1.789 - 0}{2.485/\sqrt{19}} \approx \frac{1.789}{0.570} \approx 3.138 \)
Degrees of freedom: \( df = 19 - 1 = 18 \)
p-value: \( P(t > 3.138) \approx 0.0028 \)
Conclude: Since p-value \( = 0.0028 < \alpha = 0.05 \), we reject \( H_0 \). There is convincing statistical evidence that the true mean difference (placebo minus omega-3) in irritability scores for all patients with this medical condition is greater than zero. This supports the researcher’s claim that the omega-3 supplement decreases the mean irritability score.
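The test statistic and p-value can be checked from the summary statistics alone. The sketch below uses only the Python standard library: `t_pdf` and `t_upper_tail` are hypothetical helper names, and the tail area is approximated by Simpson's rule rather than taken from a table. With SciPy installed, `scipy.stats.t.sf(t_stat, df)` should give the same tail probability.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3         # half the area lies above 0 by symmetry

# Summary statistics for the differences (placebo minus omega-3).
n, mean_diff, sd_diff = 19, 1.789, 2.485

standard_error = sd_diff / math.sqrt(n)
t_stat = (mean_diff - 0) / standard_error
df = n - 1
p_value = t_upper_tail(t_stat, df)     # one-sided, since H_a: mu_d > 0

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```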
Question 5
predicted weight \( = -350.3 + 3.7455(\text{chest circumference})\)
(i) Using the equation of the least-squares regression line, calculate the predicted weight for this male tule elk. Show your work.
(ii) Calculate the residual for this male tule elk. Show your work.
predicted weight \( = -350.3 + 3.7455(\text{chest circumference})\)
\[
\begin{aligned}
H_0 &: \beta = 4.5 \\
H_a &: \beta \ne 4.5
\end{aligned}
\]
The test statistic was calculated to be \(3.408\). Assume all conditions for inference were met.
(i) Determine the p-value of the test.
(ii) At a significance level of \(\alpha=0.05\), what conclusion should the wildlife biologist make regarding the slope of the population regression line for male tule elk? Justify your response.
Most-appropriate topic codes (CED):
• TOPIC 2.7: Residuals
• TOPIC 2.8: Least Squares Regression
• TOPIC 9.5: Carrying Out a Test for the Slope of a Regression Model
▶️ Answer/Explanation
(a)
There is a strong, positive, and roughly linear relationship between the chest circumference and weight of male tule elk. There are no obvious outliers or influential points that deviate from the linear pattern.
(b)
(i) Predicted weight \( = -350.3 + 3.7455(145.9) \approx -350.3 + 546.47 \approx 196.17\) kg.
\(\boxed{\text{Predicted weight} \approx 196.17 \text{ kg}}\)
(ii) Residual = Actual – Predicted
Residual \( = 204.3 - 196.17 = 8.13\) kg.
\(\boxed{\text{Residual} \approx 8.13 \text{ kg}}\)
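The prediction and residual can be reproduced directly from the regression equation. This sketch uses the chest circumference (145.9 cm) and actual weight (204.3 kg) that appear in the worked answer; the variable names are illustrative.

```python
# Least-squares line: predicted weight = -350.3 + 3.7455 * (chest circumference)
intercept, slope = -350.3, 3.7455

chest_cm, actual_kg = 145.9, 204.3          # values used in the worked answer

predicted_kg = intercept + slope * chest_cm
residual_kg = actual_kg - predicted_kg      # residual = actual - predicted

print(f"predicted = {predicted_kg:.2f} kg, residual = {residual_kg:.2f} kg")
```

A positive residual means the line underpredicts this elk's weight.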
(c)
For each additional centimeter of chest circumference, the predicted weight of a male tule elk increases by approximately \(3.7455\) kilograms.
(d)
(i) We need the p-value for a t-test statistic of \(3.408\) with \(df = n-2 = 30-2 = 28\) degrees of freedom. Since the alternative hypothesis is two-sided (\(H_a: \beta \ne 4.5\)), the p-value is \(2 \times P(t_{28} > 3.408)\).
Using a t-table or calculator, \(P(t_{28} > 3.408) \approx 0.001\), so the p-value is approximately \(2(0.001) = 0.002\).
\(\boxed{\text{p-value} \approx 0.002}\)
(ii) Because the p-value (\(\approx 0.002\)) is less than the significance level (\(\alpha=0.05\)), the wildlife biologist should reject the null hypothesis. There is convincing statistical evidence that the slope of the population regression line for male tule elk is different from \(4.5\) kg/cm.
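The two-sided p-value for the slope test can be approximated without a table. As a sketch using only the standard library, the hypothetical helpers `t_pdf` and `t_upper_tail` below integrate the t density numerically; the doubling step reflects the two-sided alternative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_stat, n = 3.408, 30
df = n - 2                                   # slope test uses n - 2
p_value = 2 * t_upper_tail(abs(t_stat), df)  # two-sided alternative
alpha = 0.05

print(f"p-value = {p_value:.4f}; reject H0: {p_value < alpha}")
```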
Section II
Part B
Question 6
(i) Calculate the probability that the sample mean amount of gold applied to a random sample of \(n=2\) necklaces will be greater than \(303\) mg.
(ii) Suppose Cleo took a random sample of \(n=2\) necklaces that resulted in a sample mean amount of gold applied of \(303\) mg. Would that result indicate that the population mean amount of gold being applied by the machine is different from \(300\) mg? Justify your answer without performing an inference procedure.
(i) Describe the sampling distribution of the sample range for random samples of size \(n=2\) from a normal distribution with standard deviation \(\sigma=5\), as shown in Graph I.
(ii) Describe how the sampling distribution of the sample range for samples of size \(n=2\) changes as the value of the population standard deviation increases.
(i) Consider Cleo’s range of \(10\) mg from the sample of size \(n=2\). If the machine is working properly with a standard deviation of \(5\) mg, is a sample range of \(10\) mg unusual? Justify your answer.
(ii) Do Cleo’s sample mean of \(303\) mg and range of \(10\) mg indicate that the machine is not working properly? Explain your answer.
Most-appropriate topic codes (CED):
• TOPIC 5.7 — Sampling Distributions for Sample Means: (b)(i), (b)(ii), (d)(ii)
• TOPIC 5.1 — Introducing Statistics: Why Is My Sample Not Like Yours?: (c)(i), (c)(ii), (d)(i), (d)(ii)
▶️ Answer/Explanation
(a)
We are looking for \(P(296 < X < 304)\) for a normal distribution with \(\mu=300\) and \(\sigma=5\).
• Z-score for \(296\): \(z = \frac{296-300}{5} = -0.8\)
• Z-score for \(304\): \(z = \frac{304-300}{5} = 0.8\)
\(P(-0.8 < Z < 0.8) = P(Z < 0.8) - P(Z < -0.8) \approx 0.7881 - 0.2119 = 0.5762\).
\(\boxed{P \approx 0.576}\)
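The same probability can be obtained without a z-table using `statistics.NormalDist` from the Python standard library. This is a sketch; the name `fill` is illustrative.

```python
from statistics import NormalDist

# Amount of gold applied per necklace (mg): normal, mean 300, sd 5.
fill = NormalDist(mu=300, sigma=5)

p_between = fill.cdf(304) - fill.cdf(296)   # P(296 < X < 304)
print(f"P(296 < X < 304) = {p_between:.4f}")
```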
(b)
(i) The sampling distribution of \(\bar{x}\) for \(n=2\) is normal with \(\mu_{\bar{x}}=300\) and \(\sigma_{\bar{x}} = \frac{5}{\sqrt{2}} \approx 3.536\).
We need \(P(\bar{x} > 303)\).
\(z = \frac{303-300}{3.536} \approx 0.848\).
\(P(Z > 0.848) \approx 0.198\).
(ii) No, this result would not provide convincing evidence. A sample mean of \(303\) mg is not unusual because the probability of observing a sample mean this far or farther from \(300\) mg (\(P(\bar{x} \ge 303)\) or \(P(\bar{x} \le 297)\)) is large (\(2 \times 0.198 = 0.396\)).
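Both numbers in part (b) follow from the sampling distribution of the sample mean, which can be sketched with the standard library. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 300, 5, 2

# Sampling distribution of the sample mean: normal, mean mu, sd sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_above_303 = 1 - xbar_dist.cdf(303)            # part (b)(i)
p_this_far_or_farther = 2 * p_above_303         # both tails, used in (b)(ii)

print(f"sd of sample mean = {xbar_dist.stdev:.3f}")
print(f"P(mean > 303) = {p_above_303:.3f}")
print(f"P(mean at least 3 mg from 300) = {p_this_far_or_farther:.3f}")
```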
(c)
(i) The sampling distribution of the sample range shown in Graph I is skewed to the right. The center is approximately \(6\) mg, and the values are spread from \(0\) mg to about \(25\) mg.
(ii) As the population standard deviation (\(\sigma\)) increases, the sampling distribution of the sample range becomes more spread out and its center (mean) increases.
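A small simulation illustrates how the sample range behaves as \(\sigma\) grows. This is a sketch under stated assumptions: the values \(\sigma = 10\) and \(\sigma = 15\), the number of repetitions, and the seed are illustrative choices, and `simulate_ranges` is a hypothetical helper.

```python
import random
from statistics import mean, stdev

def simulate_ranges(sigma, reps=20_000, mu=300, seed=1):
    """Simulate the sample range (max - min) for samples of size n = 2."""
    rng = random.Random(seed)
    # For n = 2 the range is simply the absolute difference of the two draws.
    return [abs(rng.gauss(mu, sigma) - rng.gauss(mu, sigma)) for _ in range(reps)]

summary = {}
for sigma in (5, 10, 15):
    ranges = simulate_ranges(sigma)
    summary[sigma] = (mean(ranges), stdev(ranges))
    print(f"sigma = {sigma:2d}: mean range = {summary[sigma][0]:.2f}, "
          f"sd of range = {summary[sigma][1]:.2f}")
```

In the simulated output, both the mean and the standard deviation of the sample range grow with \(\sigma\), in line with the description above.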
(d)
(i) No, a sample range of \(10\) mg is not unusual. According to Graph I (\(\sigma=5\)), there is a notable proportion of the distribution at or above a sample range of \(10\) mg (approximately \(20\%\)), so this value occurs frequently by chance.
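The proportion read from Graph I can be cross-checked with a direct calculation. For \(n=2\) the range is \(|X_1 - X_2|\), and for independent normal draws the difference \(X_1 - X_2\) is normal with mean \(0\) and standard deviation \(\sigma\sqrt{2}\). The sketch below relies on that fact; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = 5

# Difference of two independent N(300, sigma) draws: N(0, sigma * sqrt(2)).
diff = NormalDist(0, sigma * sqrt(2))

# The range is |X1 - X2|, so add both tails of the difference.
p_range_at_least_10 = 2 * (1 - diff.cdf(10))

print(f"P(range >= 10) = {p_range_at_least_10:.3f}")
```

The calculation gives roughly 16%, somewhat below the graph-based estimate but still far too common to call a range of \(10\) mg unusual.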
(ii) No, Cleo’s results do not indicate the machine is not working properly. As shown in part (b)(ii), a sample mean of \(303\) mg is not unusual. As shown in part (d)(i), a sample range of \(10\) mg is also not unusual. Since neither the sample mean nor the sample range is an unusual result, there is no convincing evidence that the machine is not working properly.
