AP Statistics 2025 FRQ Question and Answer

Question 1

The manager of an automotive company is interested in comparing the gas mileages for cars manufactured in Country A and cars manufactured in Country B. The manager selected a random sample of 100 cars manufactured in Country A and a random sample of 100 cars manufactured in Country B. The gas mileages for each sample, in miles per gallon (mpg), are summarized in the boxplots.

(A) Compare the distributions of gas mileage for the sample of cars manufactured in Country A and the sample of cars manufactured in Country B.

(B) For the distribution of gas mileage for the sample of cars manufactured in Country A, would you expect the mean to be greater than 18 mpg, less than 18 mpg, or equal to 18 mpg? Justify your answer.

(C) The manager will create a new boxplot with the combined data from the sample of cars manufactured in Country A and the sample of cars manufactured in Country B.
i. What is the range of the combined data set? Justify your answer.
ii. What is a possible value of the median of the combined data set? Justify your answer by referencing the boxplots shown.

Most-appropriate topic codes (CED):

• TOPIC 1.9: Comparing Distributions — part (a)
• TOPIC 1.10: Mean vs Median (Skewness) — part (b)

▶️ Answer/Explanation

Solution

(A)
• Center: Median of B (~32 mpg) > Median of A (18 mpg).
• Spread: Range of A (24 mpg) > Range of B (22 mpg). IQR of B > IQR of A.
• Shape/Outliers: Country A is right-skewed with a high outlier. Country B is roughly symmetric.

(B)
Greater than 18 mpg.
The distribution for Country A is skewed to the right. In a right-skewed distribution, the mean is pulled toward the tail, making it greater than the median (18).

(C)(i)
26 mpg.
Range = Combined Max – Combined Min.
Max is from Country B (40). Min is from Country A (14).
\(40 - 14 = 26\) mpg.

(C)(ii)
24 mpg.
The combined sample has 200 cars, so the median is the average of the 100th and 101st ordered values.
Q3 of Country A is 24, so about 75 of its 100 values are \(\le 24\). Q1 of Country B is 24, so about 25 of its 100 values are \(\le 24\).
Thus about 100 of the 200 values are \(\le 24\) and about 100 are \(\ge 24\), so 24 is a possible value of the median.

Question 2

Aphids are tiny insects that feed on plants such as cabbage plants. A farmer wants to reduce the number of aphids in a cabbage field. A river is located 100 meters south of the cabbage field. The farmer divides the field into 25 regions of equal size, as shown in the diagram. Each region has approximately the same number of cabbage plants.

The farmer would like to estimate the proportion of cabbage plants in the field that are affected by aphids and believes that the extent of aphid damage is greater for the regions in the cabbage field closer to the river. To obtain the estimate, the farmer is considering three sampling methods.

Sampling method I: Select region 3, which is closest to the farmer’s house and farthest from the river. Examine every cabbage plant in the region for aphid damage.

Sampling method II: Randomly select one row (A, B, C, D, or E). For every region in the selected row, examine every cabbage plant for aphid damage.

Sampling method III: Randomly select one region from each of rows A, B, C, D, and E. For each selected region, examine every cabbage plant for aphid damage.

(A) Explain whether sampling method I is an appropriate sampling method for the farmer to use to estimate the proportion of cabbage plants in the field that are damaged by aphids.

(B) Using sampling method II, the farmer randomly selected row E and examined every cabbage plant in row E. If the farmer’s belief is correct, determine whether the selection of row E is likely to provide an overestimate or an underestimate of the proportion of cabbage plants in the field that are damaged by aphids. Justify your answer.

(C) Using the information provided in the diagram of the cabbage field, describe how to implement sampling method III, which requires a random selection of one region from each of rows A, B, C, D, and E.

Most-appropriate topic codes (CED):

• TOPIC 3.2: Sampling Methods — part (a), (c)
• TOPIC 3.3: Potential Sources of Bias — part (b)

▶️ Answer/Explanation

Solution

(A)
Not appropriate.
It is a convenience sample. Region 3 is farthest from the river. If damage is related to the river, this region will likely have less damage than the field average, leading to an underestimate.

(B)
Overestimate.
Row E is closest to the river. If the farmer’s belief is correct (damage is greater near the river), the proportion of damaged plants in Row E will be higher than the true population proportion.

(C)
Stratified Sampling Procedure:
1. Label regions 1-5 in Row A.
2. Randomly select one number (1-5) using a random number generator.
3. Repeat independently for Rows B, C, D, and E (selecting one region from each row).
4. Examine all plants in the 5 selected regions.
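
The procedure above can be sketched in Python. This is only an illustration: it assumes each row contains 5 regions labeled 1-5 (consistent with 25 regions in 5 rows), and the function name `select_regions` and the seed value are hypothetical choices, not part of the original problem.

```python
import random

ROWS = ["A", "B", "C", "D", "E"]
REGIONS_PER_ROW = 5  # assumption: each row holds 5 regions, labeled 1-5


def select_regions(rng):
    """Pick one region label (1-5) independently for each row (stratum)."""
    return {row: rng.randint(1, REGIONS_PER_ROW) for row in ROWS}


rng = random.Random(2025)  # fixed seed only so the sketch is reproducible
selected = select_regions(rng)
for row, region in selected.items():
    print(f"Row {row}: examine every plant in region {region}")
```

Because every row contributes exactly one region, the sample always includes regions at each distance from the river, which is the point of stratifying by row.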

Question 3

Ms. Fey is a manager at a restaurant. To improve the dining experience for her customers, she uses a digital music service to create a playlist of songs that will be played in the restaurant. The playlist contains 1,000 songs and consists of four different types of music in the following quantities: 200 country songs, 400 pop songs, 100 rock songs, and 300 jazz songs. The digital music service will select songs at random from the playlist to be played in the restaurant. Any song can be replayed at any time.

(A)

i. Suppose one song is selected at random to be played. What is the probability that the song is a rock song? Show your work.
ii. Suppose two songs are selected at random to be played. What is the probability that both songs are rock songs? Show your work.

(B) In every one-hour period, 20 songs will be played at random and any song can be replayed at any time. Ms. Fey is interested in how many rock songs will be played in a typical one-hour period.
i. Define the random variable of interest to Ms. Fey, and state how the random variable is distributed.
ii. What is the expected value for the random variable in part B (i)? Show your work.

(C) Recall that in every one-hour period, 20 songs will be played at random and any song can be replayed at any time.
i. Determine the probability that 4 or more rock songs in a particular one-hour period will be played. Show your work.
ii. Suppose 4 rock songs are played during a particular one-hour period. Does this provide strong evidence that the song selection process was not truly random? Justify your answer without performing an inference procedure.

Most-appropriate topic codes (CED):

• TOPIC 4.3: Probability Rules — part (a)
• TOPIC 4.10: Binomial Distribution — part (b)

▶️ Answer/Explanation

Concise solution

(A)(i)
\(P(\text{Rock}) = \frac{100}{1000} = 0.10\).

(A)(ii)
\(P(\text{Both Rock}) = 0.10 \times 0.10 = 0.01\) (the selections are independent because any song can be replayed).

(B)(i)
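Let \(X\) = the number of rock songs among the 20 songs played in a one-hour period.
Distribution: Binomial with \(n=20\), \(p=0.10\), since each song is selected independently (songs can be replayed) with the same probability of being a rock song.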

(B)(ii)
\(E(X) = np = 20(0.10) = 2\) rock songs.

(C)(i)
\(P(X \ge 4) = 1 - P(X \le 3)\)
\(P(X \ge 4) = 1 - 0.867 = 0.133\).
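
As a check on the binomial values above, here is a minimal Python sketch that computes \(E(X)\), \(P(X \le 3)\), and \(P(X \ge 4)\) directly from the binomial formula. The helper name `binom_pmf` is an illustrative choice; on the exam, a calculator's binomial CDF function would normally be used instead.

```python
from math import comb

n, p = 20, 0.10  # 20 songs per hour, P(rock) = 100/1000


def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


expected_rock = n * p                                    # E(X) = np
p_at_most_3 = sum(binom_pmf(k, n, p) for k in range(4))  # P(X <= 3)
p_at_least_4 = 1 - p_at_most_3                           # complement rule

print(f"E(X) = {expected_rock:.1f}")
print(f"P(X <= 3) = {p_at_most_3:.3f}")
print(f"P(X >= 4) = {p_at_least_4:.3f}")
```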

(C)(ii)
No.
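\(P(X \ge 4) \approx 0.133\), so 4 or more rock songs would occur in about 13.3% of one-hour periods under truly random selection. This probability is not small (it is well above 5%), so playing 4 rock songs is not strong evidence against randomness.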

Question 4

A software application (app) lets users enter questions to receive answers in the form of images, texts, or videos. Research indicates that 22 percent of high school students in Country W use the app to help them with their homework at least once per week. Karen is an AP Statistics student in Country W at a high school that has more than 2,000 students. She believes the proportion of all students at her school who use the app to help them with their homework at least once per week is greater than the proportion for her country. To investigate her belief, she took a simple random sample of 130 students from her school and found that 38 of the sampled students use the app to help them with their homework at least once per week.

Is there convincing statistical evidence, at a 0.05 significance level, to support Karen’s belief? Justify your answer with the appropriate inference procedure.

Most-appropriate topic codes (CED):

• TOPIC 6.4: One-Sample z-Test for a Population Proportion

▶️ Answer/Explanation

Concise solution

State: We will test \(H_0: p = 0.22\) versus \(H_a: p > 0.22\) at \(\alpha = 0.05\), where \(p\) is the true proportion of students at Karen’s school who use the app weekly.

Plan: One-sample z-test for proportion.
Conditions:
• Random: Simple random sample of 130 students (stated).
• 10% Condition: \(n = 130\) is less than 10% of the more than 2,000 students at the school, since \(0.10(2000) = 200\) (satisfied).
• Large Counts: \(np_0 = 130(0.22) = 28.6 \ge 10\) and \(n(1-p_0) = 130(0.78) = 101.4 \ge 10\) (satisfied).

Do:
Sample proportion \(\hat{p} = \frac{38}{130} \approx 0.2923\).
Test statistic \(z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.2923 - 0.22}{\sqrt{\frac{0.22(0.78)}{130}}} \approx \frac{0.0723}{0.0363} \approx 1.99\).
P-value \(P(Z > 1.99) \approx 0.0233\).
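
The test statistic and p-value above can be reproduced with a short Python sketch using only the standard library. The variable names are illustrative assumptions, and the p-value is computed from the unrounded \(z\), so it may differ from a table lookup in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.22        # hypothesized proportion under H0
n = 130          # sample size
successes = 38   # sampled students who use the app at least once per week

p_hat = successes / n
standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0, not p_hat, under H0
z = (p_hat - p0) / standard_error
p_value = 1 - NormalDist().cdf(z)         # upper tail, since Ha: p > 0.22

print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
```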

Conclude:
Since the p-value (\(0.0233\)) is less than \(\alpha\) (\(0.05\)), we reject \(H_0\). There is convincing statistical evidence to support Karen’s belief that the proportion of students at her school who use the app is greater than \(0.22\).

Question 5

According to a 2017 national survey in Country B, the mean number of bedrooms in newly built houses was 2.9. Rodney, a researcher, believes the mean number of bedrooms in newly built houses in the country was different in 2024 than it was in 2017. To investigate his belief, he took a large random sample of newly built houses in Country B in 2024 and recorded the number of bedrooms in each house. The distribution of the number of bedrooms for the sampled houses is summarized in the table.

Number of Bedrooms	1	2	3	4	5	6
Proportion of Houses	0.12	0.22	0.28	0.22	0.14	0.02

(A)
i. A house from the sample will be selected at random. What is the probability that the house had fewer than 3 bedrooms? Show your work.
ii. What is the mean number of bedrooms for the sample of newly built houses in 2024? Show your work.

(B) Rodney will use a one-sample t-test for a population mean to test his belief.
i. In the context of Rodney’s investigation, state the hypotheses for the test.
ii. Explain, in context, what a Type I error would be for Rodney’s hypothesis test.

(C) A different researcher, Keisha, suggests using a confidence interval to investigate whether the mean number of bedrooms in newly built houses in 2024 in Country B was different from 2.9. Assume the conditions for inference have been met. Using Rodney’s data, Keisha calculated a one-sample 97 percent confidence interval to estimate the population mean as (3.01, 3.19). Based on the confidence interval, what conclusion can be made for Rodney’s hypothesis test in part B at \(\alpha=0.03\)? Justify your answer.

Most-appropriate topic codes (CED):

• TOPIC 4.2: Probability Rules — part (a)
• TOPIC 7.2: One-Sample t-Test for Mean — part (b)
• TOPIC 7.4: Confidence Intervals for Mean — part (c)

▶️ Answer/Explanation

Concise solution

(A)(i)
\(P(\text{< 3}) = P(1) + P(2) = 0.12 + 0.22 = 0.34\).

(A)(ii)
Mean \(\bar{x} = \sum x_i p_i\)
\(= 1(0.12) + 2(0.22) + 3(0.28) + 4(0.22) + 5(0.14) + 6(0.02)\)
\(= 0.12 + 0.44 + 0.84 + 0.88 + 0.70 + 0.12 = 3.10\).
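
The two calculations in part (A) can be checked with a minimal Python sketch that treats the table as a list of values and their proportions. The variable names are illustrative.

```python
bedrooms = [1, 2, 3, 4, 5, 6]
proportions = [0.12, 0.22, 0.28, 0.22, 0.14, 0.02]

# P(fewer than 3 bedrooms): add the proportions for 1 and 2 bedrooms
p_fewer_than_3 = sum(p for x, p in zip(bedrooms, proportions) if x < 3)

# Sample mean: weight each bedroom count by its proportion
sample_mean = sum(x * p for x, p in zip(bedrooms, proportions))

print(f"P(fewer than 3) = {p_fewer_than_3:.2f}")
print(f"mean = {sample_mean:.2f}")
```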

(B)(i)
\(H_0: \mu = 2.9\) (Mean is 2.9).
\(H_a: \mu \ne 2.9\) (Mean is different from 2.9).

(B)(ii)
Type I error: Concluding that the mean number of bedrooms in 2024 is different from 2.9 when, in reality, it is still 2.9.

(C)
Reject \(H_0\).
The 97% confidence interval is \((3.01, 3.19)\). Since the null value \(2.9\) is not included in the interval, there is convincing evidence at \(\alpha = 0.03\) (\(1 - 0.97\)) that the mean is different from 2.9.

Question 6

Stefan, a psychologist, conducted a study to investigate the effect of time of day on reading comprehension in children. One hundred children volunteered, with their parents’ consent, to participate in the study. Fifty of the children were randomly assigned to read a story at 9 a.m. and then answer 25 questions about it. The remaining 50 children were assigned to read the same story at 3 p.m. and answer the same 25 questions. The reading comprehension for each child was measured by a reading score, which was determined by the number of questions that were answered correctly about the story. Stefan is interested in comparing the mean reading scores for the two times of day. Table 1 shows the results of Stefan’s study.

Table 1: Summary Statistics of Reading Scores

	n	Mean	Standard Deviation
9 a.m.	50	15.2	4.12
3 p.m.	50	17.9	4.43

Stefan found the conditions for inference were met and conducted a two-sample t-test for the difference in two population means. Let \(\mu_{AM}\) represent the mean reading score for all children, similar to those in the study, who would read the story at 9 a.m. Let \(\mu_{PM}\) represent the mean reading score for all children, similar to those in the study, who would read the story at 3 p.m. Stefan’s hypotheses are as shown:
\(H_0: \mu_{AM} = \mu_{PM}\)
\(H_a: \mu_{AM} \ne \mu_{PM}\)

(A) The p-value for Stefan’s hypothesis test was 0.002. State an appropriate conclusion, at the 5 percent significance level, for Stefan’s test in the context of the investigation. Justify your answer.

(B) Explain why it was appropriate for Stefan to conduct a two-sample t-test for the difference in two population means instead of a paired t-test for the population mean difference.

(C) Researchers are usually interested in the practical importance of their results as well as the statistical significance of the hypothesis test. The practical importance of the results indicates whether the observed results are meaningful in real life… One indicator of practical importance is effect size. A common method for measuring effect size for the difference in two group means is Cohen’s d coefficient… calculated using \(d = \frac{|\bar{x}_1 - \bar{x}_2|}{s_p}\). When the sizes of the groups are equal, \(s_p\) is calculated as \(s_p = \sqrt{\frac{s_1^2 + s_2^2}{2}}\)…

i. Calculate Cohen’s d coefficient for Stefan’s study. Show your work.
ii. Higher values of Cohen’s d indicate greater practical importance and lower values of Cohen’s d indicate less practical importance. Typically, we use the intervals listed in Table 2 to help interpret practical importance.

Table 2: Guidelines for Interpreting Cohen’s d Coefficient

Cohen’s d Coefficient	Practical Importance
\(0 \le d \le 0.20\)	Not very meaningful in real life
\(0.20 < d < 0.80\)	Somewhat meaningful in real life
\(d \ge 0.80\)	Very meaningful in real life

Based on your answer to part C (i) and the information in Tables 1 and 2, describe the practical importance of Stefan’s results, in context.

(D) Suppose the results of Stefan’s study, summarized in Table 1, instead had a standard deviation for the 9 a.m. reading scores, \(s_1\), and a standard deviation for the 3 p.m. reading scores, \(s_2\), that were both greater than 4.43. Assume the group sample sizes and the means are not changed.

i. Would the Cohen’s d coefficient in this new situation be smaller than, larger than, or the same as the Cohen’s d coefficient calculated in part C (i)? Explain your answer.
ii. Does the Cohen’s d coefficient described in part D (i) indicate that Stefan’s observed difference in the means in the new situation would have more practical importance than, less practical importance than, or the same practical importance as what was originally determined in part C (ii)? Explain your answer.

Most-appropriate topic codes (CED):

• TOPIC 7.6: Two-Sample t-Test for Means — part (a), (b)
• TOPIC 1.1: Analyzing Quantitative Data — part (c), (d)

▶️ Answer/Explanation

Concise solution

(A)
Reject \(H_0\).
The p-value \((0.002) < \alpha (0.05)\). There is convincing evidence that the mean reading score for children reading at 9 a.m. is different from the mean score for children reading at 3 p.m.

(B)
Independent groups.
A two-sample t-test is used because the data comes from two independent groups of children (randomly assigned). A paired t-test requires matched pairs or the same subject measured twice.

(C)(i)
\(s_p = \sqrt{\frac{4.12^2 + 4.43^2}{2}} \approx \sqrt{18.297} \approx 4.278\).
\(d = \frac{|15.2 - 17.9|}{4.278} = \frac{2.7}{4.278} \approx 0.63\).
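
The calculation above can be reproduced with a short Python sketch of the equal-group-size formula from the question. The function name `cohens_d` and its parameter names are illustrative assumptions.

```python
from math import sqrt


def cohens_d(mean_1, mean_2, s_1, s_2):
    """Cohen's d for two groups of equal size, as defined in the question."""
    s_p = sqrt((s_1**2 + s_2**2) / 2)  # pooled SD when group sizes are equal
    return abs(mean_1 - mean_2) / s_p


# Table 1: 9 a.m. (mean 15.2, SD 4.12) and 3 p.m. (mean 17.9, SD 4.43)
d = cohens_d(15.2, 17.9, 4.12, 4.43)
print(f"d = {d:.2f}")
```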

(C)(ii)
Somewhat meaningful.
Since \(d \approx 0.63\) is between \(0.20\) and \(0.80\), the result is considered somewhat meaningful in real life.

(D)(i)
Smaller.
Increasing \(s_1\) and \(s_2\) increases the denominator \((s_p)\). With the same numerator (difference in means), a larger denominator results in a smaller \(d\).
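
To make the direction of the change concrete, the sketch below recomputes \(d\) with standard deviations that are both greater than 4.43. The values 5.0 and 5.5 are hypothetical, chosen only for illustration, and the function name `cohens_d` is likewise an assumption.

```python
from math import sqrt


def cohens_d(mean_1, mean_2, s_1, s_2):
    """Cohen's d for two groups of equal size."""
    s_p = sqrt((s_1**2 + s_2**2) / 2)
    return abs(mean_1 - mean_2) / s_p


d_original = cohens_d(15.2, 17.9, 4.12, 4.43)  # values from Table 1
d_new = cohens_d(15.2, 17.9, 5.0, 5.5)         # hypothetical SDs, both > 4.43

print(f"original d = {d_original:.3f}, new d = {d_new:.3f}")
print("new d is smaller:", d_new < d_original)
```

Any pair of standard deviations above 4.43 gives a pooled standard deviation above 4.43, which exceeds the original \(s_p \approx 4.278\), so the conclusion does not depend on the particular values chosen here.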

(D)(ii)
Less practical importance.
A smaller \(d\) moves closer to 0, indicating less practical importance according to Table 2.
