AP Statistics 9.4 Setting Up a Test for the Slope of a Regression Model Study Notes
LEARNING OBJECTIVE
- The t-distribution may be used to model variation.
Key Concepts:
- Selecting an Appropriate Testing Method for the Slope (\(\beta\)) of a Regression Model
- Null and Alternative Hypotheses for the Slope (\(\beta\)) of a Regression Model
- Verifying Conditions for a Significance Test for the Slope (\(\beta\))
Selecting an Appropriate Testing Method for the Slope (\(\beta\)) of a Regression Model
To determine whether there is a statistically significant linear relationship between an explanatory variable (\(x\)) and a response variable (\(y\)), we test hypotheses about the population slope \(\beta\) using a t-test for the slope.
Test statistic:
The t-statistic for testing the slope is calculated as:
\( t = \dfrac{b - \beta_0}{SE_b} \)
- \(b\) = sample slope from regression line \(\hat{y} = a + bx\)
- \(\beta_0\) = hypothesized population slope under \(H_0\) (often 0)
- \(SE_b\) = standard error of the slope
Distribution:
- Under the null hypothesis, the test statistic follows a t-distribution with \(df = n - 2\).
Interpretation: The t-test allows us to assess whether the observed sample slope provides statistically significant evidence of a linear relationship in the population.
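The formula and degrees of freedom above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names `slope_t_statistic` and `slope_df` are made up for this sketch, and the numbers plugged in are those of the exercise and heart-rate example that follows.

```python
def slope_t_statistic(b, beta0, se_b):
    """t statistic for the slope: (b - beta0) / SE_b."""
    if se_b <= 0:
        raise ValueError("standard error must be positive")
    return (b - beta0) / se_b


def slope_df(n):
    """Degrees of freedom for the slope test: n - 2."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    return n - 2


# Values from the heart-rate example: b = -1.8, SE_b = 0.6, n = 15, H0: beta = 0.
t_stat = slope_t_statistic(b=-1.8, beta0=0.0, se_b=0.6)
df = slope_df(15)
print(round(t_stat, 2), df)
```

The statistic is rounded only for display; floating-point division may leave a tiny rounding error in the last digits.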
Example
A researcher wants to determine whether hours of weekly exercise (\(x\)) affect resting heart rate (\(y\)) in adults. A sample of 15 adults yields a regression line \(\hat{y} = 80 - 1.8x\) with a standard error of the slope \(SE_b = 0.6\).
Which of the following is the correct t-test statistic for testing \(H_0: \beta = 0\) against \(H_a: \beta \ne 0\)?
- A. \( t = \dfrac{-1.8}{0.6} = -3.0 \)
- B. \( t = \dfrac{80}{0.6} \approx 133.3 \)
- C. \( t = \dfrac{-1.8}{15} = -0.12 \)
- D. \( t = \dfrac{0.6}{-1.8} \approx -0.33 \)
▶️ Answer / Explanation
Step 1 — Identify the formula for the t-test for slope:
\( t = \dfrac{b - \beta_0}{SE_b} \)
Step 2 — Plug in the values:
\( t = \dfrac{-1.8 - 0}{0.6} = \dfrac{-1.8}{0.6} = -3.0 \)
Step 3 — Conclusion:
The test statistic is \(t = -3.0\), with \(df = 15 - 2 = 13\), so the correct answer is A (the first option).
Null and Alternative Hypotheses for the Slope (\(\beta\)) of a Regression Model
In regression analysis, we often want to test whether there is a significant linear relationship between an explanatory variable (\(x\)) and a response variable (\(y\)). This is done by testing hypotheses about the population slope \(\beta\).
Hypotheses:
- Null hypothesis: \(H_0: \beta = 0\) (There is no linear relationship between \(x\) and \(y\) in the population.)
- Alternative hypothesis: \(H_a: \beta \ne 0\) (two-sided), or \(H_a: \beta > 0\) / \(H_a: \beta < 0\) (one-sided, depending on context) (There is a linear relationship between \(x\) and \(y\) in the population.)
Notes:
- Choice between one-sided or two-sided depends on the research question.
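- If a confidence interval for the slope does not include 0, the data provide evidence against \(H_0: \beta = 0\). For example, a 95% interval that excludes 0 corresponds to rejecting \(H_0\) in a two-sided test at \(\alpha = 0.05\).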
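The confidence-interval note can be illustrated with a short Python sketch. It reuses the heart-rate example from earlier in these notes (\(b = -1.8\), \(SE_b = 0.6\), \(df = 13\)). The function name `slope_confidence_interval` is hypothetical, and the critical value \(t^* = 2.160\) is an assumption read from a standard t-table for 95% confidence with 13 degrees of freedom.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Interval b +/- t* x SE_b for the population slope."""
    margin = t_star * se_b
    return b - margin, b + margin


# Assumption: t* from a standard t-table (95% confidence, df = 13).
T_STAR_95_DF13 = 2.160

low, high = slope_confidence_interval(b=-1.8, se_b=0.6, t_star=T_STAR_95_DF13)
excludes_zero = not (low <= 0.0 <= high)
print(round(low, 3), round(high, 3), excludes_zero)
```

An interval lying entirely below 0 agrees with the two-sided test at \(\alpha = 0.05\): both point to a nonzero (here negative) slope.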
Example
A study investigates whether weekly hours of study (\(x\)) affect exam scores (\(y\)) in college students. A researcher wants to test if there is a significant positive relationship.
Which set of hypotheses correctly represents this test?
- A. \(H_0: \beta = 0\), \(H_a: \beta \ne 0\)
- B. \(H_0: \beta = 0\), \(H_a: \beta > 0\)
- C. \(H_0: \beta = 1\), \(H_a: \beta > 1\)
- D. \(H_0: \beta \ne 0\), \(H_a: \beta = 0\)
▶️ Answer / Explanation
Step 1 — Identify the research question:
The researcher wants to test for a positive relationship (one-sided).
Step 2 — Set hypotheses:
- Null hypothesis: \(H_0: \beta = 0\) (no linear relationship)
- Alternative hypothesis: \(H_a: \beta > 0\) (positive linear relationship)
Step 3 — Conclusion:
The correct answer is B.
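The choice of alternative also determines which tail of the t-distribution supplies the p-value. As a brief summary, where \(t\) is the observed statistic and \(T\) (a symbol introduced here for illustration) denotes a t-distributed variable with \(n - 2\) degrees of freedom:
- \(H_a: \beta > 0\): \(p = P(T \ge t)\)
- \(H_a: \beta < 0\): \(p = P(T \le t)\)
- \(H_a: \beta \ne 0\): \(p = 2\,P(T \ge |t|)\)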
Verifying Conditions for a Significance Test for the Slope (\(\beta\))
Before performing a t-test for the slope of a regression model, it is essential to ensure that the data meet the conditions required for valid statistical inference.
Conditions:
Linearity: The relationship between the explanatory variable (\(x\)) and the response variable (\(y\)) should be approximately linear.
- Check using scatter plots or residual plots to ensure no systematic pattern remains in the residuals.
Independence: Observations should be independent of each other.
- Data should come from a random sample or a randomized experiment.
- If sampling without replacement from a finite population, check the 10% condition: \( n \leq 0.1 N \), where \(n\) = sample size and \(N\) = population size.
Normality of residuals: The residuals (differences between observed and predicted \(y\)) should be approximately normally distributed.
- Check with a histogram, boxplot, or normal probability plot of the residuals.
Equal variance (Homoscedasticity): The residuals should have constant variance across all levels of \(x\).
- Residual plot should show no clear pattern or funnel shape.
Notes:
- Meeting these conditions ensures that the t-test for the slope produces valid inference for the population slope \(\beta\).
- If the conditions are not met, consider data transformation or a different model to obtain reliable results.
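Most of these checks start from the residuals, so a small Python sketch may help. It fits a least-squares line, computes the residuals that would go into a residual plot or histogram, and checks the 10% condition. The function names and the data are hypothetical, invented only to make the sketch runnable.

```python
def least_squares_fit(xs, ys):
    """Return intercept a and slope b of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(xs, ys, a, b):
    """Observed y minus predicted y for each observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def ten_percent_condition(n, population_size):
    """True when the sample is at most 10% of the population."""
    return n <= 0.1 * population_size


# Hypothetical data (not from the notes): study hours and exam scores.
hours = [1, 2, 3, 4, 5, 6]
scores = [62, 66, 65, 71, 74, 75]

a, b = least_squares_fit(hours, scores)
res = residuals(hours, scores, a, b)
print([round(r, 2) for r in res])
print(ten_percent_condition(n=len(hours), population_size=500))
```

Plotting `res` against `hours` gives the residual plot used for the linearity and equal-variance checks. A histogram or normal probability plot of `res` addresses the normality check.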
Example
A study examines the effect of weekly study hours (\(x\)) on exam scores (\(y\)) for 12 students. The sample regression line is \(\hat{y} = 70 + 2.5x\) with a residual standard deviation \(s = 3.0\) and \(\sum (x_i - \bar{x})^2 = 50\).
Conduct a significance test for \(H_0: \beta = 0\) vs \(H_a: \beta \ne 0\) at \(\alpha = 0.05\).
▶️ Answer / Explanation
Step 1 — Verify conditions:
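Only summary statistics are given, so the conditions cannot be checked directly here. Assume all four (linearity, independence, normality of residuals, constant variance) have been verified and are satisfied, then proceed with the t-test.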
Step 2 — Compute standard error of slope:
\( SE_b = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}} = \dfrac{3.0}{\sqrt{50}} \approx \dfrac{3.0}{7.071} \approx 0.424 \)
Step 3 — Compute t-statistic:
\( t = \dfrac{b - \beta_0}{SE_b} = \dfrac{2.5 - 0}{0.424} \approx 5.89 \)
Step 4 — Determine degrees of freedom:
\( df = n-2 = 12-2 = 10 \)
Step 5 — Find p-value:
Using a t-distribution table or software, \( t = 5.89 \) with \( df = 10 \) gives a two-sided p-value of \( p < 0.001 \).
Step 6 — Conclusion:
Since \( p < 0.05 = \alpha \), we reject \(H_0\). There is strong evidence of a linear relationship between study hours and exam scores in the population. The positive sample slope indicates that scores tend to increase with study hours.
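The arithmetic in this worked example can be checked with a short Python sketch that uses only the standard library. The helper names `t_pdf` and `two_sided_p_value` are hypothetical. The p-value is approximated by integrating the t-density numerically with Simpson's rule, which is one possible approach chosen here to keep the sketch dependency-free. In practice a statistics package (for example, the survival function `scipy.stats.t.sf`) would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate 2 * P(T >= |t|) using Simpson's rule on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


# Summary statistics from the example: s = 3.0, sum of squares = 50, b = 2.5, n = 12.
s, sxx, b, n = 3.0, 50.0, 2.5, 12
se_b = s / math.sqrt(sxx)      # standard error of the slope
t_stat = (b - 0.0) / se_b      # H0: beta = 0
df = n - 2
p = two_sided_p_value(t_stat, df)
print(round(se_b, 3), round(t_stat, 2), df, p < 0.001)
```

The sketch reproduces \(SE_b \approx 0.424\), \(t \approx 5.89\), and \(df = 10\), and confirms that the two-sided p-value falls below 0.001.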