AP Statistics 2.6 Linear Regression Models Study Notes

These notes follow the latest AP Statistics syllabus.

LEARNING OBJECTIVE

  • Regression models may allow us to predict responses to changes in an explanatory variable.

Key Concepts:

  • Calculate a Predicted Response Using a Linear Regression Model

Calculate a Predicted Response Using a Linear Regression Model

For two quantitative variables \(x\) (explanatory) and \(y\) (response), the least-squares regression line predicts \(y\) from \(x\) with the equation

\( \displaystyle \hat{y} = a + b x \)

where the slope \(b\) and intercept \(a\) can be computed from summary statistics:

\( \displaystyle b = r\cdot \dfrac{s_y}{s_x}, \qquad a = \bar{y} - b\bar{x} \)

Here \(r\) is the sample correlation, \(s_x,s_y\) are the sample standard deviations, and \(\bar{x},\bar{y}\) are sample means.
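
A one-line check on the intercept formula, added here as a short derivation: substituting \(x = \bar{x}\) into the line and using \(a = \bar{y} - b\bar{x}\) gives

\( \displaystyle \hat{y}(\bar{x}) = a + b\bar{x} = (\bar{y} - b\bar{x}) + b\bar{x} = \bar{y} \)

so the least-squares line always passes through the point \((\bar{x}, \bar{y})\).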

How to predict

  1. Compute \(b\) using \( b = r\,(s_y/s_x)\) (or obtain \(b\) directly from regression output).
  2. Compute \(a = \bar{y} - b\bar{x}\).
  3. Substitute the desired \(x\) value into \(\hat{y} = a + b x\) to get the predicted response.
  4. Optionally compute the residual \( e = y_{\text{obs}} - \hat{y} \) if an observed \(y\) exists.
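
The four steps above can be sketched in Python. This is a minimal illustration, not part of the AP course materials: the function names `fit_from_summary`, `predict`, and `residual` are hypothetical helpers, and the inputs are the rounded summary statistics from the study-hours example in these notes.

```python
def fit_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    b = r * (s_y / s_x)      # step 1: slope from correlation and standard deviations
    a = y_bar - b * x_bar    # step 2: intercept from the means and the slope
    return a, b


def predict(a, b, x):
    """Step 3: predicted response at a given x."""
    return a + b * x


def residual(y_obs, y_hat):
    """Step 4: observed minus predicted."""
    return y_obs - y_hat


# Rounded summaries taken from the study-hours example in these notes.
a, b = fit_from_summary(r=0.991, s_x=2.70, s_y=12.04, x_bar=5.4, y_bar=78.0)
y_hat_6 = predict(a, b, 6)
print(f"slope b = {b:.2f}, intercept a = {a:.2f}, predicted y at x = 6: {y_hat_6:.1f}")
```

Because the code carries full precision through every step, its results can differ in the second decimal place from a hand calculation that rounds \(b\) before computing \(a\).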

Interpretation

  • Slope \(b\): predicted change in \(y\) for a one-unit increase in \(x\). (Units matter.)
  • Intercept \(a\): predicted \(y\) when \(x=0\). Interpret with caution: it may be meaningless if \(x=0\) is outside the data range.
  • Residual: \( e = y - \hat{y} \). Positive residual → observed above prediction; negative → below.
  • Caution: Do not extrapolate predictions far beyond the range of observed \(x\) values.
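
The residual sign convention and the extrapolation caution can be illustrated with a short Python sketch. The line \(\hat{y} = 10 + 2x\), the observed range of 1 to 5, and the helper `is_extrapolation` are all made up for illustration and do not come from any data set in these notes.

```python
def predict(a, b, x):
    """Predicted response from the line y_hat = a + b*x."""
    return a + b * x


def residual(y_obs, y_hat):
    """Observed minus predicted."""
    return y_obs - y_hat


def is_extrapolation(x, x_min, x_max):
    """True when x lies outside the observed range of the explanatory variable."""
    return x < x_min or x > x_max


# Made-up line and observed x-range, for illustration only.
a, b = 10.0, 2.0
x_min, x_max = 1.0, 5.0

e_above = residual(17.0, predict(a, b, 3.0))  # 17 - 16: observed above the line
e_below = residual(15.0, predict(a, b, 4.0))  # 15 - 18: observed below the line

print("positive residual:", e_above)
print("negative residual:", e_below)
print("x = 12 is an extrapolation:", is_extrapolation(12.0, x_min, x_max))
```

A check like `is_extrapolation` only flags that a prediction falls outside the data; how far is too far remains a judgment call.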

 Example

Data for 5 students (hours studied \(x\), exam score \(y\)):

\( (2,65), (4,70), (5,75), (7,85), (9,95) \).

 Use the sample summaries to find the least-squares line and predict the exam score for a student who studies 6 hours. Also compute the residual if a student who studied 7 hours scored 85.

Answer / Explanation

Step 1 — compute summary statistics (from the data)

\( \bar{x} = 5.4,\quad \bar{y} = 78.0. \)

Sample standard deviations: \( s_x \approx 2.70,\quad s_y \approx 12.04. \)

Correlation: \( r \approx 0.991 \) (very strong positive linear association).

Step 2 — compute slope and intercept

\( b = r\cdot\dfrac{s_y}{s_x} \approx 0.991\cdot\dfrac{12.04}{2.70} \approx 4.42. \)

\( a = \bar{y} - b\bar{x} \approx 78.0 - 4.42(5.4) \approx 54.13. \)

Regression equation (least-squares line):

\( \displaystyle \hat{y} = 54.13 + 4.42x \).

Step 3 — predict for \(x=6\)

\( \hat{y}(6) = 54.13 + 4.42(6) = 54.13 + 26.52 = 80.65. \)

Answer: Predicted exam score ≈ 80.7 for a student who studies 6 hours.

Step 4 — residual for the 7-hour student who scored 85

\( \hat{y}(7) = 54.13 + 4.42(7) = 54.13 + 30.94 = 85.07. \)

Residual \( e = y_{\text{obs}} - \hat{y} = 85 - 85.07 = -0.07 \) (observed is about 0.07 points below prediction).

Interpretation: The slope \(4.42\) means we predict about a 4.42-point increase in exam score for each additional hour studied. The intercept of about 54.13 is the predicted score at \(x=0\) hours (interpret with caution, since 0 hours lies outside the observed range of 2 to 9 hours). The residual near zero for the 7-hour student indicates the model predicted that score well.

Notes & cautions:

  • Because the regression was computed from this sample, predicted values have sampling variability.
  • Predictions far outside the observed \(x\)-range (here 2–9 hours) are extrapolations and may be unreliable.
  • Check residuals and plot to ensure a linear model is appropriate before trusting predictions.
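
The worked example can be rechecked from the raw data with a short Python sketch. It uses only the standard-library `statistics` module and computes \(r\) directly from its definition. The variable names are illustrative, and full-precision results may differ in the last digit or two from the hand-rounded steps.

```python
from statistics import mean, stdev

# Raw data from the example: hours studied (x) and exam score (y).
hours = [2, 4, 5, 7, 9]
scores = [65, 70, 75, 85, 95]

n = len(hours)
x_bar, y_bar = mean(hours), mean(scores)
s_x, s_y = stdev(hours), stdev(scores)  # sample standard deviations (n - 1 divisor)

# Sample correlation from its definition.
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x      # slope
a = y_bar - b * x_bar  # intercept

y_hat_6 = a + b * 6          # prediction for 6 hours
resid_7 = 85 - (a + b * 7)   # residual for the 7-hour student who scored 85

# Residuals for every observation; for a least-squares line they sum to zero.
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]

print(f"r = {r:.3f}, b = {b:.2f}, a = {a:.2f}")
print(f"predicted score at 6 hours: {y_hat_6:.1f}")
print(f"residual at 7 hours: {resid_7:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

Listing all five residuals is a first step toward the residual check recommended above: values scattered around zero with no clear pattern are consistent with a linear model.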

Example 

Data summary: A real estate sample relates house size (\(x\), in thousands of sq. ft.) to price (\(y\), in thousands of dollars). Regression output from calculator:

\( a = 50.2,\; b = 120.5,\; r = 0.93. \)

Predict the price of a 2.5-thousand sq. ft. house. How do you get this result quickly on a TI-84?

Answer / Explanation

Step 1 — TI-84 Regression:

  • Enter sizes in L1, prices in L2.
  • STAT → CALC → 8:LinReg(a+bx), which reports \(a\) as the intercept and \(b\) as the slope, matching \(\hat{y} = a + bx\).
  • Result: \( a=50.2,\; b=120.5 \).

Step 2 — Prediction:

Equation: \( \hat{y} = 50.2 + 120.5x \).

For \(x=2.5\): \( \hat{y} = 50.2 + 120.5(2.5) = 50.2 + 301.25 = 351.45 \).

Predicted price: ≈ \$351,450.

Step 3 — TI-84 One-line recipe:

After running LinReg(a+bx), paste the regression equation into Y1 in the Y= menu (press VARS → Statistics → EQ → RegEQ). Then use 2nd → CALC → value and enter \(x=2.5\) to get \(\hat{y}\).

Interpretation: Each additional 1000 sq. ft. adds about \$120,500 to the predicted price. A 2.5-thousand sq. ft. house is predicted at ≈ \$351,450.
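
The same prediction can be reproduced off-calculator. The Python sketch below plugs the reported intercept and slope into \(\hat{y} = a + bx\) and converts the result from thousands of dollars to dollars; the variable names are illustrative.

```python
# Regression output from the example: price in $1000s, size in 1000s of sq. ft.
a, b = 50.2, 120.5  # intercept and slope

size = 2.5  # house size in thousands of square feet

price_thousands = a + b * size          # predicted price in thousands of dollars
price_dollars = price_thousands * 1000  # convert to dollars

print(f"Predicted price: ${price_dollars:,.0f}")
```

Keeping the unit conversion as a separate step makes it harder to misread 351.45 as a dollar amount.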
