Confidence intervals for regression coefficients (simple linear regression)
In simple linear regression, we model the relationship between a response \(Y\) and a predictor \(X\) using:
\[
Y = \beta_0 + \beta_1 X + \varepsilon,
\qquad \mathbb{E}[\varepsilon]=0,\quad \mathrm{Var}(\varepsilon)=\sigma^2.
\]
Given data \((x_i,y_i)\) for \(i=1,\dots,n\), we estimate the coefficients \(\beta_0,\beta_1\) by least squares and then build a
confidence interval (CI) for each coefficient from its standard error and a t critical value.
1) Least squares estimates
Compute sample means:
\[
\bar x=\frac{1}{n}\sum_{i=1}^n x_i,
\qquad
\bar y=\frac{1}{n}\sum_{i=1}^n y_i.
\]
Define sums of squares and cross-products:
\[
S_{xx}=\sum_{i=1}^n (x_i-\bar x)^2,
\qquad
S_{xy}=\sum_{i=1}^n (x_i-\bar x)(y_i-\bar y),
\qquad
S_{yy}=\sum_{i=1}^n (y_i-\bar y)^2.
\]
The least squares slope and intercept are:
\[
b_1=\frac{S_{xy}}{S_{xx}},
\qquad
b_0=\bar y-b_1\bar x.
\]
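The formulas above translate almost line for line into code. The following is a minimal sketch in plain Python; the function name `least_squares` and the five data points are made up for illustration and do not come from the calculator.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line through the points (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squares S_xx and sum of cross-products S_xy about the means
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up illustrative data (an assumption, not from the text)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares(xs, ys)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

For these points \(\bar x=3\), \(\bar y=6.02\), \(S_{xx}=10\) and \(S_{xy}=19.9\), so the slope is \(1.99\) and the intercept is \(0.05\).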
2) Residuals, SSE, and the residual standard deviation
The fitted value at \(x_i\) is \(\hat y_i=b_0+b_1x_i\). Residuals are \(e_i=y_i-\hat y_i\).
The sum of squared errors (the residual sum of squares) is:
\[
SSE=\sum_{i=1}^n (y_i-\hat y_i)^2.
\]
Because two coefficients are estimated from the data, \(SSE\) has \(n-2\) degrees of freedom, and an unbiased estimator of \(\sigma^2\) is:
\[
s^2=\frac{SSE}{n-2},\qquad s=\sqrt{s^2}.
\]
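As a sketch of this step, the code below computes the residuals, \(SSE\), \(s^2\) and \(s\) for a given fitted line. The helper name `residual_std` and the data are illustrative assumptions; the coefficients \(b_0=0.05\), \(b_1=1.99\) are the least squares estimates for these made-up points.

```python
import math


def residual_std(xs, ys, b0, b1):
    """Return (SSE, s^2, s) for the line y = b0 + b1 * x."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than 2 points so that n - 2 > 0")
    # Residual e_i = y_i - (b0 + b1 * x_i)
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)
    s2 = sse / (n - 2)  # divide by n - 2, not n
    return sse, s2, math.sqrt(s2)


# Made-up illustrative data and its least squares fit (assumptions)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
sse, s2, s = residual_std(xs, ys, b0=0.05, b1=1.99)
print(f"SSE = {sse:.4f}, s^2 = {s2:.5f}, s = {s:.4f}")
```

Here the residuals are \(0.06, -0.13, 0.18, -0.21, 0.10\), which gives \(SSE=0.107\) and \(s^2=0.107/3\).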
3) Standard errors of \(b_1\) and \(b_0\)
Under the usual regression assumptions (independent errors, constant variance, and normality for exact t results),
the coefficient estimates have standard errors:
\[
SE(b_1)=\sqrt{\frac{s^2}{S_{xx}}}.
\]
\[
SE(b_0)=\sqrt{s^2\left(\frac{1}{n}+\frac{\bar x^2}{S_{xx}}\right)}.
\]
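The two standard errors depend on the data only through \(n\), \(\bar x\), \(S_{xx}\) and \(s^2\). A minimal sketch follows, with a hypothetical helper `standard_errors` and the same made-up data as an assumption; the value \(s^2=0.107/3\) is the residual variance for that data.

```python
import math


def standard_errors(xs, s2):
    """Return (SE(b0), SE(b1)) given the x values and the residual variance s^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical, so S_xx = 0")
    se_b1 = math.sqrt(s2 / s_xx)
    se_b0 = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / s_xx))
    return se_b0, se_b1


# Made-up x values and their residual variance (assumptions)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
s2 = 0.107 / 3
se_b0, se_b1 = standard_errors(xs, s2)
print(f"SE(b0) = {se_b0:.4f}, SE(b1) = {se_b1:.4f}")
```

Note that \(SE(b_0)\) grows with \(\bar x^2\): the further the data sit from \(x=0\), the less precisely the intercept is determined.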
4) Confidence intervals
Let \(df=n-2\). For a coefficient estimate \(b\) (either \(b_0\) or \(b_1\)), a two-sided \(100(1-\alpha)\%\) CI has the form:
\[
b \pm t_{1-\alpha/2,\;df}\,SE(b),
\]
where \(t_{1-\alpha/2,df}\) is the t critical value (quantile) from a Student t distribution with \(df\) degrees of freedom.
One-sided bounds use \(t_{1-\alpha,df}\):
\[
\text{Lower one-sided: } b - t_{1-\alpha,df}SE(b),
\qquad
\text{Upper one-sided: } b + t_{1-\alpha,df}SE(b).
\]
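The interval arithmetic can be sketched as below. The function names are hypothetical, and the critical values are hard-coded from a standard t table for \(df=3\) (an assumption matching a five-point example); a statistics library would normally supply them, for example `scipy.stats.t.ppf(0.975, 3)`. The slope estimate and its standard error are rounded values from made-up data.

```python
def two_sided_ci(estimate, se, t_crit):
    """Two-sided CI: estimate +/- t_{1 - alpha/2, df} * SE."""
    margin = t_crit * se
    return estimate - margin, estimate + margin


def one_sided_bounds(estimate, se, t_crit_one_sided):
    """Lower and upper one-sided bounds, each using t_{1 - alpha, df}."""
    margin = t_crit_one_sided * se
    return estimate - margin, estimate + margin


# Table values for df = 3 (assumed example with n = 5)
T_975_DF3 = 3.182446  # t_{0.975, 3}, for a two-sided 95% CI
T_950_DF3 = 2.353363  # t_{0.95, 3}, for one-sided 95% bounds

# Illustrative slope estimate and standard error (made-up data)
b1, se_b1 = 1.99, 0.0597216

lower, upper = two_sided_ci(b1, se_b1, T_975_DF3)
lower_only, upper_only = one_sided_bounds(b1, se_b1, T_950_DF3)
print(f"95% two-sided CI for the slope: ({lower:.4f}, {upper:.4f})")
print(f"95% one-sided lower bound: {lower_only:.4f}")
print(f"95% one-sided upper bound: {upper_only:.4f}")
```

Each one-sided bound is a separate statement; the pair `(lower_only, upper_only)` is not itself a 95% two-sided interval.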
5) Testing whether the slope is zero
The common hypothesis test for trend uses:
\[
H_0:\beta_1=0
\qquad\text{vs}\qquad
H_1:\beta_1\ne 0.
\]
The t statistic is:
\[
t=\frac{b_1-0}{SE(b_1)}.
\]
Under \(H_0\) and the normality assumption, \(t\) follows a Student t distribution with \(df=n-2\). A p-value below the chosen significance level \(\alpha\) is evidence against \(H_0\), that is, evidence of a nonzero slope.
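A small sketch of the test, under stated assumptions: rather than computing a p-value, it uses the equivalent decision rule of rejecting \(H_0\) at level \(\alpha\) when \(|t|\) exceeds \(t_{1-\alpha/2,df}\). The function names, the table value for \(df=3\), and the numeric inputs are illustrative, not taken from the calculator.

```python
def slope_t_statistic(b1, se_b1, beta1_null=0.0):
    """t statistic for H0: beta_1 = beta1_null."""
    if se_b1 <= 0:
        raise ValueError("standard error must be positive")
    return (b1 - beta1_null) / se_b1


def reject_two_sided(t_stat, t_crit):
    """Reject H0 when |t| > t_{1 - alpha/2, df}, equivalent to p-value < alpha."""
    return abs(t_stat) > t_crit


# Illustrative values (made-up data, n = 5 so df = 3)
b1, se_b1 = 1.99, 0.0597216
T_975_DF3 = 3.182446  # t_{0.975, 3} from a standard t table

t_stat = slope_t_statistic(b1, se_b1)
reject = reject_two_sided(t_stat, T_975_DF3)
print(f"t = {t_stat:.2f}, reject H0 at the 5% level: {reject}")
```

The same rule links the test to the CI in the previous section: \(H_0\) is rejected at level \(\alpha\) exactly when the two-sided \(100(1-\alpha)\%\) CI for \(\beta_1\) excludes zero.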
6) \(R^2\): goodness of fit
The coefficient of determination measures the proportion of the variation in \(Y\) that is explained by the linear model:
\[
R^2 = 1-\frac{SSE}{S_{yy}}.
\]
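\(R^2\) near 1 indicates that the line explains most of the variability in \(y\); values near 0 indicate that it explains very little. In simple linear regression, \(R^2\) equals the square of the sample correlation between \(x\) and \(y\).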
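As a sketch, \(R^2\) can be computed directly from \(SSE\) and \(S_{yy}\). The helper name `r_squared` and the data are illustrative assumptions; the coefficients are the least squares estimates for these made-up points.

```python
def r_squared(xs, ys, b0, b1):
    """Return R^2 = 1 - SSE / S_yy for the line y = b0 + b1 * x."""
    n = len(ys)
    y_bar = sum(ys) / n
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_yy == 0:
        raise ValueError("all y values are identical, so R^2 is undefined")
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - sse / s_yy


# Made-up illustrative data and its least squares fit (assumptions)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(xs, ys, b0=0.05, b1=1.99)
print(f"R^2 = {r2:.4f}")
```

For this data \(SSE=0.107\) and \(S_{yy}=39.708\), so \(R^2\) is very close to 1, as expected for points that lie almost exactly on a line.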
7) Confidence bands vs prediction intervals (important distinction)
Many plots show a “band” around the fitted line. There are two different concepts:
- Confidence band for the mean response \(E[Y\mid X=x]\): uncertainty in the regression line itself.
- Prediction interval for a new observation \(Y_{\text{new}}\): wider, because it includes both line uncertainty and noise \(\varepsilon\).
This calculator draws the standard pointwise confidence band for the mean response:
\[
\hat y(x)\pm t_{1-\alpha/2,df}\; s\sqrt{\frac{1}{n}+\frac{(x-\bar x)^2}{S_{xx}}}.
\]
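To make the distinction concrete, the sketch below computes both half-widths at a point \(x_0\). The band half-width follows the formula above; the prediction-interval half-width uses the standard textbook form \(t_{1-\alpha/2,df}\,s\sqrt{1+\frac{1}{n}+\frac{(x_0-\bar x)^2}{S_{xx}}}\), which is not something the calculator is stated to draw. The function name, data, and table value for \(df=3\) are assumptions.

```python
import math


def interval_half_widths(x0, xs, s, t_crit):
    """Return (confidence-band half-width, prediction-interval half-width) at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    # Shared term: 1/n + (x0 - x_bar)^2 / S_xx
    h = 1.0 / n + (x0 - x_bar) ** 2 / s_xx
    band = t_crit * s * math.sqrt(h)
    # The extra "1 +" accounts for the noise in a single new observation
    prediction = t_crit * s * math.sqrt(1.0 + h)
    return band, prediction


# Made-up illustrative inputs (n = 5, so df = 3)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
s = math.sqrt(0.107 / 3)  # residual standard deviation for the example data
T_975_DF3 = 3.182446      # t_{0.975, 3} from a standard t table

band_mid, pred_mid = interval_half_widths(3.0, xs, s, T_975_DF3)
band_edge, pred_edge = interval_half_widths(5.0, xs, s, T_975_DF3)
print(f"x = 3: band +/- {band_mid:.3f}, prediction +/- {pred_mid:.3f}")
print(f"x = 5: band +/- {band_edge:.3f}, prediction +/- {pred_edge:.3f}")
```

Both intervals are narrowest at \(x_0=\bar x\) and widen as \(x_0\) moves away from it, and the prediction interval is always the wider of the two.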
8) University extension: multiple regression
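With multiple predictors, the coefficients are estimated in matrix form, and the standard errors come from the diagonal of \(s^2(X^\top X)^{-1}\), where \(X\) here denotes the full design matrix.
The CI structure remains the same: estimate ± t × SE. The degrees of freedom, however, become \(n\) minus the number of estimated coefficients (including the intercept) instead of \(n-2\).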
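For reference, the standard matrix formulas can be written as follows. This is a sketch under assumed notation not used earlier in the text: \(X\) is the \(n\times p\) design matrix (with a leading column of ones for the intercept), \(\mathbf{y}\) is the vector of responses, \(\mathbf{b}\) is the vector of estimates, and \(p\) is the number of estimated coefficients.

\[
\mathbf{b}=(X^\top X)^{-1}X^\top\mathbf{y},
\qquad
s^2=\frac{SSE}{n-p},
\qquad
SE(b_j)=\sqrt{s^2\left[(X^\top X)^{-1}\right]_{jj}},
\]
\[
b_j \pm t_{1-\alpha/2,\;n-p}\,SE(b_j).
\]

With one predictor, \(p=2\) and the diagonal entries of \((X^\top X)^{-1}\) are \(\frac{1}{n}+\frac{\bar x^2}{S_{xx}}\) and \(\frac{1}{S_{xx}}\), which recovers \(SE(b_0)\) and \(SE(b_1)\) from section 3.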