Confidence intervals: estimate ± uncertainty
A confidence interval (CI) gives a plausible range for an unknown parameter (like a population mean \(\mu\) or proportion \(p\)),
based on a sample estimate and an estimate of sampling uncertainty. The most common CI template is:
\[
\boxed{\text{CI} = \hat\theta \pm (\text{critical})\cdot SE}
\]
Here \(\hat\theta\) is your point estimate (e.g., \(\bar X\) or \(\hat p\)), \(SE\) is the standard error, and the critical value
depends on the desired confidence level and the reference distribution (z or t).
1) Confidence level and \(\alpha\)
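A two-sided \((1-\alpha)\) confidence interval leaves probability \(\alpha/2\) in each tail:
\[
1-\alpha = \text{confidence level}, \qquad \alpha = 1-\text{confidence level}.
\]
For example, 95% confidence means \(\alpha = 0.05\), with probability \(0.025\) in each tail.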
2) z-intervals (normal critical values)
If the sampling distribution is approximately normal and \(SE\) is known (or reliably estimated), a z-interval uses:
\[
\text{crit} = z_{1-\alpha/2}.
\]
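Common values: \(z_{0.95}\approx 1.645\) (90% confidence), \(z_{0.975}\approx 1.960\) (95% confidence), \(z_{0.995}\approx 2.576\) (99% confidence).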
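Critical values for any confidence level come from the standard normal quantile function. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative choice, not part of any library.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value z_{1 - alpha/2}."""
    alpha = 1.0 - confidence
    # inv_cdf is the quantile function of the standard normal distribution
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```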
3) t-intervals (unknown \(\sigma\), small samples)
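When the population standard deviation \(\sigma\) is unknown and must be estimated from the sample, the standardized mean follows a Student t
distribution rather than a normal one (exactly for normal data, approximately otherwise). The difference matters most when \(n\) is not large:
\[
\text{crit} = t_{1-\alpha/2,\;df},
\qquad df = n-1 \text{ (one-sample mean)}.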
\]
Compared to z, t critical values are larger, noticeably so for small \(df\) (for example, \(t_{0.975,\,9}\approx 2.262\) versus \(z_{0.975}\approx 1.960\)), reflecting extra uncertainty from estimating \(\sigma\).
As \(df\to\infty\), \(t\) critical values approach z critical values.
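To make the z versus t comparison concrete, here is a small sketch that builds a 95% interval for a mean both ways. It assumes a tiny hard-coded table of standard \(t_{0.975,\,df}\) values instead of a statistics library, and the sample data and the function name `interval_95` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Standard table values of t_{0.975, df} (two-sided 95%), rounded to 3 decimals.
T_975 = {4: 2.776, 9: 2.262, 29: 2.045}
Z_975 = 1.960  # limiting value as df grows

def interval_95(sample, use_t=True):
    """95% CI for the mean; t critical value with df = n - 1, or z for comparison."""
    n = len(sample)
    crit = T_975[n - 1] if use_t else Z_975  # KeyError if df is not in the small table
    se = stdev(sample) / sqrt(n)             # estimated standard error s / sqrt(n)
    center = mean(sample)
    return center - crit * se, center + crit * se

sample = [4.8, 5.1, 5.4, 4.9, 5.3, 5.0, 5.2, 4.7, 5.5, 5.1]  # hypothetical data, n = 10

t_lo, t_hi = interval_95(sample, use_t=True)
z_lo, z_hi = interval_95(sample, use_t=False)
print(f"t-interval: ({t_lo:.3f}, {t_hi:.3f})")
print(f"z-interval: ({z_lo:.3f}, {z_hi:.3f})")  # narrower: ignores the cost of estimating sigma
```

Both intervals share the same center; only the critical value differs, so the t-interval is wider by the ratio 2.262 / 1.960 for this sample size.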
4) Margin of error
The margin of error (ME) is the “plus/minus” amount:
\[
ME = (\text{critical})\cdot SE.
\]
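Larger confidence \(\Rightarrow\) larger critical value \(\Rightarrow\) wider interval. A larger sample size usually reduces \(SE\) and shrinks the interval: for means and proportions, \(SE\) scales like \(1/\sqrt{n}\), so quadrupling \(n\) roughly halves the ME.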
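The two effects on the margin of error can be checked numerically. This sketch assumes a z-interval for a mean with a known standard deviation, so that \(SE = \sigma/\sqrt{n}\); the values \(\sigma = 10\) and the sample sizes are made up for illustration, and `margin_of_error` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma: float, n: int, confidence: float = 0.95) -> float:
    """ME = z_{1 - alpha/2} * sigma / sqrt(n), assuming sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)

sigma = 10.0  # assumed known population standard deviation
print(f"n=100, 95%: ME = {margin_of_error(sigma, 100):.2f}")        # 1.96
print(f"n=400, 95%: ME = {margin_of_error(sigma, 400):.2f}")        # 0.98 (4x the data, half the ME)
print(f"n=100, 99%: ME = {margin_of_error(sigma, 100, 0.99):.2f}")  # 2.58 (more confidence, wider)
```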
5) Interpreting a confidence interval correctly
A common misconception is: “There is a 95% probability the true parameter is in this particular interval.”
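In the frequentist framework the parameter is fixed and the interval is what varies from sample to sample; once computed, a particular interval either contains the parameter or it does not. The correct interpretation is:
- If we repeated the sampling procedure many times and computed a 95% CI each time, then about 95% of those intervals would contain the true parameter.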
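The repeated-sampling interpretation can be illustrated with a short simulation. This is a sketch under stated assumptions: normal data with a known \(\sigma\) (so a z-interval applies), and made-up values for the true mean, sample size, and number of repetitions.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=50.0, sigma=10.0, n=25, reps=2000, confidence=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sigma / sqrt(n)  # sigma is treated as known in this sketch
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - z * se <= mu <= xbar + z * se:
            hits += 1
    return hits / reps

print(f"Observed coverage: {coverage():.3f}")  # should be close to 0.95
```

Each individual interval either contains \(\mu\) or misses it; the 95% describes the long-run hit rate of the procedure.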
6) Proportions and polling: why “margin of error” is famous
In polls, \(\hat p\) is the reported proportion and ME is often stated explicitly:
\[
\hat p \pm z_{1-\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}}.
\]
For proportions near 0 or 1 (or small \(n\)), “Wald” intervals can be inaccurate; Wilson or Agresti–Coull intervals often perform better.
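A side-by-side sketch of the Wald interval above and the Wilson score interval shows where they differ. The function names and the example counts (520 of 1000, and 1 of 20) are hypothetical; the Wilson formula used is the standard score interval.

```python
from math import sqrt
from statistics import NormalDist

def _z(confidence: float) -> float:
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def wald_interval(successes: int, n: int, confidence: float = 0.95):
    """p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z, p = _z(confidence), successes / n
    me = z * sqrt(p * (1 - p) / n)
    return p - me, p + me

def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score interval: recentred and rescaled so it stays inside [0, 1]."""
    z, p = _z(confidence), successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

for successes, n in [(520, 1000), (1, 20)]:
    print(successes, n, "Wald:", wald_interval(successes, n),
          "Wilson:", wilson_interval(successes, n))
```

For the large poll the two intervals nearly coincide (a margin of about 3 percentage points). For 1 success in 20 trials the Wald lower bound is negative, which is impossible for a proportion, while the Wilson interval stays within \([0, 1]\).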
7) University tease: bootstrap confidence intervals
Bootstrap methods approximate uncertainty by resampling from the observed data (with replacement) and recomputing the statistic many times.
A simple bootstrap percentile CI takes quantiles of the resampled statistic distribution, for example the 2.5th and 97.5th percentiles for a 95% interval.
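A minimal sketch of the percentile bootstrap, using only the standard library. The data, the number of resamples, and the simple index-based quantile rule are all illustrative assumptions; `bootstrap_percentile_ci` is a hypothetical helper, not a library function.

```python
import random
from statistics import mean

def bootstrap_percentile_ci(data, stat=mean, reps=2000, confidence=0.95, seed=0):
    """Percentile bootstrap CI: resample with replacement, recompute, take quantiles."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample draws n observations from the data with replacement.
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(reps))
    alpha = 1 - confidence
    lo_idx = int((alpha / 2) * reps)          # roughly the alpha/2 quantile
    hi_idx = int((1 - alpha / 2) * reps) - 1  # roughly the 1 - alpha/2 quantile
    return stats[lo_idx], stats[hi_idx]

sample = [4.8, 5.1, 5.4, 4.9, 5.3, 5.0, 5.2, 4.7, 5.5, 5.1]  # hypothetical data
lo, hi = bootstrap_percentile_ci(sample)
print(f"95% bootstrap percentile CI for the mean: ({lo:.3f}, {hi:.3f})")
```

Passing a different `stat` (for example `statistics.median`) reuses the same procedure for statistics that have no simple standard-error formula.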