Confidence interval formula
The confidence interval formula expresses an interval estimate for an unknown population parameter by combining a point estimate with a margin of error: \( \text{estimate} \pm \text{(critical value)} \cdot \text{(standard error)} \). The critical value comes from a sampling distribution (commonly the standard normal \(z\) or Student’s \(t\)), and the standard error quantifies sampling variability.
General template. \[ \text{Confidence interval} = \text{point estimate} \pm E, \qquad E = \text{critical value} \cdot \text{standard error}. \]
The template assumes independent random sampling and a valid model for the sampling distribution: exact normality, a large-sample approximation, or a justified \(t\)-model.
Symbols and interpretation
- Confidence level: \(1 - \alpha\), where \(\alpha\) is the total tail probability outside the central region used to set the critical value.
- Critical values: \(z_{\alpha/2}\) satisfies \(P(Z \le z_{\alpha/2}) = 1 - \alpha/2\) for \(Z \sim N(0,1)\). \(t_{\alpha/2,\nu}\) satisfies \(P(T \le t_{\alpha/2,\nu}) = 1 - \alpha/2\) for \(T \sim t_{\nu}\) with degrees of freedom \(\nu\).
- Standard error: a standard deviation for the estimator (e.g., \(\sigma/\sqrt{n}\), \(s/\sqrt{n}\), or \(\sqrt{\hat{p}(1-\hat{p})/n}\)).
- Margin of error: \(E\), the half-width of the interval, equals (critical value) \(\cdot\) (standard error).
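The definition of \(z_{\alpha/2}\) above translates directly into an inverse-CDF lookup. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is an illustrative choice, not part of any library. The standard library has no quantile function for Student's \(t\), so \(t_{\alpha/2,\nu}\) would have to come from a table or a third-party package (for example, `scipy.stats.t.ppf`).

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return z_{alpha/2}, the point with P(Z <= z) = 1 - alpha/2 for Z ~ N(0, 1)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Critical values for three commonly used confidence levels.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```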
Common confidence interval formulas
| Parameter | Point estimate | Standard error | Critical value | Confidence interval formula | Typical conditions |
|---|---|---|---|---|---|
| Population mean \(\mu\) (σ known) | \(\bar{x}\) | \(\sigma/\sqrt{n}\) | \(z_{\alpha/2}\) | \(\bar{x} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}\) | Random sample; population normal or \(n\) large; \(\sigma\) known. |
| Population mean \(\mu\) (σ unknown) | \(\bar{x}\) | \(s/\sqrt{n}\) | \(t_{\alpha/2,\,n-1}\) | \(\bar{x} \pm t_{\alpha/2,\,n-1} \cdot \dfrac{s}{\sqrt{n}}\) | Random sample; population approximately normal (especially for small \(n\)); \(s\) estimates \(\sigma\). |
| Population proportion \(p\) | \(\hat{p}\) | \(\sqrt{\hat{p}(1-\hat{p})/n}\) | \(z_{\alpha/2}\) | \(\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\) | Independent trials; large-sample normal approximation (e.g., \(n\hat{p}\) and \(n(1-\hat{p})\) sufficiently large). |
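As a rough illustration of two rows of the table, the sketch below computes a large-sample interval for a proportion and a \(t\)-interval for a mean. The function names, the counts (120 successes in 400 trials), and the ten measurements are hypothetical values chosen for the example. The \(t\) critical value is passed in by the caller; \(t_{0.025,\,9} \approx 2.262\) is the usual table value for a 95% interval with 9 degrees of freedom.

```python
import math
from statistics import NormalDist, mean, stdev


def proportion_interval(successes, n, confidence=0.95):
    """Large-sample z-interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def mean_t_interval(data, t_crit):
    """t-interval for a mean: x_bar +/- t_crit * s / sqrt(n), with df = n - 1."""
    n = len(data)
    x_bar = mean(data)
    se = stdev(data) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return x_bar - t_crit * se, x_bar + t_crit * se


# Hypothetical data, for illustration only.
p_low, p_high = proportion_interval(successes=120, n=400)
print(f"proportion: ({p_low:.4f}, {p_high:.4f})")

measurements = [9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.4, 9.6, 10.0]
m_low, m_high = mean_t_interval(measurements, t_crit=2.262)  # t_{0.025, 9}
print(f"mean: ({m_low:.3f}, {m_high:.3f})")
```

Passing the \(t\) critical value explicitly keeps the sketch free of third-party dependencies; with a statistics package available, it could instead be computed from the degrees of freedom.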
Figure: Visualization of the critical value in the confidence interval formula.
Margin of error and interval width
A two-sided interval has half-width \(E\), so the full width equals \(2E\). Increasing the confidence level \(1-\alpha\) increases the critical value and widens the interval. Increasing the sample size \(n\) decreases the standard error (often proportional to \(1/\sqrt{n}\)) and narrows the interval. Under \(1/\sqrt{n}\) scaling, halving the margin of error requires roughly four times as many observations.
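The \(1/\sqrt{n}\) scaling can be checked numerically. The sketch below assumes a known-\(\sigma\) interval for a mean with illustrative values \(\sigma = 3.0\) and \(z_{\alpha/2} \approx 1.96\); each fourfold increase in \(n\) halves the margin of error.

```python
import math

Z_95 = 1.96   # approximate 95% critical value
SIGMA = 3.0   # assumed known population standard deviation (illustrative)


def margin(n: int) -> float:
    """Margin of error E = z * sigma / sqrt(n) for a known-sigma mean interval."""
    return Z_95 * SIGMA / math.sqrt(n)


for n in (25, 100, 400):
    print(f"n = {n:3d}: E = {margin(n):.3f}, width = {2 * margin(n):.3f}")
# n =  25: E = 1.176, width = 2.352
# n = 100: E = 0.588, width = 1.176
# n = 400: E = 0.294, width = 0.588
```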
Sample size planning from the confidence interval formula
When a target margin of error \(E\) is specified in advance, setting \(E\) equal to (critical value) \(\cdot\) (standard error) and solving for \(n\) yields the following planning formulas.
| Goal | Planning relation | Notes |
|---|---|---|
| Mean \(\mu\) (σ known) | \( n = \left(\dfrac{z_{\alpha/2} \cdot \sigma}{E}\right)^2 \) | \(n\) is rounded up to the next integer; valid when a reliable \(\sigma\) is available. |
| Proportion \(p\) | \( n = \dfrac{z_{\alpha/2}^{2} \cdot p^{\ast}(1-p^{\ast})}{E^{2}} \) | \(p^{\ast}\) is a planning value; the conservative choice \(p^{\ast}=0.5\) maximizes \(p^{\ast}(1-p^{\ast})\) and yields the largest required \(n\). |
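The two planning relations can be sketched in a few lines of Python. The function names and the input values (\(\sigma = 3.0\), \(E = 0.5\) for the mean; \(E = 0.03\) for the proportion) are assumptions made for illustration. Rounding up with `math.ceil` ensures the achieved margin does not exceed the target.

```python
import math
from statistics import NormalDist


def _z(confidence: float) -> float:
    """Two-sided standard normal critical value z_{alpha/2}."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_mean(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """n = (z * sigma / E)^2, rounded up to the next integer."""
    return math.ceil((_z(confidence) * sigma / margin) ** 2)


def sample_size_proportion(margin: float, confidence: float = 0.95,
                           p_star: float = 0.5) -> int:
    """n = z^2 * p*(1 - p*) / E^2, rounded up; p* = 0.5 is the conservative choice."""
    z = _z(confidence)
    return math.ceil(z ** 2 * p_star * (1 - p_star) / margin ** 2)


print(sample_size_mean(sigma=3.0, margin=0.5))   # 139
print(sample_size_proportion(margin=0.03))       # 1068
```

A planning value \(p^{\ast}\) away from 0.5 gives a smaller requirement; for instance, `sample_size_proportion(0.03, p_star=0.2)` needs fewer observations than the conservative default.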
Numerical example using the confidence interval formula
A random sample of size \(n = 36\) yields \(\bar{x} = 12.4\), and the population standard deviation \(\sigma = 3.0\) is known. A 95% confidence level corresponds to \(\alpha = 0.05\) and \(z_{\alpha/2} \approx 1.96\).
\[ \text{SE} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3.0}{\sqrt{36}} = \dfrac{3.0}{6} = 0.5 \]
\[ E = z_{\alpha/2} \cdot \text{SE} = 1.96 \cdot 0.5 = 0.98 \]
\[ \bar{x} \pm E = 12.4 \pm 0.98 \;\;\Rightarrow\;\; (11.42,\; 13.38) \]
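The same arithmetic can be reproduced programmatically. This short sketch uses the values from the example and takes the critical value from the standard library's normal distribution rather than the rounded 1.96; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, sigma, n = 12.4, 3.0, 36
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}, about 1.96
se = sigma / math.sqrt(n)                           # standard error
E = z * se                                          # margin of error
lower, upper = x_bar - E, x_bar + E

print(f"SE = {se:.2f}, E = {E:.2f}, CI = ({lower:.2f}, {upper:.2f})")
# SE = 0.50, E = 0.98, CI = (11.42, 13.38)
```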
Common pitfalls
A confidence interval estimates a population parameter (such as \(\mu\) or \(p\)), not an individual future observation. A 95% confidence level describes long-run performance under repeated sampling, not a 95% probability that a fixed parameter lies in a single computed interval. Using \(z\) in place of \(t\) for a mean when \(\sigma\) is unknown and \(n\) is small produces intervals that are systematically too narrow, because \(z_{\alpha/2} < t_{\alpha/2,\,n-1}\) and the \(z\) value ignores the extra uncertainty from estimating \(\sigma\) with \(s\).
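The long-run interpretation can be illustrated with a small simulation. The sketch below assumes a normal population with known \(\sigma\); the parameter values, the number of trials, and the function name `coverage` are arbitrary choices for the illustration. It draws many samples, builds a 95% interval from each, and reports the fraction of intervals that contain the true mean, which should come out close to 0.95.

```python
import math
import random
from statistics import NormalDist


def coverage(mu=12.4, sigma=3.0, n=36, confidence=0.95, trials=2000, seed=0):
    """Fraction of simulated known-sigma z-intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    E = z * sigma / math.sqrt(n)  # same margin for every sample (sigma known)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - E <= mu <= x_bar + E:
            hits += 1
    return hits / trials


print(f"empirical coverage: {coverage():.3f}")
```

Each individual interval either contains \(\mu\) or does not; the 95% figure describes the procedure across repetitions, not any single interval.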
Summary
The confidence interval formulas above share the common structure (estimate) \(\pm\) (critical value) \(\cdot\) (standard error). The substantive statistical work lies in selecting the correct standard error model and the correct critical value distribution so that the stated confidence level is justified by the sampling conditions.