Sample Size for Power Calculator


Written by STEM Calculators Team • Published February 14, 2026 • Updated February 24, 2026

Plan a study by estimating the minimum sample size needed to reach a target power \((1-\beta)\) for one-sample or two-sample mean tests, using Cohen’s \(d\), \(\alpha\), and tail choice.

Effect size (Cohen’s d): \(d=\dfrac{|\mu_1-\mu_0|}{\sigma}\) (one-sample) or \(d=\dfrac{|\mu_1-\mu_2|}{\sigma}\) (two-sample, pooled \(\sigma\)).
This tool uses standard planning approximations and reports achieved power at the returned integer \(n\).


Power curve vs. sample size (animated)

The plot shows achieved power as \(n\) increases. The vertical line marks the smallest \(n\) reaching the target.


Sample size for power (hypothesis tests)

In study planning, you often want enough data to reliably detect a meaningful effect. A power analysis answers: “What sample size \(n\) do I need so my test detects an effect of size \(d\) with probability (power) at least \(1-\beta\), while controlling false positives at level \(\alpha\)?”

1) Key concepts: \(\alpha\), \(\beta\), and power

\(\alpha\) (Type I error): probability of rejecting \(H_0\) when \(H_0\) is true (false positive).
\(\beta\) (Type II error): probability of failing to reject \(H_0\) when the alternative is true (false negative).
Power: \(1-\beta\), the probability the test detects the effect (rejects \(H_0\)) when the effect is present.

\[ \text{Power} = P(\text{Reject }H_0 \mid H_1\text{ true}) = 1-\beta. \]

2) Effect size for means: Cohen’s \(d\)

For mean-based tests, a common standardized effect size is Cohen’s \(d\), which measures the mean difference in standard deviation units:

\[ \text{One-sample:}\quad d=\frac{|\mu_1-\mu_0|}{\sigma}, \qquad \text{Two-sample:}\quad d=\frac{|\mu_1-\mu_2|}{\sigma}. \]

Interpreting \(d\) depends on context, but a common rule of thumb is: small \(\approx 0.2\), medium \(\approx 0.5\), large \(\approx 0.8\).
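As a small illustration of the definition above, the sketch below computes \(d\) from planning values. The function name `cohens_d` and the numbers (a 5-point shift with \(\sigma=10\)) are hypothetical, chosen only for the example.

```python
def cohens_d(mean_alt: float, mean_ref: float, sigma: float) -> float:
    """Standardized mean difference |mean_alt - mean_ref| / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return abs(mean_alt - mean_ref) / sigma


# Hypothetical planning values: detect a 5-point shift when sigma is 10.
print(cohens_d(105.0, 100.0, 10.0))  # 0.5
```

The same function covers both designs: pass \(\mu_0\) as the reference for a one-sample test, or the second group mean (with the pooled \(\sigma\)) for a two-sample test.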

3) How sample size affects power

Increasing \(n\) reduces the standard error, so the distribution of the test statistic under the alternative moves further away from its distribution under \(H_0\). In planning, we often use a normal approximation in which the test statistic under \(H_1\) behaves like:

\[ T \approx N(\delta,1), \]

where \(\delta\) is the noncentrality (mean shift in standard-error units). For equal-variance designs:

\[ \delta(n)= \begin{cases} d\sqrt{n}, & \text{one-sample mean}\\[6pt] d\sqrt{\dfrac{n}{2}}, & \text{two-sample means (equal }n\text{ per group)} \end{cases} \]
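To see how quickly \(\delta(n)\) grows, here is a minimal sketch of the two cases. The helper name `noncentrality` and its `two_sample` flag are assumptions made for this example.

```python
import math


def noncentrality(d: float, n: int, two_sample: bool) -> float:
    """delta(n) = d*sqrt(n) (one-sample) or d*sqrt(n/2) (two-sample, n per group)."""
    return d * math.sqrt(n / 2 if two_sample else n)


# Doubling n multiplies delta by sqrt(2), not by 2.
for n in (16, 32, 64, 128):
    print(n, round(noncentrality(0.5, n, two_sample=True), 3))
```

Because \(\delta\) grows with \(\sqrt{n}\), halving the detectable effect size requires roughly four times the sample.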

4) Critical values: one-sided vs two-sided

The significance level \(\alpha\) determines a rejection threshold \(c\):

\[ c= \begin{cases} z_{1-\alpha/2}, & \text{two-sided test}\\[6pt] z_{1-\alpha}, & \text{one-sided test} \end{cases} \]

Here \(z_q\) denotes the \(q\)-quantile of the standard normal distribution. For a t-test, the critical value is instead a \(t\)-quantile, with degrees of freedom \(df=n-1\) (one-sample) or \(df=2n-2\) (two-sample, equal group sizes).
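The normal critical values can be obtained from Python's standard library, as in the sketch below. The function name `critical_value` is an assumption for this example, and only the \(z\)-based threshold is shown; the standard library has no \(t\)-quantile function.

```python
from statistics import NormalDist


def critical_value(alpha: float, two_sided: bool = True) -> float:
    """Normal-approximation rejection threshold c for significance level alpha."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


print(round(critical_value(0.05, two_sided=True), 3))   # 1.96
print(round(critical_value(0.05, two_sided=False), 3))  # 1.645
```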

5) Power formulas (normal approximation)

Using \(T \approx N(\delta,1)\), power can be written in closed form using the standard normal CDF \(\Phi(\cdot)\).

Two-sided

\[ \mathrm{Power}(n) =P(|N(\delta,1)|>c) =\bigl(1-\Phi(c-\delta)\bigr)+\Phi(-c-\delta). \]

One-sided (right-tailed)

\[ \mathrm{Power}(n)=P(N(\delta,1)>c)=1-\Phi(c-\delta). \]

One-sided (left-tailed)

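\[ \mathrm{Power}(n)=P(N(-\delta,1)<-c)=\Phi(\delta-c)=1-\Phi(c-\delta). \]

Because \(d\) is defined with an absolute value, \(\delta\ge 0\). In a left-tailed test the true shift is negative, so the statistic is centered at \(-\delta\), and by symmetry the power equals the right-tailed value.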

In practice, two-sided tests are common unless you have a strong reason to justify a one-sided direction before data collection.
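The formulas in this section can be sketched in a few lines. The function `power_normal` and its parameter names are assumptions for illustration, and the one-sided branch assumes the test direction matches the direction of the true effect.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal: Z.cdf is Phi, Z.inv_cdf gives quantiles


def power_normal(n: int, d: float, alpha: float = 0.05,
                 two_sided: bool = True, two_sample: bool = True) -> float:
    """Normal-approximation power for a one- or two-sample mean test."""
    delta = d * math.sqrt(n / 2 if two_sample else n)
    c = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    upper = 1 - Z.cdf(c - delta)          # rejections in the direction of the effect
    if not two_sided:
        return upper
    return upper + Z.cdf(-c - delta)      # plus the (tiny) opposite-tail rejections


for n in (62, 63, 64):
    print(n, round(power_normal(n, d=0.5), 4))
```

For \(d=0.5\), \(\alpha=0.05\), two-sided, the printed power crosses 0.80 between \(n=62\) and \(n=63\) per group under this approximation.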

6) Solving for the required \(n\)

A classic planning shortcut uses the idea that you need \(\delta\) to exceed the critical value by an amount linked to the target power:

\[ \delta \approx z_{1-\alpha^{*}}+z_{1-\beta}, \qquad \alpha^{*}= \begin{cases} \alpha/2, & \text{two-sided}\\ \alpha, & \text{one-sided} \end{cases} \]

Since \(1-\beta\) is the target power, we have \(z_{1-\beta}=z_{\text{power}}\). Substituting \(\delta(n)\) gives a closed-form estimate for \(n\):

\[ n \approx \begin{cases} \left(\dfrac{z_{1-\alpha^{*}}+z_{\text{power}}}{d}\right)^2, & \text{one-sample}\\[10pt] 2\left(\dfrac{z_{1-\alpha^{*}}+z_{\text{power}}}{d}\right)^2, & \text{two-sample (per group)} \end{cases} \]

Because real studies need an integer \(n\), calculators typically round up and then re-check the achieved power. This tool does exactly that: it finds the smallest integer \(n\) (up to a search limit) such that power\(\ge\)target.
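A minimal sketch of both steps, the closed-form estimate and the integer search, is shown below. It is one way such a search could look under the normal approximation, not the tool's actual implementation; `closed_form_n`, `smallest_n`, and `max_n` are hypothetical names.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def power_normal(n, d, alpha=0.05, two_sided=True, two_sample=True):
    """Normal-approximation power (one-sided case assumes the correct direction)."""
    delta = d * math.sqrt(n / 2 if two_sample else n)
    c = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    upper = 1 - Z.cdf(c - delta)
    return upper + Z.cdf(-c - delta) if two_sided else upper


def closed_form_n(d, alpha=0.05, power=0.80, two_sided=True, two_sample=True):
    """Real-valued planning estimate of n (per group for two-sample designs)."""
    z_sum = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha)) + Z.inv_cdf(power)
    n = (z_sum / d) ** 2
    return 2 * n if two_sample else n


def smallest_n(d, alpha=0.05, power=0.80, two_sided=True, two_sample=True,
               max_n=100_000):
    """Smallest integer n with approximate power >= target, or None if none <= max_n."""
    for n in range(2, max_n + 1):
        if power_normal(n, d, alpha, two_sided, two_sample) >= power:
            return n
    return None


print(round(closed_form_n(0.5), 2))       # about 62.79
print(smallest_n(0.5))                    # 63 per group (two-sample)
print(smallest_n(0.5, two_sample=False))  # 32 (one-sample)
```

A linear scan is used here for clarity; since power increases with \(n\), a bisection search would find the same answer in fewer steps.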

7) Worked example (the 64-per-group rule-of-thumb)

Suppose you plan a two-sample comparison of means with: \(\alpha=0.05\), target power \(=0.80\), and \(d=0.5\) (a medium effect).

\[ z_{1-\alpha/2}=z_{0.975}\approx 1.96,\qquad z_{\text{power}}=z_{0.80}\approx 0.84. \]

\[ n \approx 2\left(\frac{1.96+0.84}{0.5}\right)^2 =2\left(\frac{2.80}{0.5}\right)^2 =2(5.6)^2 =2(31.36) \approx 62.7 \;\Rightarrow\; n=63 \text{ per group after rounding up.} \]

The familiar figure of 64 per group comes from the exact t-test calculation (noncentral \(t\)), which needs slightly more data than the normal approximation. Different conventions (continuity corrections, exact power, t vs z) can shift the result by a unit or so, which is why the calculator re-checks the achieved power at the rounded integer \(n\).
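One way to sanity-check a planned \(n\) is a quick Monte Carlo simulation. The sketch below assumes a known \(\sigma=1\) and a two-sided z-test (not a t-test), with a fixed seed for reproducibility; `simulated_power`, `reps`, and `seed` are names invented for this example.

```python
import math
import random
from statistics import NormalDist, fmean


def simulated_power(n: int, d: float, alpha: float = 0.05,
                    reps: int = 5000, seed: int = 1) -> float:
    """Fraction of simulated two-sample z-tests (sigma = 1) that reject H0."""
    rng = random.Random(seed)
    c = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(2 / n)  # standard error of the difference in means when sigma = 1
    rejections = 0
    for _ in range(reps):
        mean_a = fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        mean_b = fmean(rng.gauss(d, 1.0) for _ in range(n))
        if abs((mean_b - mean_a) / se) > c:
            rejections += 1
    return rejections / reps


# With n = 63 per group and d = 0.5 the estimate should land near 0.80.
print(simulated_power(63, 0.5))
```

The estimate carries simulation noise of roughly \(\pm 0.01\) at 5000 replications, so it confirms the ballpark rather than the exact integer.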

8) Practical notes and limitations

Assumptions: independent observations, approximately normal errors (or large \(n\) via CLT), and equal variance for two-sample pooling.
Dropout / missing data: increase planned \(n\) to compensate (e.g., \(n_{\text{planned}} \approx n/(1-\text{dropout rate})\)).
Multiple testing: if you test many endpoints, a multiplicity correction (e.g., Bonferroni) lowers the per-test \(\alpha\), which increases the required \(n\).
t-test power: exact t-test power uses a noncentral t distribution; many planning tools use a close normal approximation (especially for moderate/large \(n\)).
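The dropout adjustment in the notes above can be sketched as follows. The function name `inflate_for_dropout` and the 15% rate are illustrative assumptions.

```python
import math


def inflate_for_dropout(n_required: int, dropout_rate: float) -> int:
    """Planned enrollment so that about n_required participants remain."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_required / (1 - dropout_rate))


# Hypothetical: 64 completers needed per group, 15% expected dropout.
print(inflate_for_dropout(64, 0.15))  # 76
```

Dividing by \(1-\text{dropout rate}\), rather than multiplying by \(1+\text{dropout rate}\), is what keeps the expected number of completers at or above the required \(n\).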

9) University extensions

Beyond mean tests, power analysis extends to:

Proportions (risk differences, odds ratios) using normal/score/Wald approximations.
ANOVA/regression, with effect sizes such as \(R^2\) or \(f^2\) and power computed from noncentral F distributions.
Sequential designs and group-sequential testing (alpha spending).
