
What Does R Squared Mean in Statistics (Coefficient of Determination)

What does R squared mean in statistics, and how is R² interpreted in a simple linear regression model?


What does R² mean in statistics?

In statistics, R² is the coefficient of determination, a summary of how much variation in a response variable is explained by a regression model relative to an intercept-only baseline.

Core meaning in regression output

In simple linear regression (and more generally, least-squares regression with an intercept), R² is interpreted as the proportion of total variability in the observed response values that is accounted for by the fitted model. Values closer to 1 indicate the model explains a larger share of the variation; values closer to 0 indicate the model explains little beyond the mean of the response.

R² is commonly reported as a percentage. For example, R² = 0.83 corresponds to about 83% of the variation in the response being explained by the model, with the remaining variation attributed to residual error and unmodeled factors.

Variance decomposition definition

Let \(y_1,\dots,y_n\) be observed responses, \(\hat{y}_1,\dots,\hat{y}_n\) the fitted (predicted) responses from a least-squares line, and \(\bar{y}\) the sample mean: \[ \bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i. \] Three sums of squares formalize “total variation,” “explained variation,” and “unexplained variation”: \[ \mathrm{SST}=\sum_{i=1}^{n}(y_i-\bar{y})^2,\qquad \mathrm{SSR}=\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2,\qquad \mathrm{SSE}=\sum_{i=1}^{n}(y_i-\hat{y}_i)^2. \] With an intercept in the model, the decomposition \[ \mathrm{SST}=\mathrm{SSR}+\mathrm{SSE} \] holds, and the coefficient of determination is \[ R^2=\frac{\mathrm{SSR}}{\mathrm{SST}}=1-\frac{\mathrm{SSE}}{\mathrm{SST}}. \]
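The decomposition can be checked numerically. The following is a minimal sketch in plain Python, not part of the original answer: the function name `sums_of_squares` and the five data points are hypothetical, chosen only for illustration. It fits a least-squares line with an intercept and confirms that SST equals SSR plus SSE, so both forms of R² agree.

```python
def sums_of_squares(x, y):
    """Fit a least-squares line with intercept and return (SST, SSR, SSE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Closed-form least-squares slope and intercept for one predictor.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    return sst, ssr, sse


# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]

sst, ssr, sse = sums_of_squares(x, y)
r_squared = 1 - sse / sst
print(f"SST={sst:.4f}  SSR={ssr:.4f}  SSE={sse:.4f}  R^2={r_squared:.4f}")
```

For these points the sketch gives SST = 10, SSR = 6.4, and SSE = 3.6, so R² = 0.64 whether it is computed as SSR/SST or as 1 − SSE/SST.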

Connection to correlation in simple linear regression

In simple linear regression with an intercept, R² equals the square of the Pearson correlation coefficient \(r\) between \(x\) and \(y\): \[ R^2=r^2. \] The sign of the linear association is carried by \(r\) (positive or negative slope), while R² is nonnegative and measures strength of fit in the least-squares sense.
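As a quick numerical check of this identity, the sketch below computes \(r\) directly from its definition and R² from the residuals of the fitted line, then compares the two. The data are hypothetical and chosen to have a negative slope, so that the difference in sign between \(r\) and R² is visible.

```python
import math

# Hypothetical data with a decreasing trend, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.0, 7.0, 8.0, 4.0, 2.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # same quantity as SST
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson correlation coefficient from its definition.
r = s_xy / math.sqrt(s_xx * s_yy)

# R^2 from the residuals of the least-squares line with intercept.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / s_yy

print(f"r = {r:.4f}, r^2 = {r * r:.4f}, R^2 = {r_squared:.4f}")
```

Here \(r\) is negative (about −0.92) while R² is 0.85, matching \(r^2\) up to floating-point rounding.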

Interpretation boundaries and common pitfalls

| Interpretation statement | Statistical meaning | Clarifying note |
| --- | --- | --- |
| “R² is the percent of variability explained.” | R² is a proportion of \(\mathrm{SST}\) explained by the model’s fitted values. | The interpretation assumes a model with an intercept and depends on the chosen response variable; it is not a statement about individual prediction errors. |
| “High R² means accurate predictions.” | High R² indicates a strong reduction in squared error versus the mean-only baseline. | Large errors can still occur for particular cases; prediction performance is better assessed with residual plots, RMSE, and validation. |
| “Low R² means no relationship.” | Low R² indicates the linear model explains little of the response variation. | Nonlinear patterns, restricted ranges of \(x\), or high measurement noise can produce low R² even when a meaningful relationship exists. |
| “R² implies causation.” | R² is descriptive of fit, not causal structure. | Causal claims require study design, controls, and assumptions beyond regression output. |
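The “low R² means no relationship” pitfall can be illustrated with a small sketch. The helper `r_squared_linear` and the data are hypothetical: \(y\) is set to exactly \(x^2\) on a symmetric range of \(x\), so \(y\) is fully determined by \(x\), yet a straight-line fit explains none of the variation.

```python
def r_squared_linear(x, y):
    """R^2 of a least-squares straight line (with intercept) fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


# Hypothetical data: a perfect but nonlinear (quadratic) relationship.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]

r2 = r_squared_linear(x, y)
print(f"R^2 of the straight-line fit: {r2:.4f}")
```

The fitted slope is zero because the positive and negative halves of the parabola cancel, so the line reduces to the mean of \(y\) and R² is 0 even though the relationship is exact.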

Negative R² values, computed as \(1-\mathrm{SSE}/\mathrm{SST}\), can appear when a model is fitted without an intercept or when predictions are assessed out of sample; in such settings the fitted model can perform worse (in squared error) than predicting the mean of the observed responses.
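A minimal sketch of the no-intercept case, using hypothetical data: the responses sit well above zero, so a line forced through the origin fits worse than the plain mean, and \(1-\mathrm{SSE}/\mathrm{SST}\) drops below zero.

```python
# Hypothetical data whose responses are far from the origin.
x = [1.0, 2.0, 3.0]
y = [10.0, 11.0, 12.0]

# Least-squares slope for a line forced through the origin (no intercept).
slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
y_hat = [slope * xi for xi in x]

y_bar = sum(y) / len(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)

# SSE exceeds SST here, so this form of R^2 is negative.
r_squared = 1 - sse / sst
print(f"slope = {slope:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}, R^2 = {r_squared:.4f}")
```

In this setting the identity \(\mathrm{SST}=\mathrm{SSR}+\mathrm{SSE}\) no longer holds, which is why the two formulas for R² can disagree and why the “proportion explained” reading breaks down.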

Worked example (SST, SSE, SSR, and R²)

Consider four paired observations \((x_i,y_i)\): \((1,2)\), \((2,2)\), \((3,3)\), \((4,5)\). The least-squares fitted line (with intercept) is \(\hat{y}=0.5+1.0x\), producing fitted values \(\hat{y}_i\). The mean response is \(\bar{y}=(2+2+3+5)/4=3\).

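| \(i\) | \(x_i\) | \(y_i\) | \(\hat{y}_i\) | \(y_i-\bar{y}\) | \((y_i-\bar{y})^2\) | \(y_i-\hat{y}_i\) | \((y_i-\hat{y}_i)^2\) | \(\hat{y}_i-\bar{y}\) | \((\hat{y}_i-\bar{y})^2\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | 2 | 1.5 | \(-1\) | 1 | 0.5 | 0.25 | \(-1.5\) | 2.25 |
| 2 | 2 | 2 | 2.5 | \(-1\) | 1 | \(-0.5\) | 0.25 | \(-0.5\) | 0.25 |
| 3 | 3 | 3 | 3.5 | 0 | 0 | \(-0.5\) | 0.25 | 0.5 | 0.25 |
| 4 | 4 | 5 | 4.5 | 2 | 4 | 0.5 | 0.25 | 1.5 | 2.25 |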

Summing the squared columns yields \[ \mathrm{SST}=1+1+0+4=6,\qquad \mathrm{SSE}=0.25+0.25+0.25+0.25=1,\qquad \mathrm{SSR}=2.25+0.25+0.25+2.25=5. \] The decomposition \(\mathrm{SST}=\mathrm{SSR}+\mathrm{SSE}\) holds since \(6=5+1\). Therefore, \[ R^2=1-\frac{\mathrm{SSE}}{\mathrm{SST}}=1-\frac{1}{6}=\frac{5}{6}\approx 0.8333. \] About 83.33% of the total variation in \(y\) (around \(\bar{y}=3\)) is explained by the fitted line.
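The arithmetic in this worked example can be reproduced with a short sketch in plain Python. The data are the four observations from the example; the variable names are illustrative choices rather than part of the original answer.

```python
# The four observations from the worked example.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 2.0, 3.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares line with intercept.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

# One row per observation: total, residual, and explained deviations.
for i, (xi, yi, fi) in enumerate(zip(x, y, y_hat), start=1):
    print(i, xi, yi, fi, yi - y_bar, yi - fi, fi - y_bar)

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
r_squared = 1 - sse / sst

print(f"fit: y_hat = {intercept} + {slope} x")
print(f"SST = {sst}, SSE = {sse}, SSR = {ssr}, R^2 = {r_squared:.4f}")
```

Running it recovers the line \(\hat{y}=0.5+1.0x\) and the totals SST = 6, SSE = 1, SSR = 5, giving R² ≈ 0.8333.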

Visualization of explained vs unexplained variation

[Figure: explained variation (SSR) and residual variation (SSE). Left panel, "Explained vs residual variation on the same data": scatter plot of the four data points (x from 1 to 4, y from 1 to 5) with the fitted line ŷ = 0.5 + 1.0x and a horizontal mean line at ȳ = 3. For each x, a green segment shows the distance from the mean to the fitted value (explained), and an orange segment shows the residual from the fitted value to the observed point (unexplained). At x = 4: total (y − ȳ) = explained (ŷ − ȳ) + residual (y − ŷ), that is, 2 = 1.5 + 0.5. Right panel, "SST split into SSR and SSE": stacked bar showing SST = 6 split into SSR = 5 (green) and SSE = 1 (orange), with R² = SSR / SST = 5 / 6 ≈ 0.8333.]
Green segments represent variation explained by the fitted line (SSR), orange segments represent residual variation (SSE), and the stacked bar summarizes the identity \(\mathrm{SST}=\mathrm{SSR}+\mathrm{SSE}\) with \(R^2=\mathrm{SSR}/\mathrm{SST}\).