Principal Component Analysis (PCA)

Written by STEM Calculators Team

Compute PCA from a data matrix \(X\) (rows = samples, columns = features). The tool centers (and optionally standardizes) the data, builds the covariance matrix, finds eigenvalues/eigenvectors (principal components), reports variance explained, and projects to 2D/3D.

Inputs: samples (rows), features (columns), preprocessing, tolerance, display decimals, and the data matrix \(X\).

Enter numeric values. The tool applies the chosen preprocessing, then computes the eigenvalues of the resulting covariance matrix.

Inputs accept decimals (-3.5), scientific notation (2e-4), fractions (7/3), and the constants pi and e.

Results

Eigenvalues
Explained variance (PC1)
Explained variance (PC1+PC2)
Notes
Means (and std if z-score)
Covariance / correlation matrix
Principal components (loadings) \(P\) (columns = PCs)
Projected data \(Z=X_{\text{prep}}P\)
Variance explained table

Graph

Interactive plot with tick values. Choose a view, zoom with wheel/trackpad, drag to pan, double-click to reset. Play animates drawing the scatter and PC arrows.

Step-by-step

Enter data and click “Calculate”.

Theory

What PCA does

Principal Component Analysis (PCA) re-expresses a dataset with many correlated features in a new coordinate system whose axes are orthogonal and ordered by how much variance they explain. If \(X\in\mathbb{R}^{m\times d}\) is the data matrix (rows are samples, columns are features), PCA finds unit directions \(p_1,p_2,\dots,p_d\) (the principal components) such that:

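\(p_1\) points in the direction of maximum variance of the data,
\(p_2\) is orthogonal to \(p_1\) and has the largest variance among all such directions,
each later \(p_i\) is orthogonal to all previous components and has the largest remaining variance.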

By projecting onto the first \(k\) components (often \(k=2\) or \(k=3\)), you get a lower-dimensional representation that retains as much variance as any \(k\)-dimensional orthogonal projection of the centered data can. This is why PCA is used for visualization, compression, and noise reduction.

Centering, standardizing, and covariance

PCA is usually performed on centered data. Let \(\mu\in\mathbb{R}^d\) be the column means: \(\mu_j=\frac{1}{m}\sum_{i=1}^m X_{ij}\). The centered matrix is \[ X_c = X - \mathbf{1}\mu^{\mathsf T}. \] The sample covariance matrix is then \[ C = \frac{1}{m-1}X_c^{\mathsf T}X_c, \] which is symmetric and positive semidefinite.
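The centering and covariance formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the calculator's own implementation; the function name `center_and_covariance` and the small dataset `X_demo` are made up for the example.

```python
import numpy as np

def center_and_covariance(X):
    """Return column means, centered data, and sample covariance (rows = samples)."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    if m < 2:
        raise ValueError("need at least two samples for the 1/(m-1) covariance")
    mu = X.mean(axis=0)          # column means, shape (d,)
    Xc = X - mu                  # broadcasting subtracts mu from every row
    C = (Xc.T @ Xc) / (m - 1)    # sample covariance, shape (d, d)
    return mu, Xc, C

# Illustrative data (hypothetical): 4 samples, 2 features.
X_demo = np.array([[2.0, 0.0],
                   [0.0, 2.0],
                   [4.0, 4.0],
                   [2.0, 2.0]])
mu_demo, Xc_demo, C_demo = center_and_covariance(X_demo)
# mu_demo is [2, 2]; C_demo is [[8/3, 4/3], [4/3, 8/3]]
```

The result agrees with `np.cov(X_demo, rowvar=False)`, which uses the same \(1/(m-1)\) normalization by default.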

Sometimes features are in different units (e.g., height in cm, weight in kg). In that case, a feature with a large numeric scale can dominate the covariance. A common fix is z-score standardization: \[ \tilde X_{ij}=\frac{X_{ij}-\mu_j}{\sigma_j}, \] where \(\sigma_j\) is the sample standard deviation of feature \(j\) (which must be nonzero). The covariance matrix of \(\tilde X\) is the correlation matrix of \(X\), so running PCA on \(\tilde X\) is equivalent to PCA on the correlation matrix.
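As a quick check of the claim about the correlation matrix, the sketch below z-scores a small made-up height/weight table and compares the covariance of the standardized data with the correlation matrix of the raw data. The helper name `zscore` and the numbers are assumptions chosen for illustration.

```python
import numpy as np

def zscore(X):
    """Standardize each column to zero mean and unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)    # ddof=1 gives the sample standard deviation
    if np.any(sigma == 0):
        raise ValueError("a constant feature has zero standard deviation")
    return (X - mu) / sigma

# Hypothetical measurements: height in cm, weight in kg.
X_units = np.array([[150.0, 50.0],
                    [160.0, 58.0],
                    [170.0, 63.0],
                    [180.0, 80.0],
                    [190.0, 79.0]])

X_std = zscore(X_units)
C_std = np.cov(X_std, rowvar=False)       # covariance of standardized data
R = np.corrcoef(X_units, rowvar=False)    # correlation matrix of raw data
# C_std and R agree up to rounding, and the diagonal of C_std is all ones
```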

Eigenvalues, explained variance, and components

PCA is an eigenvalue problem. Because \(C\) is symmetric and positive semidefinite, it has real, nonnegative eigenvalues \(\lambda_1\ge\lambda_2\ge\dots\ge\lambda_d\ge 0\) and an orthonormal eigenvector matrix \(P=[p_1\,p_2\,\dots\,p_d]\) such that \[ C P = P\Lambda,\qquad \Lambda=\mathrm{diag}(\lambda_1,\dots,\lambda_d). \] The eigenvectors \(p_i\) are the principal component directions in feature space.
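One way to carry out this decomposition numerically is NumPy's symmetric eigensolver `np.linalg.eigh`, which returns eigenvalues in ascending order, so they need to be reversed to match the ordering \(\lambda_1\ge\dots\ge\lambda_d\). The sketch below assumes a small hand-picked covariance matrix `C_example`; the function name `pca_eig` is hypothetical.

```python
import numpy as np

def pca_eig(C):
    """Eigen-decompose a symmetric matrix, sorted by descending eigenvalue."""
    C = np.asarray(C, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues, orthonormal columns
    order = np.argsort(eigvals)[::-1]      # indices for descending order
    return eigvals[order], eigvecs[:, order]

# Illustrative covariance matrix (made up for this example).
C_example = np.array([[4.0, 2.0],
                      [2.0, 3.0]])
lam, P = pca_eig(C_example)

# Check the defining relations C P = P Lambda and P^T P = I.
residual = C_example @ P - P @ np.diag(lam)
gram = P.T @ P
```

Each eigenvector is only determined up to sign, so two correct implementations may report a component as \(p_i\) or \(-p_i\); the variance along it is the same either way.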

The eigenvalues measure how much variance lies along each component. The explained variance ratio is \[ r_i=\frac{\lambda_i}{\sum_{j=1}^d \lambda_j}. \] If \(r_1\) is large (say 80%), then PC1 captures most of the variability, and a 1D projection already preserves much of the structure. The cumulative sum \(r_1+r_2+\cdots+r_k\) tells you how much variance is preserved by keeping the first \(k\) components.
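The ratio and its cumulative sum are simple to compute once the eigenvalues are known. The following sketch uses made-up eigenvalues 6, 3, and 1; the function name `explained_variance` is an assumption for illustration.

```python
import numpy as np

def explained_variance(eigenvalues):
    """Return per-component variance ratios r_i and their cumulative sums."""
    lam = np.asarray(eigenvalues, dtype=float)
    total = lam.sum()
    if total <= 0:
        raise ValueError("total variance is zero; ratios are undefined")
    ratios = lam / total
    return ratios, np.cumsum(ratios)

# Hypothetical eigenvalues, already sorted in descending order.
ratios, cumulative = explained_variance([6.0, 3.0, 1.0])
# ratios are approximately 0.6, 0.3, 0.1
# cumulative sums are approximately 0.6, 0.9, 1.0
```

Here the first two components together account for about 90% of the variance, so a 2D projection would discard only the remaining 10%.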

Projection to 2D/3D and interpretation

Once \(P\) is computed, the PCA scores (projected coordinates) are obtained by multiplying the preprocessed data by the component matrix: \[ Z = X_{\text{prep}}P. \] Keeping the first \(k\) columns of \(P\) gives a reduced representation: \[ Z_k = X_{\text{prep}}P_k,\qquad P_k=[p_1\,\dots\,p_k]. \] In a 2D plot of \(Z_2\), the horizontal axis is the coordinate along PC1 and the vertical axis is the coordinate along PC2. A 3D plot uses PC1–PC3.
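The projection step can be sketched as follows, assuming the preprocessing is centering only and using seeded synthetic data. The function name `pca_scores` and the mixing matrix are hypothetical. A useful sanity check is that the covariance of the scores is diagonal, with the eigenvalues on the diagonal.

```python
import numpy as np

def pca_scores(X_prep, k=None):
    """Project preprocessed (already centered) data onto its first k components."""
    X_prep = np.asarray(X_prep, dtype=float)
    m = X_prep.shape[0]
    C = (X_prep.T @ X_prep) / (m - 1)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]        # descending variance
    lam, P = eigvals[order], eigvecs[:, order]
    if k is None:
        k = P.shape[1]
    return X_prep @ P[:, :k], lam, P

# Synthetic correlated data (seeded so the example is reproducible).
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X_raw = rng.normal(size=(50, 3)) @ mixing
X_prep = X_raw - X_raw.mean(axis=0)          # preprocessing here is centering only

Z, lam, P = pca_scores(X_prep)               # all components, shape (50, 3)
Z2, _, _ = pca_scores(X_prep, k=2)           # 2D scores for plotting, shape (50, 2)
score_cov = np.cov(Z, rowvar=False)          # approximately diag(lam)
```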

In the original feature plane (feature1 vs feature2), PCA arrows show the directions of maximal spread. For data with more than two features, the PC arrows drawn in the feature1–feature2 plane are projections of those PCs onto the first two coordinates; the full PC direction lives in the full \(d\)-dimensional space.

At a more advanced level, PCA can be extended beyond linear structure (e.g., kernel PCA), and the singular value decomposition (SVD) of the centered data matrix provides a numerically stable way to compute PCA for large datasets without forming the covariance matrix explicitly.
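To illustrate the SVD route: if \(X_c = U S V^{\mathsf T}\), then the columns of \(V\) are the principal components, the eigenvalues are \(\lambda_i = s_i^2/(m-1)\), and the scores are \(U S\). The sketch below is one possible implementation using `np.linalg.svd` on seeded random data; the function name `pca_svd` is an assumption.

```python
import numpy as np

def pca_svd(X):
    """PCA via SVD of the centered data; returns eigenvalues, components, scores."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    Xc = X - X.mean(axis=0)
    # full_matrices=False gives the thin SVD; singular values come out descending.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    lam = s ** 2 / (m - 1)    # covariance eigenvalues
    P = Vt.T                  # columns are principal components
    scores = U * s            # same as Xc @ P
    return lam, P, scores

# Synthetic data (seeded), 30 samples and 4 features.
rng = np.random.default_rng(2)
X_svd = rng.normal(size=(30, 4))
lam_svd, P_svd, scores_svd = pca_svd(X_svd)

# Reference eigenvalues from the covariance matrix, sorted descending.
lam_ref = np.linalg.eigvalsh(np.cov(X_svd, rowvar=False))[::-1]
```

Both routes should give the same eigenvalues up to rounding; individual components may differ by a sign.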

Frequently Asked Questions

Why do we center the data before PCA?

Centering makes the covariance describe variability around the mean. Without centering, the first component can be dominated by the mean offset rather than spread.
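A small constructed example shows the effect. The points below sit near (100, 100) and vary only along the direction (1, -1). Without centering, the leading direction of \(X^{\mathsf T}X\) follows the mean offset (1, 1); after centering it follows the actual spread. The data and the helper `top_direction` are made up for illustration.

```python
import numpy as np

def top_direction(M):
    """Unit eigenvector of the symmetric matrix M with the largest eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(M)   # ascending order
    return eigvecs[:, -1]

# Points near (100, 100) that vary only along the direction (1, -1).
t = np.linspace(-1.0, 1.0, 21)
X_off = np.column_stack([100.0 + t, 100.0 - t])
m = X_off.shape[0]

second_moment = (X_off.T @ X_off) / (m - 1)   # no centering
Xc_off = X_off - X_off.mean(axis=0)
covariance = (Xc_off.T @ Xc_off) / (m - 1)    # with centering

v_raw = top_direction(second_moment)   # close to (1, 1)/sqrt(2), up to sign
v_cov = top_direction(covariance)      # close to (1, -1)/sqrt(2), up to sign
```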

When should I use z-score standardization?

Use it when features are measured in different units or scales. Z-scoring prevents large-scale features from dominating the covariance.

What does “explained variance” mean?

It is the fraction of total variance captured by each principal component. Large explained variance for PC1 indicates a strong dominant direction of spread.

Why might some eigenvalues be near zero?

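Near-zero eigenvalues indicate directions with little variability, often meaning the data lies close to a lower-dimensional subspace. Exactly zero eigenvalues always occur when \(m\le d\), because centered data from \(m\) samples spans at most \(m-1\) dimensions.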
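Two constructed cases illustrate this: a 3-feature dataset whose third feature is the sum of the first two, and a dataset with fewer samples than features. The sketch counts eigenvalues above a relative threshold. The helper names and the default `tol=1e-10` are assumptions for the example and are not tied to the calculator's tolerance setting.

```python
import numpy as np

def cov_eigenvalues(X):
    """Eigenvalues of the sample covariance of X, in descending order."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    C = (Xc.T @ Xc) / (X.shape[0] - 1)
    return np.linalg.eigvalsh(C)[::-1]

def count_significant(eigenvalues, tol=1e-10):
    """Count eigenvalues larger than tol times the largest eigenvalue."""
    lam = np.asarray(eigenvalues, dtype=float)
    return int(np.sum(lam > tol * lam.max()))

rng = np.random.default_rng(1)

# Case 1: the third feature is an exact linear combination of the first two.
A = rng.normal(size=(40, 2))
X_plane = np.column_stack([A[:, 0], A[:, 1], A[:, 0] + A[:, 1]])
lam_plane = cov_eigenvalues(X_plane)

# Case 2: 3 samples, 5 features, so centered data has rank at most 2.
X_few = rng.normal(size=(3, 5))
lam_few = cov_eigenvalues(X_few)

rank_plane = count_significant(lam_plane)   # 2 significant directions
rank_few = count_significant(lam_few)       # 2 significant directions
```

In floating point the "zero" eigenvalues usually come out as tiny positive or negative numbers rather than exact zeros, which is why a threshold is used instead of a comparison with 0.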
