Transitional matrices (more commonly called transition matrices) organize conditional transition probabilities for a system that moves between a finite set of states over time. The central setting is a discrete-time Markov chain \(X_0, X_1, X_2, \dots\) taking values in \(\{1,2,\dots,n\}\).
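Assume the chain is time-homogeneous, so the one-step transition probability from state \(i\) to state \(j\) does not depend on \(t\): \[ p_{ij} = P(X_{t+1}=j \mid X_t=i). \] The transitional matrix is the \(n \times n\) matrix \(P=[p_{ij}]\), with rows indexed by the current state and columns by the next state.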
Key properties of a transitional matrix
- Nonnegativity: \(p_{ij}\ge 0\) for all \(i,j\).
- Row sums equal 1 (row-stochastic convention): \[ \sum_{j=1}^n p_{ij}=1 \quad \text{for each fixed } i. \] This reflects that, given the current state \(i\), the next state must be one of \(1,\dots,n\).
- Conditional interpretation: each row is a conditional distribution of \(X_{t+1}\) given \(X_t=i\).
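The properties above can be checked mechanically. The following is a minimal sketch in plain Python, assuming the matrix is stored as a list of rows. The function name `is_row_stochastic`, the tolerance `tol`, and the two small matrices are illustrative assumptions, not part of any library or of the text above.

```python
import math

def is_row_stochastic(P, tol=1e-12):
    """Return True if P is square, entrywise nonnegative, and every row sums to 1."""
    n = len(P)
    for row in P:
        if len(row) != n:                         # must be n x n
            return False
        if any(p < 0 for p in row):               # nonnegativity
            return False
        if abs(math.fsum(row) - 1.0) > tol:       # row sum equals 1 (within tol)
            return False
    return True

# Illustrative 2-state matrices (assumed values, chosen only for this check).
valid = [[0.9, 0.1],
         [0.5, 0.5]]
invalid = [[0.9, 0.2],    # this row sums to 1.1
           [0.5, 0.5]]

print(is_row_stochastic(valid))    # True
print(is_row_stochastic(invalid))  # False
```

A small tolerance is used because row sums of floating-point entries are generally not exactly 1.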
How transitional matrices are used
Two standard tasks are (1) updating a distribution over states and (2) computing multi-step transition probabilities.
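If the row vector \(\boldsymbol{\pi}_t\) contains the state probabilities at time \(t\), \(\boldsymbol{\pi}_t = \big(P(X_t=1),\dots,P(X_t=n)\big)\), then the law of total probability, \(P(X_{t+1}=j)=\sum_{i=1}^n P(X_t=i)\cdot p_{ij}\), reads in matrix form \[ \boldsymbol{\pi}_{t+1} = \boldsymbol{\pi}_t \cdot P. \]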
The \(k\)-step transition probability is \[ P(X_{t+k}=j \mid X_t=i) = (P^k)_{ij}. \] In particular, \(P^2 = P \cdot P\) gives two-step transition probabilities.
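Both tasks can be sketched in a few lines of plain Python. This is a sketch under stated assumptions: the helper names `step_distribution`, `mat_mul`, and `mat_pow`, the 2-state matrix, and the starting distribution are all hypothetical and chosen only for illustration. The matrix power uses repeated squaring, which is one of several reasonable ways to compute \(P^k\).

```python
def step_distribution(pi, P):
    """One update pi_{t+1} = pi_t P, with pi a row vector (list) and P a list of rows."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, k):
    """P^k for an integer k >= 0, computed by repeated squaring."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0 = I
    base = P
    while k > 0:
        if k & 1:                      # this bit of k contributes a factor
            result = mat_mul(result, base)
        base = mat_mul(base, base)     # square for the next bit
        k >>= 1
    return result

# Illustrative 2-state chain (assumed values, not taken from the text).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi0 = [1.0, 0.0]                       # start in state 1 with certainty

# Task 1: push the distribution forward three steps, one step at a time.
pi3_stepwise = pi0
for _ in range(3):
    pi3_stepwise = step_distribution(pi3_stepwise, P)

# Task 2: form the three-step transition matrix, then apply it to pi0 once.
P3 = mat_pow(P, 3)
pi3_direct = step_distribution(pi0, P3)

# Both routes agree up to floating-point rounding.
print(pi3_stepwise, pi3_direct)
```

Because `pi0` puts all mass on state 1, the resulting distribution is simply the first row of `P3`.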
Worked example with a 3-state transitional matrix
Consider three states \(\{1,2,3\}\) with the transitional matrix
| From \(\backslash\) To | \(1\) | \(2\) | \(3\) |
|---|---|---|---|
| \(1\) | \(0.7\) | \(0.2\) | \(0.1\) |
| \(2\) | \(0.3\) | \(0.4\) | \(0.3\) |
| \(3\) | \(0.2\) | \(0.3\) | \(0.5\) |
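[Figure: state diagram representation of the transitional matrix, with one node per state and a directed edge from \(i\) to \(j\) labeled \(p_{ij}\) for each transition.]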
Compute a two-step probability using \(P^2\)
The two-step transition probability from state 1 to state 3 is \((P^2)_{13}\). This single entry can be computed directly by conditioning on the intermediate state \(m\) visited after the first step:
\[ (P^2)_{13} = \sum_{m=1}^3 p_{1m}\cdot p_{m3} = (0.7\cdot 0.1) + (0.2\cdot 0.3) + (0.1\cdot 0.5) = 0.07 + 0.06 + 0.05 = 0.18. \]
The full matrix power \(P^2 = P\cdot P\) is
| \(P^2\) | \(1\) | \(2\) | \(3\) |
|---|---|---|---|
| \(1\) | \(0.57\) | \(0.25\) | \(0.18\) |
| \(2\) | \(0.39\) | \(0.31\) | \(0.30\) |
| \(3\) | \(0.33\) | \(0.31\) | \(0.36\) |
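The table can be reproduced exactly with a short script. The sketch below uses Python's standard `fractions` module so that the products carry no rounding error. The variable names are illustrative, and indices are 0-based, so the entry \((P^2)_{13}\) sits at row 0, column 2.

```python
from fractions import Fraction

# The example matrix, entered as exact fractions to avoid rounding error.
P = [[Fraction(s) for s in row] for row in (
    ("0.7", "0.2", "0.1"),
    ("0.3", "0.4", "0.3"),
    ("0.2", "0.3", "0.5"),
)]
n = len(P)

# Single entry: sum over the intermediate state m.
P2_13 = sum(P[0][m] * P[m][2] for m in range(n))

# Full product P^2 = P * P.
P2 = [[sum(P[i][m] * P[m][j] for m in range(n)) for j in range(n)]
      for i in range(n)]

print(float(P2_13))                      # 0.18
for row in P2:
    print([float(x) for x in row])       # one row of the P^2 table per line
```

Each row of `P2` again sums to 1, as it must: the product of two row-stochastic matrices is row-stochastic.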
Long-run behavior: stationary distribution
A stationary distribution \(\boldsymbol{\pi}=(\pi_1,\pi_2,\pi_3)\) satisfies \[ \boldsymbol{\pi} = \boldsymbol{\pi}\cdot P \quad \text{and} \quad \pi_1+\pi_2+\pi_3=1. \] Writing \(\boldsymbol{\pi}=\boldsymbol{\pi}\cdot P\) component-wise gives:
- \[ \pi_1 = (0.7\cdot \pi_1) + (0.3\cdot \pi_2) + (0.2\cdot \pi_3) \;\;\Rightarrow\;\; (0.3\cdot \pi_1) - (0.3\cdot \pi_2) - (0.2\cdot \pi_3)=0. \]
- \[ \pi_2 = (0.2\cdot \pi_1) + (0.4\cdot \pi_2) + (0.3\cdot \pi_3) \;\;\Rightarrow\;\; (-0.2\cdot \pi_1) + (0.6\cdot \pi_2) - (0.3\cdot \pi_3)=0. \]
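- The third component equation is redundant: because each row of \(P\) sums to 1, adding the three equations of \(\boldsymbol{\pi}=\boldsymbol{\pi}\cdot P\) gives the identity \(\pi_1+\pi_2+\pi_3=\pi_1+\pi_2+\pi_3\). Replacing it with the normalization \(\pi_1+\pi_2+\pi_3=1\) and solving yields \[ \boldsymbol{\pi}=\left(\frac{21}{46},\frac{13}{46},\frac{12}{46}\right) \approx (0.4565,\,0.2826,\,0.2609). \]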
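The same linear system can be solved exactly in code. The following is a minimal sketch, assuming the stationary distribution is unique: it keeps the first \(n-1\) component equations, adds the normalization, and runs Gauss-Jordan elimination over exact fractions. The function name `stationary_distribution` is hypothetical, and the hand-rolled solver stands in for whatever linear-algebra routine is available.

```python
from fractions import Fraction

def stationary_distribution(P):
    """Solve pi = pi P with sum(pi) = 1 exactly (assumes a unique solution)."""
    n = len(P)
    # Equations j = 0..n-2:  sum_i pi_i * P[i][j] - pi_j = 0.
    M = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
         for j in range(n - 1)]
    # The remaining component equation is replaced by the normalization.
    M.append([Fraction(1)] * n + [Fraction(1)])

    # Gauss-Jordan elimination on the augmented matrix M.
    for col in range(n):
        # Raises StopIteration if no pivot exists (system not uniquely solvable).
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# The example matrix as exact fractions.
P = [[Fraction(s) for s in row] for row in (
    ("0.7", "0.2", "0.1"),
    ("0.3", "0.4", "0.3"),
    ("0.2", "0.3", "0.5"),
)]

pi = stationary_distribution(P)
print(pi)   # [Fraction(21, 46), Fraction(13, 46), Fraction(6, 23)]
```

`Fraction` reduces to lowest terms automatically, so \(12/46\) is printed as \(6/23\).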
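For a finite chain that is irreducible and aperiodic, the stationary distribution is unique and \(\boldsymbol{\pi}_t = \boldsymbol{\pi}_0\cdot P^t\) converges to it from every initial distribution \(\boldsymbol{\pi}_0\); equivalently, every row of \(P^k\) converges to \(\boldsymbol{\pi}\) as \(k\to\infty\). The example above satisfies both conditions because every entry of its transitional matrix is positive.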
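This convergence can be observed numerically by applying the update \(\boldsymbol{\pi}_{t+1}=\boldsymbol{\pi}_t\cdot P\) repeatedly to the example matrix. The sketch below is illustrative only: the function names, the starting distribution, the stopping tolerance `tol`, and the cap `max_steps` are assumptions made for the demonstration.

```python
def step(pi, P):
    """One update pi_{t+1} = pi_t P for a row vector pi."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def iterate_to_stationary(pi0, P, tol=1e-12, max_steps=10_000):
    """Apply step() until successive distributions differ by less than tol."""
    pi = list(pi0)
    for t in range(1, max_steps + 1):
        nxt = step(pi, P)
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt, t
        pi = nxt
    raise RuntimeError("did not converge within max_steps")

# The example matrix from the text, in floating point.
P = [[0.7, 0.2, 0.1],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]

# Two different (assumed) starting distributions reach the same limit.
pi_a, steps_a = iterate_to_stationary([1.0, 0.0, 0.0], P)
pi_b, steps_b = iterate_to_stationary([0.0, 0.0, 1.0], P)

print([round(x, 4) for x in pi_a])   # [0.4565, 0.2826, 0.2609]
print([round(x, 4) for x in pi_b])   # [0.4565, 0.2826, 0.2609]
```

Stopping when successive iterates agree is a practical heuristic, not a proof of convergence; for this example the limit can be compared against the exact value \((21/46,\,13/46,\,12/46)\).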