2.9. Faster Methods for Solving \(Ax=b\) with Tridiagonal and Banded Matrices#
Last revised on November 5, 2025,
adding links to Python code for solving tridiagonal systems with LU factorization, now available in the module numerical_methods in the section Tridiagonal system solving.
References:
[Sullivan, 2021] Section 4.4.3 Solving Triangular Systems.
[Burden et al., 2016] Section 6.6, Special Types of Matrices — just the sub-sections on Band Matrices and Tridiagonal Matrices.
[Chenney and Kincaid, 2013] Section 2.3, Tridiagonal and Banded Systems.
[Dahlquist and Björck, 2009] Section 7.4, Banded Linear Systems.
2.9.1. Tridiagonal systems#
Differential equations often lead to the need to solve systems of equations \(T x = b\) where the matrix \(T\) has this special form:
Definition 2.12 (Tridiagonal matrix)
A matrix \(T\) is tridiagonal if the only non-zero elements are on the main diagonal and the diagonals adjacent to it on either side, so that \(T_{i,j} = 0\) if \(|i - j| > 1\). That is, the system looks like:

\[
T =
\begin{bmatrix}
d_1 & u_1 \\
l_1 & d_2 & u_2 \\
 & l_2 & d_3 & \ddots \\
 & & \ddots & \ddots & u_{n-1} \\
 & & & l_{n-1} & d_n
\end{bmatrix}
\]

with all “missing” entries being zeros. The notation used here suggests one efficient way to store such a matrix: as three 1D arrays \(d\), \(l\) and \(u\).
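For instance, this storage scheme can be sketched with NumPy (0-based indexing, so `l[i]` and `u[i]` below correspond to \(l_{i+1}\) and \(u_{i+1}\); the helper's name is my own, purely for illustration):

```python
import numpy as np

def tridiagonal_to_dense(l, d, u):
    """Assemble the dense tridiagonal matrix T from three 1D arrays:
    d (length n) holds the main diagonal, while l and u (length n-1)
    hold the sub- and super-diagonals respectively."""
    return np.diag(d) + np.diag(l, -1) + np.diag(u, 1)

# A 4x4 example: the classic (-1, 2, -1) tridiagonal matrix
d = np.array([2.0, 2.0, 2.0, 2.0])
l = np.array([-1.0, -1.0, -1.0])
u = np.array([-1.0, -1.0, -1.0])
T = tridiagonal_to_dense(l, d, u)
```

Of course the whole point of the three-array storage is to avoid ever forming the dense matrix; the dense version is shown only to make the layout concrete.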
(Such equations also arise in other important situations, such as spline interpolation.)
It can be verified that LU factorization preserves all the zeros outside the three central diagonals, so that the Doolittle algorithm — if it succeeds without any division by zero — gives \(T = L U\) with the form

\[
L =
\begin{bmatrix}
1 \\
L_1 & 1 \\
 & L_2 & 1 \\
 & & \ddots & \ddots \\
 & & & L_{n-1} & 1
\end{bmatrix},
\quad
U =
\begin{bmatrix}
D_1 & u_1 \\
 & D_2 & u_2 \\
 & & \ddots & \ddots \\
 & & & D_{n-1} & u_{n-1} \\
 & & & & D_n
\end{bmatrix}.
\]

Note that the first non-zero element in each column is unchanged, just as the whole first row is with a full matrix; here this means that the superdiagonal elements \(u_i\) of \(U\) are the same as those of \(T\).
This also means that with naive row reduction,

- the only multipliers needed are \(l_{i+1,i} = L_i\), \(1 \leq i < n\);
- the only other values that need to be computed in order to describe the row-reduced form are the diagonal elements \(u_{i,i} = D_i\), along with the already-known values \(u_{i,i+1} = u_i\).
Again, one way to describe and store this information is with just the two new 1D arrays \(L\) and \(D\), along with the unchanged array \(u\).
2.9.2. Algorithms#
Algorithm 2.16 (LU factorization)
\(D_1 = d_1\)
for i from 2 to n
\(\quad L_{i-1} = l_{i-1}/D_{i-1}\)
\(\quad D_{i} = d_{i} - L_{i-1} u_{i-1}\)
end
Algorithm 2.17 (Forward substitution)
\(c_1 = b_1\)
for i from 2 to n
\(\quad c_{i} = b_{i} - L_{i-1} c_{i-1}\)
end
Algorithm 2.18 (Backward substitution)
\(x_n = c_n/D_n\)
for i from n-1 down to 1
\(\quad x_{i} = (c_{i} - u_{i} x_{i+1})/D_i\)
end
Python code for these is in the module numerical_methods in the section Tridiagonal system solving.
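A minimal Python sketch of Algorithm 2.16 through Algorithm 2.18 might look as follows (this is not the numerical_methods module's actual code; it uses 0-based indexing and omits any check for division by zero):

```python
import numpy as np

def lu_tridiagonal(l, d, u):
    """Algorithm 2.16: factor the tridiagonal matrix given by diagonals
    l, d, u, returning L (subdiagonal of the unit lower-triangular factor)
    and D (main diagonal of the upper factor); u carries over unchanged."""
    n = len(d)
    L = np.zeros(n - 1)
    D = np.zeros(n)
    D[0] = d[0]
    for i in range(1, n):
        L[i-1] = l[i-1] / D[i-1]
        D[i] = d[i] - L[i-1] * u[i-1]
    return L, D

def solve_tridiagonal(L, D, u, b):
    """Algorithms 2.17 and 2.18: forward then backward substitution."""
    n = len(D)
    c = np.zeros(n)
    c[0] = b[0]
    for i in range(1, n):                 # forward substitution
        c[i] = b[i] - L[i-1] * c[i-1]
    x = np.zeros(n)
    x[-1] = c[-1] / D[-1]
    for i in range(n - 2, -1, -1):        # backward substitution
        x[i] = (c[i] - u[i] * x[i+1]) / D[i]
    return x

# Solve T x = b for the (-1, 2, -1) tridiagonal matrix, n = 5
d = np.full(5, 2.0)
l = np.full(4, -1.0)
u = np.full(4, -1.0)
T = np.diag(d) + np.diag(l, -1) + np.diag(u, 1)
b = T @ np.arange(1.0, 6.0)
L, D = lu_tridiagonal(l, d, u)
x = solve_tridiagonal(L, D, u, b)
```

Note that each loop does a constant amount of work per index, so the whole solve costs \(O(n)\) operations, versus \(O(n^3)\) for a full matrix.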
Alternatively, one can solve by the basic row-reduction approach. This makes sense when only one right-hand side \(b\) is to be solved for; see Exercise 2.20 for a “cost” comparison.
Algorithm 2.19 (row reduction with a tridiagonal matrix)
\(c_1 = b_1\)
\(D_1 = d_1\)
for i from 2 to n
\(\quad L_{i-1} = l_{i-1}/D_{i-1}\)
\(\quad c_{i} = b_{i} - L_{i-1} c_{i-1}\)
\(\quad D_{i} = d_{i} - L_{i-1} u_{i-1}\)
end
and then backward substitution as in Algorithm 2.18.
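Combining the factorization and forward-substitution loops in this way gives the following sketch (again hypothetical code, not the module's own):

```python
import numpy as np

def solve_tridiagonal_rowreduce(l, d, u, b):
    """Algorithm 2.19: row-reduce and forward-substitute in one pass,
    then backward-substitute as in Algorithm 2.18."""
    n = len(d)
    D = np.zeros(n)
    c = np.zeros(n)
    D[0] = d[0]
    c[0] = b[0]
    for i in range(1, n):
        L = l[i-1] / D[i-1]      # each multiplier is used once, then discarded
        c[i] = b[i] - L * c[i-1]
        D[i] = d[i] - L * u[i-1]
    x = np.zeros(n)
    x[-1] = c[-1] / D[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (c[i] - u[i] * x[i+1]) / D[i]
    return x
```

Since the multipliers \(L_i\) are not stored, this version cannot be reused for a second right-hand side; that trade-off is the subject of Exercise 2.20.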
2.9.3. Generalizing to banded matrices#
As we have seen, approximating derivatives to higher order of accuracy, or approximating derivatives of order greater than two, requires more than three nodes, but the locations needed are all close to the ones where the derivative is being approximated. For example, the simplest symmetric approximation of the fourth derivative \(D^4 f(x)\) uses values from \(f(x-2h)\) to \(f(x+2h)\). Then row \(i\) of the corresponding matrix has all its non-zero elements at locations \((i,i-2)\) to \((i, i+2)\): the non-zero elements lie in the narrow “band” where \(|i - j| \leq 2\), and thus on five “central” diagonals.
This is a pentadiagonal matrix, and an example of the larger class of banded matrices: ones in which all the non-zero elements have indices \( -p \leq j - i \leq q\) for \(p\) and \(q\) smaller than \(n\) — usually far smaller; \(p = q = 2\) for a pentadiagonal matrix.
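For example, using the standard centered stencil \((1, -4, 6, -4, 1)/h^4\) for \(D^4 f\) at each interior node, the pentadiagonal structure can be seen directly (a sketch only; the boundary rows are left incomplete here, since in practice they would need one-sided formulas):

```python
import numpy as np

def fourth_derivative_matrix(n, h):
    """Pentadiagonal matrix applying the centered difference approximation
    of the fourth derivative, stencil (1, -4, 6, -4, 1)/h**4, in each row."""
    return (np.diag(np.full(n, 6.0))
            + np.diag(np.full(n - 1, -4.0), 1) + np.diag(np.full(n - 1, -4.0), -1)
            + np.diag(np.full(n - 2, 1.0), 2) + np.diag(np.full(n - 2, 1.0), -2)) / h**4

A = fourth_derivative_matrix(6, 1.0)
```

Every row of `A` has its non-zero entries confined to \(|i - j| \leq 2\), so \(p = q = 2\).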
Let us recap the general Doolittle algorithm for computing an LU factorization:
Algorithm 2.20 (Doolittle algorithm for computing an LU factorization)
The top row is unchanged:
for j from 1 to n
\(\quad\) \(u_{1,j} = a_{1,j}\)
end
The left column requires no sums:
for i from 2 to n
\(\quad\) \(l_{i,1} = a_{i,1}/u_{1,1}\)
end
The main loop:
for k from 2 to n
\(\quad\) for j from k to n
\(\quad\quad\) \(u_{k,j} = a_{k,j} - \sum_{s=1}^{k-1} l_{k,s}u_{s,j}\)
\(\quad\) end
\(\quad\) for i from k+1 to n
\(\quad\quad\) \(l_{i,k} = \left( a_{i,k} - \sum_{s=1}^{k-1} l_{i,s}u_{s,k} \right)/u_{k,k}\)
\(\quad\) end
end
2.9.3.1. Eliminating redundant calculation in the above#
With a banded matrix, many of the entries appearing in Algorithm 2.20 are zero, particularly in the two sums, which is where most of the operations are. Thus we can rewrite it, exploiting the fact that all elements with indices \(j-i < -p\) or \(j-i > q\) are zero. To start with, the top non-zero diagonal is not modified, as already noted for the tridiagonal case: \(u_{k,k+q} = a_{k,k+q}\) for \(1 \leq k \leq n-q\).
Algorithm 2.21 (LU factorization of a banded matrix)
The top row is unchanged:
for j from 1 to 1+q
\(\quad\) \(u_{1,j} = a_{1,j}\)
end
The top non-zero diagonal is unchanged:
for k from 1 to n - q
\(\quad\) \(u_{k,k+q} = a_{k,k+q}\)
end
The left column requires no sums:
for i from 2 to 1+p
\(\quad\) \(l_{i,1} = a_{i,1}/u_{1,1}\)
end
The main loop:
for k from 2 to n
\(\quad\) for j from k to min(n, k+q-1)
\(\quad\quad\) \(u_{k,j} = a_{k,j} - \displaystyle\sum_{s=\max(1,\,k-p,\,j-q)}^{k-1} l_{k,s}u_{s,j}\)
\(\quad\) end
\(\quad\) for i from k+1 to min(n, k+p)
\(\quad\quad\) \(l_{i,k} = \left( a_{i,k} - \displaystyle\sum_{s=\max(1,\,i-p,\,k-q)}^{k-1} l_{i,s}u_{s,k} \right)/u_{k,k}\)
\(\quad\) end
end
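A Python sketch of Algorithm 2.21 follows (0-based indexing, and dense storage purely for clarity; a practical version would store only the bands, and the function name is my own):

```python
import numpy as np

def lu_banded(A, p, q):
    """Doolittle LU factorization of an n-by-n matrix A whose non-zero
    entries satisfy -p <= j - i <= q, with no pivoting.  Returns dense
    L (unit lower triangular) and U; both inherit the band structure."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    U[0, :min(n, q + 1)] = A[0, :min(n, q + 1)]   # top row unchanged
    for i in range(1, min(n, p + 1)):             # left column: no sums
        L[i, 0] = A[i, 0] / U[0, 0]
    for k in range(1, n):
        for j in range(k, min(n, k + q + 1)):
            s0 = max(0, k - p, j - q)             # skip the known zeros
            # For j = k + q the sum is empty, so the top non-zero
            # diagonal is automatically unchanged.
            U[k, j] = A[k, j] - L[k, s0:k] @ U[s0:k, j]
        for i in range(k + 1, min(n, k + p + 1)):
            s0 = max(0, i - p, k - q)
            L[i, k] = (A[i, k] - L[i, s0:k] @ U[s0:k, k]) / U[k, k]
    return L, U
```

Since each inner sum has at most \(\min(p, q)\) terms regardless of \(n\), the total cost is \(O(n)\) for fixed band-widths, rather than the \(O(n^3)\) of the general Doolittle algorithm.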
It is common for a banded matrix to have equal band-width on either side, \(p=q\), as with tridiagonal and pentadiagonal matrices. Then the algorithm is somewhat simpler:
Algorithm 2.22 (LU factorization of a banded matrix, \(p=q\))
The top row is unchanged:
for j from 1 to 1+p
\(\quad\) \(u_{1,j} = a_{1,j}\)
end
The top non-zero diagonal is unchanged:
for k from 1 to n - p
\(\quad\) \(u_{k,k+p} = a_{k,k+p}\)
end
The left column requires no sums:
for i from 2 to 1+p
\(\quad\) \(l_{i,1} = a_{i,1}/u_{1,1}\)
end
The main loop:
for k from 2 to n
\(\quad\) for j from k to min(n, k+p-1)
\(\quad\quad\) \(u_{k,j} = a_{k,j} - \displaystyle\sum_{s=\max(1,\,j-p)}^{k-1} l_{k,s}u_{s,j}\)
\(\quad\) end
\(\quad\) for i from k+1 to min(n, k+p)
\(\quad\quad\) \(l_{i,k} = \left( a_{i,k} - \displaystyle\sum_{s=\max(1,\,i-p)}^{k-1} l_{i,s}u_{s,k} \right)/u_{k,k}\)
\(\quad\) end
end
2.9.4. Strict diagonal dominance helps again#
These algorithms for banded matrices do no pivoting, and that is highly desirable, because pivoting creates non-zero elements outside the “band” and so can force one back to the general algorithm. Fortunately, we have seen one case where this is fine: the matrix being either row-wise or column-wise strictly diagonally dominant.
2.9.5. Exercises#
Exercise 2.19
Compute the operation counts for solving tridiagonal systems with LU factorization: Algorithm 2.16 to Algorithm 2.18.
Exercise 2.20
Compute the operation counts for solving tridiagonal systems by row reduction, as in Algorithm 2.19.
Compare this to the counts for solving by forward- and backward-substitution once one already has the LU factorization.
Compare, and comment on to what extent using the LU factorization is beneficial with tridiagonal matrices.
Exercise 2.21
Compute the operation counts for the LU factorization of a banded matrix, Algorithm 2.21.
Exercise 2.22
Create algorithms for forward- and backward-substitution with the banded matrix LU factorization produced by Algorithm 2.21.
Calculate the operation counts for these, and thus for the process of solving \(Ax = b\) with banded matrix \(A\) once one already has the factors \(L\) and \(U\).