2.3. Partial Pivoting#

Last Revised on August 17, 2025

References:

  • Section 3.3, Partial pivoting, of Chasnov [Cha12].

  • Section 2.4.1, Partial Pivoting, of Sauer [Sau22].

  • Section 6.2, Pivoting Strategies, of Burden, Faires, and Burden [BFB16].

  • Section 4.1, Gaussian Elimination with Backward Substitution, of Dionne [Dio23].

  • Section 2.2, Gaussian Elimination with Scaled Partial Pivoting, of Cheney and Kincaid [CK13].

  • Lecture 21, Pivoting, of Trefethen and Bau [TB22]. (However, this again is done directly in the context of the LU factorization, as in section Solving Ax = b With Both Pivoting and LU factorization).

  • Section 7.2.3, Pivoting Strategies, of Dahlquist and Björck [DBjorck09].

2.3.1. Introduction#

The basic row reduction method can fail due to division by zero, and even if that is avoided it might have very large rounding errors when a denominator is extremely close to zero. A more robust modification is to swap the order of the equations to avoid these problems: partial pivoting.

Here we look at a particularly robust version of this strategy, Maximal Element Partial Pivoting.

Remark 2.18

Some references describe the method of scaled partial pivoting, but here we present instead a version without the “scaling”: not only is it simpler, but modern research indicates that it is essentially always as good, once the problem is set up in a “sane” way.

import numpy as np
import matplotlib.pyplot as plt
# As in recent sections, we import some items from modules individually, so they can be used by "first name only".
from numpy import array

2.3.2. The Maximum Norm of a Vector#

To discuss errors in linear algebra, it is useful to have a measure of the “size” of a vector, such as in describing the difference between an exact solution vector and an approximation. For this, the usual Euclidean norm

\[\| x \| = \| x \|_2 = \sqrt{\sum_{i=1}^n |x_i|^2}\]

is not always well-suited. Instead we will mostly use the maximum norm or infinity norm

\[ \| x \|_{max} = \| x \|_\infty := \max_{1 \le i \le n} |x_i| \]

For more details, including an extension to measuring the size of a matrix and an explanation of that mysterious name infinity norm, see the section Error bounds for linear algebra, condition numbers, matrix norms, etc.
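As a quick illustration (my own small example, not from the references), the maximum norm can be computed either directly from its definition or with np.linalg.norm, passing np.inf as the order:

x = array([3., -7., 2.])
# Directly from the definition: the largest absolute entry.
print(f"By the definition, ||x||_max = {max(abs(x))}")
# The same norm via NumPy's built-in function.
print(f"Via np.linalg.norm, ||x||_max = {np.linalg.norm(x, np.inf)}")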

2.3.3. What can go wrong with Naive Gaussian Elimination?#

We have noted two problems with the naive algorithm for Gaussian elimination: total failure due to division by zero, and loss of precision due to dividing by very small values, or more precisely, calculations that lead to intermediate values far larger than the final results. The culprits in all cases are the same: the denominators are first the pivot elements \(a_{k,k}^{(k-1)}\) in the evaluation of \(l_{i,k}\) during row reduction, and then the \(u_{k,k}\) in back substitution. Further, those \(a_{k,k}^{(k-1)}\) are the final updated values at indices \((k,k)\), so they are the same as \(u_{k,k}\).

Thus it is exactly these main diagonal elements that we must deal with carefully.
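To see the loss-of-precision problem in action, here is a small demonstration (my own example, not from the references above): a 2×2 system whose exact solution is very close to \((1, 1)\), but where naive elimination divides by a tiny pivot:

eps = 1e-17
A_demo = array([[eps, 1.], [1., 1.]])
b_demo = array([1., 2.])
# Naive elimination, using the tiny element eps as the pivot:
l = A_demo[1,0] / A_demo[0,0]  # a huge multiplier, about 1e17
u22 = A_demo[1,1] - l * A_demo[0,1]
c2 = b_demo[1] - l * b_demo[0]
# Back substitution:
x2 = c2 / u22
x1 = (b_demo[0] - A_demo[0,1] * x2) / A_demo[0,0]
print(f"Naive elimination gives x = [{x1}, {x2}]")
# np.linalg.solve pivots internally, so it gets both components right:
print(f"np.linalg.solve gives x = {np.linalg.solve(A_demo, b_demo)}")

In double precision the naive computation loses the first component entirely (it comes out as 0 rather than approximately 1); swapping the two equations first, as described in the next section, fixes this.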

2.3.4. The basic fix: Partial Pivoting#

The basic strategy is that at step \(k\), we can swap equation \(k\) with any later equation \(i_k\), \(i_k > k\). Note that this involves swapping those rows of array A and also those elements of the array b for the right-hand side: \(b_k \leftrightarrow b_{i_k}\).

This approach of swapping equations (swapping rows in arrays A and b) is called pivoting, or more specifically partial pivoting, to distinguish it from the more elaborate strategy in which the columns of A are also reordered (which is equivalent to reordering the unknowns in the equations). The row that is swapped with row \(k\) is sometimes called the pivot row, and the new denominator is the corresponding pivot element.

This approach is robust so long as one is using exact arithmetic: it works for any well-posed system because, so long as \(Ax = b\) has a unique solution (so that the original matrix \(A\) is non-singular), at least one of the \(a_{i,k}^{(k-1)}, i \geq k\) will be non-zero, and thus the swap will give a new element in position \((k,k)\) that is non-zero. (I will stop fussing over superscripts to distinguish updates, but if you wish to, the elements of the new row \(k\) could be called either \(a_{k,j}^{(k)}\) or even \(u_{k,j}\), since those values are in their final state.)

2.3.5. Handling rounding error: Maximal Element Partial Pivoting#

The final refinement is to seek the smallest possible magnitudes for intermediate values, and thus the smallest absolute errors in them, by making the multipliers \(l_{i,k}\) small, in turn by making the denominator \(a_{k,k}^{(k-1)} = u_{k,k}\) as large as possible in magnitude:

At step \(k\), choose the pivot row \(p_k \geq k\) so that \(|a_{p_k,k}^{(k-1)}| \geq |a_{i,k}^{(k-1)}|\) for all \(i \geq k\). If there is more than one such element of largest magnitude, use the lowest value: in particular, if \(p_k = k\) works, use it and do not swap!
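In code, this pivot search at step \(k\) can be done in one line with np.argmax; here is a minimal sketch (my own, with the hypothetical helper name pivot_row; row swapping itself is discussed in the next section):

def pivot_row(A, k):
    # The index p >= k of the largest-magnitude entry in column k,
    # at or below the main diagonal. np.argmax returns the *first*
    # index of a maximum, so ties are broken by choosing the lowest
    # row, as specified above.
    return k + np.argmax(np.abs(A[k:, k]))

A = array([[1., -6., 2.], [3., 5., -6.], [4., 2., 7.]])
print(f"At step k=0 the pivot row has index {pivot_row(A, 0)}")  # index 2, since |4| > |3| > |1|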

2.3.6. Swapping rows in Python#

I will not give a detailed algorithm for this, since we will see a better variant in Solving Ax = b With Both Pivoting and LU factorization.

However, here are some notes on swapping values; see also Exercise 2.6 about a possible pitfall.

2.3.6.1. Some demonstrations#

No row reduction is done here, so entire rows are swapped rather than just the elements from column \(k\) onward:

A = array([[1. , -6. , 2.],[3. , 5. , -6.],[4. , 2. , 7.]])
n = 3
print(f"Initially,\nA =\n {A}")
Initially,
A =
 [[ 1. -6.  2.]
 [ 3.  5. -6.]
 [ 4.  2.  7.]]
k = 0
p = 2
temp = A[k,k:].copy()
A[k,k:] = A[p,k:]
A[p,k:] = temp
print("After swapping rows 1 <-> 3 (row indices 0 <-> 2) using slicing and a temporary row,")
print(f"A =\n {A}")
After swapping rows 1 <-> 3 (row indices 0 <-> 2) using slicing and a temporary row,
A =
 [[ 4.  2.  7.]
 [ 3.  5. -6.]
 [ 1. -6.  2.]]
k = 1
p = 2
for j in range(n):
    ( A[k,j] , A[p,j] ) = ( A[p,j] , A[k,j] )
print(f"After swapping rows 2 <-> 3 using a loop and tuples of elements, no temp,")
print(f"A =\n {A}")
After swapping rows 2 <-> 3 using a loop and tuples of elements, no temp,
A =
 [[ 4.  2.  7.]
 [ 1. -6.  2.]
 [ 3.  5. -6.]]
k = 0
p = 1
( A[k,k:] , A[p,k:] ) = ( A[p,k:].copy() , A[k,k:].copy() )
print(f"After swapping rows 1 <-> 2 using tuples of slices, no loop or temp,")
print(f"A =\n {A}")
After swapping rows 1 <-> 2 using tuples of slices, no loop or temp,
A =
 [[ 1. -6.  2.]
 [ 4.  2.  7.]
 [ 3.  5. -6.]]

2.3.7. Exercises#

Exercise 2.6 (On Python coding)

a) Explain why we cannot just swap the relevant elements of rows \(k\) and \(p\) with:

for j in range(k,n):
    A[k,j] = A[p,j]
    A[p,j] = A[k,j]

or with vectorized “slicing”:

A[k,k:] = A[p,k:]
A[p,k:] = A[k,k:]

Describe what happens instead.

b) A common strategy to avoid this problem uses an intermediate temporary copy of each value being “moved”. This can be combined with slicing, but be careful: arrays (including slices of arrays) must be copied with the method .copy():

temp = A[k,k:].copy()
A[k,k:] = A[p,k:]
A[p,k:] = temp
c) Python also has an elegant alternative for swapping a pair of values:

for j in range(k,n): ( A[k,j] , A[p,j] ) = ( A[p,j] , A[k,j] )

This can also be done with slicing, with care to copy those slices:

( A[k,k:] , A[p,k:] ) = ( A[p,k:].copy() , A[k,k:].copy() )

Exercise 2.7

Solve \(A \vec{x} = \vec b\) with

\[\begin{split} A = \left[ \begin{array}{ccc} 4. & 2. & 1. \\ 9. & 3. & 1. \\ 25. & 5. & 1. \end{array} \right], \quad \vec b = \left[ \begin{array}{c} 0.693147 \\ 1.098612 \\ 1.609438 \end{array} \right] \end{split}\]

as seen in Exercise 2.1 of Row Reduction (Gaussian Elimination), now using maximal element partial pivoting, computing all intermediate values as decimal approximations rounded to four significant digits – no fractions! Call this approximate solution \(\vec{x}_2\).

Then compute the residual \(\vec{r}_2 = \vec{b} - A \vec{x}_2\) and its norm \(\| \vec{r}_2 \|_\infty\). This should be done with high accuracy (not rounding to four decimal places) and could be done using a Python notebook as a “calculator”.

Next, use Python software to compute the condition number of \(A\), \(K_\infty(A)\), and use this to get a bound on the absolute error in each of the above approximations.

Finally some “cheating”: use Python software to compute the “exact” solution and use this to compute the actual absolute error; compare this to the bound computed above.
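For the Python parts of this exercise, here is a sketch of the relevant NumPy calls (with a placeholder for your hand-computed \(\vec{x}_2\)):

A = array([[4., 2., 1.], [9., 3., 1.], [25., 5., 1.]])
b = array([0.693147, 1.098612, 1.609438])
x_2 = array([0., 0., 0.])  # placeholder: insert your four-significant-digit solution here
r_2 = b - A @ x_2  # the residual
print(f"||r_2||_inf = {np.linalg.norm(r_2, np.inf)}")
print(f"K_inf(A) = {np.linalg.cond(A, np.inf)}")  # the condition number in the infinity norm
print(f"The 'exact' solution is {np.linalg.solve(A, b)}")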