10. Solving Simultaneous Linear Equations, Part 2: Partial Pivoting#

References:

Note: Some references describe the method of scaled partial pivoting, but here we present instead a version without the “scaling”, because not only is it simpler, but modern research shows that it is esentially always as good, once the problem is set up in a “sane” way.

10.1. Introduction#

The basic row reduction method can fail due to divisoion by zero (and to have very large rouding errors when a denominator is extremely close to zero. A more robust modification is to swap the order of the equations to avaid these problems: partial pivotng. Here we look at a particularly robust version of this strategy, Maximal Element Partial Pivoting.

10.2. What can go wrong with Naive Gaussian Elimination?#

We have noted two problems with the naive algorithm for Gaussian elimination: total failure due the division be zero, and loss of precision due to dividing by very small values — or more preciselt calculations the lead to intermediate values far larger than the final results. The culprits in all cases are the same: the denominators are first the pivot elements \(a_{k,k}^{(k-1)}\) in evaluation of \(l_{i,k}\) during row reduction and then the \(u_{k,k}\) in back substitution. Further, those \(a_{k,k}^{(k-1)}\) are the final updated values at indices \((k,k)\), so are the same as \(u_{k,k}\). Thus it is exactly these main diagonal elements that we must deal with.

10.3. The basic fix: Partial Pivoting#

The basic strategy is that at step \(k\), we can swap equation \(k\) with any equation \(i\), \(i > k\). Note that this involves swapping those rows of array A and also those elements of the array b for the right-hand side: \(b_k \leftrightarrow b_i\).

This approach of swapping equations (swapping rows in arrays A andb) is called pivoting, or more specifically partial pivoting, to distinguish from the more elaborate strategy where to columns of A are also reordered (which is equivalent to reordeting the unknowns in the equations). The row that is swapped with row \(k\) is sometimes called the pivot row, and the new denominator is the corresponding pivot element.

This approach is robust so long as one is using exact arithmetic: it works for any well-posed system because so long as the \(Ax = b\) has a unique solution — so that the original matrix \(A\) is non-singular — at least one of the \(a_{i,k}^{(k-1)}, i \geq k\) will be non-zero, and thus the swap will give a new element in position \((k,k)\) that is non-zero. (I will stop caring about superscripts to distinguish updates, but if you wish to, the elements of the new row \(k\) could be called either \(a_{k,j}^{(k)}\) or even \(u_{k,j}\), since those values are in their final state.)

10.4. Handling rounding error: Maximal Element Partial Pivoting#

The final refinement is to seek the smallest possible magnitudes for intermediate values, and thus the smallest absolute errors in them, by making the multipliers \(l_{i,k}\) small, in turn by making the denominator \(a_{k,k}^{(k-1)} = u_{k,k}\) as large as possible in magnitude:

At step \(k\), choose the pivot row \(p_k \geq k\) so that \(|a_{p_k,k}^{(k-1)}| \geq |a_{i,k}^{(k-1)}|\) for all \(i \geq k\). If there is more that one such element of largest magnitude, use the lowest value: in particular, if \(p_k = k\) works, use it and do not swap!

10.5. Swapping rows in Julia#

I will not give a detailed algorithm for this, since we will soon implement an even better variant.

However, here are some notes on swapping values and how to avoid a possible pitfall.

10.5.1. Exercise 1, on Julia coding#

a) Explain why we cannot just swap the relevant elements of rows \(k\) and \(p\) with:

for j in 1:n
    A[k,j] = A[p,j]
    A[p,j] = A[k,j]
end

or with vectorized “slicing”:

A[k,:] = A[p,:]
A[p,:] = A[k,:]

Describe what happens instead.

b) A common strategy to avoid this problem uses an intermediate temporary copy of each value being “moved”. This can be combined with slicing, but be careful: arrays (including slices of arrays) must be copied with the function copy():

temp = copy(A[k,:])
A[k,:] = A[p,:]
A[p,:] = temp
  1. Julia has an elegant alternative for swapping a pair of values; (a, b) = (b, a) so the first fail above can also be fixed with

    for j in 1:n ( A[k,j] , A[p,j] ) = ( A[p,j] , A[k,j] ) end

This can also be done with slicing, with care to copy those slices, so we can fix the second fail with:

( A[k,:] , A[p,:] ) = ( copy(A[p,:]) , copy(A[k,:]) )

10.5.2. Some demonstrations#

No row reduction is done here, so entire rows are swapped rather than just the elements from column \(k\) onward:

First, get the matrix pretty-printer seen in Solving Simultaneous Linear Equations, Part 1: Row Reduction/Gaussian Elimination; this time from Module NumericalMethodsAndAnalysis:

# Tell Julia where I keep my modules; for now in a "sister" folder to where the notebooks are:
push!(LOAD_PATH, "../modules");
import NumericalMethodsAndAnalysis: print_matrix
A = [1 -6 2 ; 3 5 -6 ; 4 2 7]
n = 3
println("Initially A is"); print_matrix(A)
Initially A is
[
1 -6 2 
3 5 -6 
4 2 7 
]
k = 1
p = 3
temp = copy(A[k,:])
A[k,:] = A[p,:]
A[p,:] = temp
println("After swapping rows 1 <-> 3 using slicing and a temporary row, A is")
print_matrix(A)
After swapping rows 1 <-> 3 using slicing and a temporary row, A is
[
4 2 7 
3 5 -6 
1 -6 2 
]
k = 2
p = 3
for j in 1:n
    ( A[k,j] , A[p,j] ) = ( A[p,j] , A[k,j] )
end
println("After swapping rows 2 <-> 3 using a loop and tuples of elements (no temp) A is:")
print_matrix(A)
After swapping rows 2 <-> 3 using a loop and tuples of elements (no temp) A is:
[
4 2 7 
1 -6 2 
3 5 -6 
]
k = 1
p = 2
( A[k,:] , A[p,:] ) = ( copy(A[p,:]) , copy(A[k,:]) )
println("After swapping rows 1 <-> 2 using tuples of slices (no loop or temp) A is:")
print_matrix(A)
After swapping rows 1 <-> 2 using tuples of slices (no loop or temp) A is:
[
1 -6 2 
4 2 7 
3 5 -6 
]

This work is licensed under Creative Commons Attribution-ShareAlike 4.0 International