# 1.2. Solving Equations by Fixed Point Iteration (of Contraction Mappings)

Last revised on August 14, 2024

**References:**

- Section 1.2 *Fixed-Point Iteration* of [Sauer, 2022]
- Section 2.2 *Fixed-Point Iteration* of [Burden *et al.*, 2016]

## 1.2.1. Introduction

In the next section we will meet Newton’s method for root-finding,
which you might have seen in a calculus course.
This is one very important example of the more general strategy of **fixed-point iteration**, so we start with that.

```
# We will often need resources from the modules numpy and pyplot:
import numpy as np
# We can also import items from a module individually, so they can be used by "first name only".
# In this book this is done mostly for mathematical functions with familiar names:
from numpy import cos
# and some very frequently used functions for graphing:
from matplotlib.pyplot import figure, plot, title, legend, grid
```

## 1.2.2. Fixed-point form for equations

A variant of stating equations as *root-finding* (\(f(x) = 0\)) is *fixed-point* form:
given a function \(g:\mathbb{R} \to \mathbb{R}\) or \(g:\mathbb{C} \to \mathbb{C}\)
(or even \(g:\mathbb{R}^n \to \mathbb{R}^n\); a later topic),
find a *fixed point* of \(g\).
That is, a value \(p\) for its argument such that

\[g(p) = p.\]

Such problems are interchangeable with root-finding. One way to convert from \(f(x) = 0\) to \(g(x) = x\) is defining

\[g(x) := x - w(x)\, f(x)\]

for any “weight function” \(w(x)\) that is nonzero at the root.

One can convert the other way too, for example defining \(f(x) := g(x) - x\). We have already seen this when we converted the equation \(x = \cos x\) to \(f(x) = x - \cos x = 0\).

Compare the two setups graphically: in each case, the \(x\) value at the intersection of the two curves is the solution we seek.

```
def f_1(x): return x - cos(x)   # root-finding form: solve f_1(x) = 0
def g_1(x): return cos(x)       # fixed-point form: solve g_1(x) = x
```

```
a = -1
b = 1
x = np.linspace(a, b)
figure(figsize=(12,5))
title("$y = f_1(x) = x - \cos(x)$ and $y=0$")
plot(x, f_1(x))
plot([a, b], [0, 0])
grid(True)
figure(figsize=(12,5))
title("$y = g_1(x) = \cos (x)$ and $y=x$")
plot(x, g_1(x))
plot(x, x)
grid(True)
```

## 1.2.3. Fixed-point iteration#

Fixed point form can be convenient partly because we almost always have to solve by successive approximations,
or *iteration*, and fixed point form suggests *one* choice of iterative procedure:
start with any first approximation \(x_0\), and iterate with

\[x_{k+1} = g(x_{k}), \quad k = 0, 1, 2, \dots\]

If \(g\) is continuous, and **if** the above sequence \(\{x_0, x_1, \dots \}\) converges to a limit \(p\), then that limit is a fixed point of function \(g\): \(g(p) = p\).

Proof. From \(\displaystyle \lim_{k \to \infty} x_k = p\), continuity gives

\[\lim_{k \to \infty} g(x_k) = g\left( \lim_{k \to \infty} x_k \right) = g(p).\]

On the other hand, \(g(x_k) = x_{k+1}\), so

\[\lim_{k \to \infty} g(x_k) = \lim_{k \to \infty} x_{k+1} = p.\]

Comparing gives \(g(p) = p\).
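To make the procedure concrete, here is a minimal sketch of the iteration as a reusable function; the name `fixed_point_iteration` and the stopping rule (successive iterates agreeing to within a tolerance) are our own choices for illustration, not prescribed by the references above.

```
def fixed_point_iteration(g, x_0, max_iterations=100, tolerance=1e-10):
    """Iterate x_{k+1} = g(x_k) from x_0, stopping when successive
    iterates agree to within `tolerance`, or after `max_iterations` steps."""
    x_k = x_0
    for k in range(max_iterations):
        x_k_plus_1 = g(x_k)
        if abs(x_k_plus_1 - x_k) < tolerance:
            return x_k_plus_1
        x_k = x_k_plus_1
    return x_k

# For example, for x = cos(x):
p = fixed_point_iteration(cos, 1.0)
print(f"p = {p}, residual g(p) - p = {cos(p) - p}")
```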

## 1.2.4. Mappings and contraction mappings

That second “if” is a big one;
fortunately, it can often be resolved using the idea of a *contraction mapping*.

First, we need the idea of a **mapping**.

**Definition** (Mapping)

A function \(g(x)\) defined on a closed interval \(D = [a, b]\) which sends values back into that interval,
\(g: D \to D\), is sometimes called a **map** or **mapping**.

The same applies for a function \(g: D \to D\) where \(D\) is a subset of the complex numbers, or even of the vectors \(\mathbb{R}^n\) or \(\mathbb{C}^n\). This will help later with solving more general types of equations.

A mapping is sometimes thought of as moving a region \(S\) within its domain \(D\) to another such region, by moving each point \(x \in S \subset D\) to its image \(g(x) \in g(S) \subset D\).

**Theorem.** Any continuous mapping on a closed interval \([a, b]\) has at least one fixed point.

Proof. Consider the “root-finding cousin”, \(f(x) = x - g(x)\).

First, \(f(a) = a - g(a) \leq 0\), since \(g(a)\) must lie in the domain \([a, b]\), so that \(g(a) \geq a\); similarly, \(f(b) = b - g(b) \geq 0\).

From the Intermediate Value Theorem, \(f\) has a zero \(p\), where \(f(p) = p - g(p) = 0\).

In other words, the graph of \(y=g(x)\) goes from being above the line \(y=x\) at \(x=a\) to below it at \(x=b\), so at some point \(x=p\), the curves meet: \(y = x = p\) and \(y = g(p)\), so \(p = g(p)\).
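This Intermediate Value Theorem argument also suggests one slow-but-safe way to locate a fixed point: apply bisection to \(f(x) = x - g(x)\). Here is a minimal sketch; the helper name `bisect_for_fixed_point` is ours, for illustration only.

```
def bisect_for_fixed_point(g, a, b, tolerance=1e-10):
    """Locate a fixed point of a continuous mapping g of [a, b] into itself
    by bisection on f(x) = x - g(x), using that f(a) <= 0 <= f(b)."""
    def f(x): return x - g(x)
    while b - a > tolerance:
        c = (a + b)/2
        if f(c) >= 0:  # the sign change of f is in [a, c]
            b = c
        else:          # the sign change of f is in [c, b]
            a = c
    return (a + b)/2

print(bisect_for_fixed_point(cos, -1, 1))  # the fixed point of cos, about 0.739
```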

Let us illustrate this with the mapping \(g_4(x) := 4 \cos x\), for which the fact that \(|g_4(x)| \leq 4\) ensures that this is a map of the domain \(D = [-4, 4]\) into itself:

```
def g_4(x): return 4*cos(x)
a = -4
b = 4
x = np.linspace(a, b)
figure(figsize=(8,8))
title("Fixed points of the map $g_4(x) = 4 \cos(x)$")
plot(x, g_4(x), label="$y=g_4(x)$")
plot(x, x, label="$y=x$")
legend()
grid(True);
```
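The three crossings of \(y = g_4(x)\) with \(y = x\) visible in this graph can be double-checked by counting sign changes of \(x - g_4(x)\) on a fine grid; this is quick numerical evidence, not a proof.

```
x_fine = np.linspace(a, b, 10001)
signs = np.sign(x_fine - g_4(x_fine))
crossings = np.nonzero(np.diff(signs))[0]  # indices where the sign changes
print(f"{len(crossings)} sign changes, near x =", x_fine[crossings])
```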

The above example has multiple fixed points (three of them).
To ensure the existence of a **unique** solution, and convergence of the iteration to that solution, we use an extra condition.

**Definition** (Contraction Mapping)

A mapping \(g: D \to D\) is called a **contraction** or **contraction mapping** if there is a constant \(C < 1\) such that

\[| g(x) - g(y) | \leq C | x - y |\]

for any \(x\) and \(y\) in \(D\).
We then call \(C\) a *contraction constant*.

It is not enough to have \(| g(x) - g(y) | < | x - y |\), which is all that \(C = 1\) would give! We need the ratio \(\displaystyle \frac{|g(x) - g(y)|}{|x - y|}\) to be *uniformly* less than one for all possible values of \(x\) and \(y\).

The same applies for a domain in \(\mathbb{C}\), \(\mathbb{R}^n\) or \(\mathbb{C}^n\): for vectors, just replace the absolute value \(| \dots |\) by the vector norm \(\| \dots \|\). In fact, we will sometimes blur the distinction by using the “single line” absolute value notation for vector norms too.
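As a preview of the example coming up, one can gather numerical evidence for (but not a proof of) the contraction property of \(g(x) = \cos x\) on \([-1, 1]\) by sampling this ratio at many random pairs:

```
rng = np.random.default_rng(seed=12345)
x_pts = rng.uniform(-1, 1, 100000)
y_pts = rng.uniform(-1, 1, 100000)
distinct = x_pts != y_pts  # avoid dividing by zero at coincident pairs
ratios = (np.abs(cos(x_pts[distinct]) - cos(y_pts[distinct]))
          / np.abs(x_pts[distinct] - y_pts[distinct]))
print(f"largest sampled ratio: {ratios.max():.6f}")  # stays below sin(1) = 0.841...
```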

**Theorem** (A Contraction Mapping Theorem)

Any contraction mapping on a closed, bounded interval \(D = [a, b]\) has exactly one fixed point \(p\) in \(D\).
Further, this can be calculated as the limit \(\displaystyle p = \lim_{k \to \infty} x_k\) of the iteration sequence given by \(x_{k+1} = g(x_{k})\) for *any* choice of the starting point \(x_{0} \in D\).

Proof. First, uniqueness: between any two distinct fixed points, call them \(p_0\) and \(p_1\), the graph of \(g(x)\) has to rise with secant slope 1: \((g(p_1) - g(p_0))/(p_1 - p_0) = (p_1 - p_0)/(p_1 - p_0) = 1\), and this violates the contraction property.

So instead, the graph of a contraction map looks like the ones graphed below for our favorite example, \(g(x) = \cos x\) (which we will soon verify to be a contraction on the interval \([-1, 1]\)).

The second claim, about **convergence** to the fixed point from any initial approximation \(x_0\),
will be verified below, once we have seen some ideas about measuring errors.

## 1.2.5. An easy way of checking whether a differentiable function is a contraction

With differentiable functions, the contraction condition can often be easily verified using derivatives:

**Theorem** (A derivative-based fixed point theorem)

If a function \(g:[a,b] \to [a,b]\) is differentiable and there is a constant \(C < 1\) such that \(|g'(x)| \leq C\) for all \(x \in [a, b]\), then \(g\) is a contraction mapping with this contraction constant, and so has a unique fixed point in this interval.

Proof. Using the Mean Value Theorem, \(g(x) - g(y) = g'(c)(x - y)\) for some \(c\) between \(x\) and \(y\). Then taking absolute values,

\[| g(x) - g(y) | = |g'(c)| \, | x - y | \leq C | x - y |.\]

**Example** (\(g(x) = \cos(x)\) is a contraction on interval \([-1,1]\))

Our favorite example \(g(x) = \cos(x)\) is a contraction, but we have to be a bit careful about the domain.

For all real \(x\), \(g'(x) = -\sin x\), so \(|g'(x)| \leq 1\); this is almost but not quite enough.

However, we have seen that iteration values will settle in the interval \(D = [-1,1]\), and considering \(g\) as a mapping of this domain, \(|g'(x)| \leq \sin(1) = 0.841\dots < 1\): that is, now we have a contraction, with \(C = \sin(1) \approx 0.841\).

And as seen in the graph above, there is indeed a unique fixed point.
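A quick numerical corroboration of that derivative bound, by evaluating \(|g'(x)| = |\sin x|\) on a grid (evidence, not a proof):

```
from numpy import sin

x_grid = np.linspace(-1, 1, 10001)
print(f"max of |g'(x)| on the grid: {np.max(np.abs(-sin(x_grid))):.6f}")
print(f"sin(1):                     {sin(1):.6f}")
```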

## 1.2.6. The contraction constant \(C\) as a measure of how fast the approximations improve (the smaller the better)

It can be shown that if \(C\) is small (at least when one looks only at a reduced domain \(|x - p| < R\)) then the convergence is “fast” once \(|x_k - p| < R\).

To see this, we define some jargon for talking about errors. (For more details on error concepts, see the section on error measures and convergence rates.)

**Definition** (Error)

The **error** in \(\tilde x\) as an approximation to an exact value \(x\) is

\[\tilde x - x.\]

This will often be abbreviated as \(E\).

**Definition** (Absolute Error)

The **absolute error** in \(\tilde x\) as an approximation to an exact value \(x\) is the magnitude of the error:
the absolute value \(|E| = |\tilde x - x|\).

This will later be extended to \(x\) and \(\tilde x\) being vectors, by again using the vector norm in place of the absolute value.
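For example, with the classic approximation \(\tilde x = 22/7\) of \(x = \pi\):

```
x_exact = np.pi
x_tilde = 22/7
E = x_tilde - x_exact
print(f"error E = {E:.2e}, absolute error |E| = {abs(E):.2e}")
```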

In the case of \(x_k\) as an approximation of \(p\), we name the error \(E_k := x_k - p\). Then \(C\) measures a worst case for how fast the error decreases as \(k\) increases, and this is “exponentially fast”:

\(|E_{k+1}| \leq C |E_{k}|\), or \(|E_{k+1}|/|E_{k}| \leq C\), and so

\[|E_{k}| \leq C^{k} |E_{0}|.\]

That is, the error decreases at worst in a geometric sequence, which is exponential decrease with respect to the variable \(k\).

Proof. \(E_{k+1} = x_{k+1} - p = g(x_{k}) - g(p)\), using \(g(p) = p\). Thus the contraction property gives

\[|E_{k+1}| = |g(x_{k}) - g(p)| \leq C |x_{k} - p| = C |E_{k}|.\]

Applying this again,

\[|E_{k}| \leq C |E_{k-1}| \leq C \left( C |E_{k-2}| \right) = C^2 |E_{k-2}|,\]

and repeating \(k-2\) more times,

\[|E_{k}| \leq C^{k} |E_{0}|.\]
We will often use this “recursive” strategy of relating the error in one iterate to that in the previous iterate.
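We can observe this geometric decrease for the iteration with \(g(x) = \cos x\): computing a highly converged value for \(p\) first and then tracking the ratios \(|E_{k+1}|/|E_k|\), they stay below \(C = \sin(1) \approx 0.841\). (This is a numerical illustration of the bound, not part of the proof.)

```
# Get an accurate value of the fixed point p by iterating many times:
p = 1.0
for k in range(1000):
    p = cos(p)

# Track the error ratios starting from x_0 = 1:
x_k = 1.0
E_k = x_k - p
for k in range(8):
    x_k = cos(x_k)
    E_k_plus_1 = x_k - p
    print(f"|E_{k+1}| / |E_{k}| = {abs(E_k_plus_1)/abs(E_k):.4f}")
    E_k = E_k_plus_1
```

The ratios settle toward \(|g'(p)| = \sin p \approx 0.674\), comfortably below the worst-case constant \(C\).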

**Example** (Solving \(x = \cos x\) with a naive fixed point iteration)

We have seen that *one* way to convert the example \(f(x) = x - \cos x = 0\) to a fixed point iteration is \(g(x) = \cos x\), and that this is a contraction on \(D = [-1, 1]\).

Here is what this iteration looks like:

```
a = 0
b = 1
x = np.linspace(a, b)
iterations = 10

# Start at left
print(f"Solving x = cos(x) starting to the left, at x_0 = {a}")
x_k = a
figure(figsize=(8,8))
title(rf"Solving $x = \cos x$ starting to the left, at $x_0$ = {a}")
plot(x, x, "g")
plot(x, g_1(x), "r")
grid(True)
for k in range(iterations):
    x_k_plus_1 = g_1(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")

# Start at right
print(f"Solving x = cos(x) starting to the right, at x_0 = {b}")
x_k = b
figure(figsize=(8,8))
title(rf"Solving $x = \cos(x)$ starting to the right, at $x_0$ = {b}")
plot(x, x, "g")
plot(x, g_1(x), "r")
grid(True)
for k in range(iterations):
    x_k_plus_1 = g_1(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")
```

```
Solving x = cos(x) starting to the left, at x_0 = 0
x_1 = 1.0
x_2 = 0.5403023058681398
x_3 = 0.8575532158463933
x_4 = 0.6542897904977792
x_5 = 0.7934803587425655
x_6 = 0.7013687736227566
x_7 = 0.7639596829006542
x_8 = 0.7221024250267077
x_9 = 0.7504177617637605
x_10 = 0.7314040424225098
Solving x = cos(x) starting to the right, at x_0 = 1
x_1 = 0.5403023058681398
x_2 = 0.8575532158463933
x_3 = 0.6542897904977792
x_4 = 0.7934803587425655
x_5 = 0.7013687736227566
x_6 = 0.7639596829006542
x_7 = 0.7221024250267077
x_8 = 0.7504177617637605
x_9 = 0.7314040424225098
x_10 = 0.7442373549005569
```

In each case, one gets a “box spiral” in toward the fixed point.
It always looks like this when \(g\) is *decreasing* near the fixed point.

If instead \(g\) is *increasing* near the fixed point, the iterates approach monotonically, either from above or below:

**Example** (Solving \(f(x) = x^2 - 5x + 4 = 0\) in interval \([0, 3]\))

The roots are 1 and 4; for now we aim at the first of these, so we choose a domain \([0, 3]\) that contains just this root.

Let us get a fixed point form by “partially solving for \(x\)”: solving for the \(x\) in the \(5 x\) term gives

\[x = \frac{x^2 + 4}{5} =: g_2(x).\]

```
def f_2(x): return x**2 - 5*x + 4   # root-finding form
def g_2(x): return (x**2 + 4)/5     # the equivalent fixed-point form
```

```
a = 0
b = 3
x = np.linspace(a, b)
figure(figsize=(12,5))
title("$y = f_2(x) = x^2-5x+4$ and $y = 0$")
plot(x, f_2(x))
plot([a, b], [0, 0])
grid(True)
figure(figsize=(12,5))
title("$y = g_2(x) = (x^2 + 4)/5$ and $y=x$")
plot(x, g_2(x))
plot(x, x)
grid(True)
```

```
iterations = 10
# Start at left
a = 0.0
b = 2.0
x = np.linspace(a, b)
```

```
x_k = a
figure(figsize=(8,8))
title(f"Starting to the left, at x_0 = {a}")
grid(True)
plot(x, x, "g")
plot(x, g_2(x), "r")
for k in range(iterations):
    x_k_plus_1 = g_2(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")
```

```
x_1 = 0.8
x_2 = 0.9280000000000002
x_3 = 0.9722368000000001
x_4 = 0.9890488790548482
x_5 = 0.9956435370319303
x_6 = 0.9982612105666906
x_7 = 0.9993050889044148
x_8 = 0.999722132142052
x_9 = 0.9998888682989302
x_10 = 0.999955549789623
```

```
# Start at right
a = 0.0
b = 2.0
x = np.linspace(a, b)
x_k = b
figure(figsize=(8,8))
title(f"Starting to the right, at x_0 = {b}")
grid(True)
plot(x, x, "g")
plot(x, g_2(x), "r")
for k in range(iterations):
    x_k_plus_1 = g_2(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")
```

```
x_1 = 1.6
x_2 = 1.312
x_3 = 1.1442688
x_4 = 1.061870217330688
x_5 = 1.0255136716907844
x_6 = 1.010335658164943
x_7 = 1.0041556284319177
x_8 = 1.0016657052223
x_9 = 1.0006668370036975
x_10 = 1.000266823735797
```
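A check on the speed of this convergence: \(g_2'(x) = 2x/5\), so near the fixed point \(p = 1\) the errors should eventually shrink by a factor of about \(g_2'(1) = 0.4\) per iteration, and indeed the ratios of successive errors in the iterates above approach 0.4. A quick verification, reusing `g_2`:

```
x_k = 2.0
E_k = x_k - 1.0  # the exact fixed point is p = 1
for k in range(10):
    x_k = g_2(x_k)
    E_k_plus_1 = x_k - 1.0
    print(f"|E_{k+1}| / |E_{k}| = {abs(E_k_plus_1)/abs(E_k):.4f}")
    E_k = E_k_plus_1
```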

## 1.2.7. Exercises

### Exercise 1

The equation \(x^3 - 2x + 1 = 0\) can be written as a fixed point equation in many ways, including

(i) \(\displaystyle x = \frac{x^3 + 1}{2}\)

and

(ii) \(x = \sqrt[3]{2x - 1}\)

For each of these options:

(a) Verify that its fixed points do in fact solve the above cubic equation.

(b) Determine whether fixed point iteration with it will converge to the solution \(r = 1\) (assuming a “good enough” initial approximation).

**Note:** computational experiments can be a useful start, but prove your answers mathematically!
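As a starting point for such experiments, here is a minimal sketch that iterates both forms from an initial guess near \(r = 1\); the names `g_a` and `g_b` are ours, and the cube root is written for the case \(2x - 1 > 0\):

```
def g_a(x): return (x**3 + 1)/2
def g_b(x): return (2*x - 1)**(1/3)  # real cube root, valid for 2x - 1 > 0

for g in (g_a, g_b):
    x_k = 1.05
    print(f"\nIterating {g.__name__} from x_0 = {x_k}")
    for k in range(10):
        x_k = g(x_k)
        print(f"x_{k+1} = {x_k}")
```

Whatever these experiments suggest, the conclusion for each form should be justified with the derivative-based fixed point theorem above.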