(section:fixed-point-iteration)=

Last revised on August 14, 2024

**References:**

- Section 1.2 *Fixed-Point Iteration* of {cite}`Sauer`
- Section 2.2 *Fixed-Point Iteration* of {cite}`Burden-Faires`
- Section ??? of {cite}`Chenney-Kincaid` (coming soon)

In the next section we will meet Newton's method for root-finding,
which you might have seen in a calculus course.
This is one very important example of the more general strategy of **fixed-point iteration**, so we start with that.

```
# We will often need resources from the modules numpy and pyplot:
import numpy as np
# We can also import items from a module individually, so they can be used by "first name only".
# In this book this is done mostly for mathematical functions with familiar names:
from numpy import cos
# and some very frequently used functions for graphing:
from matplotlib.pyplot import figure, plot, title, legend, grid
```

A variant of stating equations as *root-finding* ($f(x) = 0$) is *fixed-point* form:
given a function $g:\mathbb{R} \to \mathbb{R}$ or $g:\mathbb{C} \to \mathbb{C}$
(or even $g:\mathbb{R}^n \to \mathbb{R}^n$; a later topic),
find a *fixed point* of $g$.
That is, a value $p$ for its argument such that

$$g(p) = p.$$

Such problems are interchangeable with root-finding. One way to convert from $f(x) = 0$ to $g(x) = x$ is defining

$$g(x) := x - w(x) f(x)$$

for any "weight function" $w(x)$.

One can convert the other way too, for example defining $f(x) := g(x) - x$. We have already seen this when we converted the equation $x = \cos x$ to $f(x) = x - \cos x = 0$.
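As a quick sanity check of this correspondence (a sketch, not part of the text's code cells; the weight $w(x) = 1$ is chosen here just for simplicity), the conversion $g(x) := x - w(x) f(x)$ applied to $f(x) = x - \cos x$ recovers exactly $g(x) = \cos x$:

```python
from math import cos

# Convert f(x) = 0 to fixed point form with weight w(x) = 1:
def f(x): return x - cos(x)
def g(x): return x - 1.0 * f(x)   # g(x) = x - w(x) f(x), with w(x) = 1

# For any x, this g agrees with cos(x), since x - (x - cos(x)) = cos(x):
x = 0.5
print(g(x), cos(x))
```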

```
def f_1(x): return x - cos(x)
def g_1(x): return cos(x)
```

```
a = -1
b = 1
x = np.linspace(a, b)
figure(figsize=(12,5))
title(r"$y = f_1(x) = x - \cos(x)$ and $y=0$")
plot(x, f_1(x))
plot([a, b], [0, 0])
grid(True)
figure(figsize=(12,5))
title(r"$y = g_1(x) = \cos (x)$ and $y=x$")
plot(x, g_1(x))
plot(x, x)
grid(True)
```

Fixed point form can be convenient partly because we almost always have to solve by successive approximations,
or *iteration*, and fixed point form suggests *one* choice of iterative procedure:
start with any first approximation $x_0$, and iterate with

$$x_{k+1} = g(x_{k}), \quad k = 0, 1, 2, \dots$$

```{prf:proposition}
:label: proposition-1-fpi-iterates-converge-to-fp
If $g$ is continuous, and **if** the above sequence $\{x_0, x_1, \dots \}$ converges to a limit $p$, then that limit is a fixed point of function $g$: $g(p) = p$.
```

```{prf:proof}
From $\displaystyle \lim_{k \to \infty} x_k = p$, continuity gives
$$\lim_{k \to \infty} g(x_k) = g(p).$$
On the other hand, $g(x_k) = x_{k+1}$, so
$$\lim_{k \to \infty} g(x_k) = \lim_{k \to \infty} x_{k+1} = p.$$
Comparing gives $g(p) = p$.
```
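As a quick numerical illustration of this proposition (a sketch, not one of the text's examples), we can iterate $g(x) = \cos x$ many times and check that the limit it settles at is indeed a fixed point:

```python
from math import cos

# Iterate x_{k+1} = g(x_k) for g(x) = cos(x), starting from x_0 = 1:
x = 1.0
for k in range(100):
    x = cos(x)

# If the sequence has (numerically) converged to a limit p, the
# proposition says that limit is a fixed point: g(p) - p should be ~ 0.
print(x)
print(cos(x) - x)
```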

That second "if" is a big one.
Fortunately, it can often be resolved using the idea of a *contraction mapping*.

First, we need the idea of a **mapping**.

```{prf:definition} Mapping
:label: definition-mapping

A function $g(x)$ defined on a closed interval $D = [a, b]$ which sends values back into that interval,
$g: D \to D$, is sometimes called a **map** or **mapping**.
```

```{prf:proposition}
:label: proposition-2-ivp-fpi-version
Any continuous mapping on a closed interval $[a, b]$ has at least one fixed point.
```

```{prf:proof}
Consider the "root-finding cousin", $f(x) = x - g(x)$.
First, $f(a) = a - g(a) \leq 0$, since $g(a) \geq a$ so as to be in the domain $[a,b]$ — similarly, $f(b) = b - g(b) \geq 0$.
From the Intermediate Value Theorem, $f$ has a zero $p$, where $f(p) = p - g(p) = 0$.
In other words, the graph of $y=g(x)$ goes from being above the line $y=x$ at $x=a$ to below it at $x=b$,
so at some point $x=p$, the curves meet: $y = x = p$ and $y = g(p)$, so $p = g(p)$.
```

```{prf:example}
:label: example-1-x-4cosx
Let us illustrate this with the mapping $g_4(x) := 4 \cos x$;
the bound $|g_4(x)| \leq 4$ ensures that it maps the domain $D = [-4, 4]$ into itself:
```

```
def g_4(x): return 4*cos(x)
a = -4
b = 4
x = np.linspace(a, b)
figure(figsize=(8,8))
title(r"Fixed points of the map $g_4(x) = 4 \cos(x)$")
plot(x, g_4(x), label="$y=g_4(x)$")
plot(x, x, label="$y=x$")
legend()
grid(True);
```
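To connect this picture with the proof above (a numerical sketch, not part of the text's examples), we can scan the sign of the "root-finding cousin" $f(x) = x - g_4(x)$ across $[-4, 4]$; by the Intermediate Value Theorem, each sign change brackets a fixed point of $g_4$:

```python
import numpy as np

# Sign scan of f(x) = x - g_4(x) on [-4, 4]: each sign change between
# consecutive sample points brackets a fixed point of g_4(x) = 4 cos(x).
def f(x): return x - 4*np.cos(x)

xs = np.linspace(-4, 4, 801)
brackets = [(l, r) for l, r in zip(xs[:-1], xs[1:]) if f(l) * f(r) < 0]
for (l, r) in brackets:
    print(f"a fixed point of g_4 lies in [{l:.2f}, {r:.2f}]")
```

This locates three fixed points, consistent with the three crossings of $y = g_4(x)$ and $y = x$ visible in the plot.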

To ensure existence of a **unique** solution, and convergence of the iteration to that solution, we use an extra condition.

```{prf:definition} Contraction Mapping
:label: definition-contraction-mapping

A mapping $g:D \to D$ is called a **contraction** or **contraction mapping** if there is a constant $C < 1$ such that

$$|g(x) - g(y)| \leq C |x - y|$$

for any $x$ and $y$ in $D$.
We then call $C$ a *contraction constant*.
```

```{prf:remark}
- It is not enough to have $| g(x) - g(y) | < | x - y |$, which would allow $C = 1$:
we need the ratio
$\displaystyle \frac{|g(x) - g(y)|}{|x - y|}$
to be *uniformly* less than one for all possible values of $x$ and $y$.
- The same applies for a domain in $\mathbb{C}$, $\mathbb{R}^n$ or $\mathbb{C}^n$:
for vectors, just replace the absolute value $| \dots |$ by the vector norm $\| \dots \|$.
In fact, we will sometimes blur the distinction by using the "single line" absolute value notation for vector norms too.
```

```{prf:theorem} A Contraction Mapping Theorem
:label: a-contraction-mapping-theorem

Any contraction mapping on a closed, bounded interval $D = [a, b]$ has exactly one fixed point $p$ in $D$.
Further, this can be calculated as the limit $\displaystyle p = \lim_{k \to \infty} x_k$ of the iteration sequence given by $x_{k+1} = g(x_{k})$ for *any* choice of the starting point $x_{0} \in D$.
```

```{prf:proof}
First, uniqueness:
between any two distinct fixed points $p_0$ and $p_1$, the graph of $g$ would have to rise with secant slope 1: $(g(p_1) - g(p_0))/(p_1 - p_0) = (p_1 - p_0)/(p_1 - p_0) = 1$, and this violates the contraction property.
For a contraction, the graph instead looks like the one seen earlier for our favorite example,
$g(x) = \cos x$ (which we will soon verify to be a contraction on the interval $[-1, 1]$).
The second claim, about **convergence** to the fixed point from any initial approximation $x_0$,
will be verified below, once we have seen some ideas about measuring errors.
```

With differentiable functions, the contraction condition can often be easily verified using derivatives:

```{prf:theorem} A derivative-based fixed point theorem
:label: a-derivative-based-fixed-point-theorem

If a function $g:[a,b] \to [a,b]$ is differentiable and there is a constant $C < 1$ such that $|g'(x)| \leq C$ for all $x \in [a, b]$, then $g$ is a contraction mapping with this contraction constant, and so has a unique fixed point in this interval.
```

```{prf:proof}
Using the Mean Value Theorem, $g(x) - g(y) = g'(c)(x - y)$ for some $c$ between $x$ and $y$.
Then taking absolute values,
$$|g(x) - g(y)| = |g'(c)| \cdot |x - y| \leq C |x - y|.$$
```

```{prf:example} $g(x) = \cos(x)$ is a contraction on the interval $[-1,1]$
:label: example-2-x-cosx

Our favorite example $g(x) = \cos(x)$ is a contraction, but we have to be a bit careful about the domain.

For all real $x$, $g'(x) = -\sin x$, so $|g'(x)| \leq 1$; this is almost but not quite enough.

However, we have seen that iteration values will settle in the interval $D = [-1,1]$, and considering $g$ as a mapping of this domain, $|g'(x)| \leq \sin(1) = 0.841\dots < 1$: that is, now we have a contraction, with $C = \sin(1) \approx 0.841$.

And as seen in the graph above, there is indeed a unique fixed point.
```
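As a spot check of this contraction constant (a numerical sketch, not a proof), one can sample the ratios $|g(x) - g(y)|/|x - y|$ on $[-1, 1]$ and confirm that they all stay below $C = \sin(1)$:

```python
import numpy as np

# Sample |g(x) - g(y)| / |x - y| for g(x) = cos(x) on [-1, 1];
# the Mean Value Theorem argument says every ratio is at most sin(1) < 1.
xs = np.linspace(-1, 1, 201)
ratios = [abs(np.cos(x) - np.cos(y)) / abs(x - y)
          for x in xs for y in xs if x != y]
C_observed = max(ratios)
print(C_observed, np.sin(1.0))
```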

It can be shown that if $C$ is small (at least when one looks only at a reduced domain $|x - p| < R$) then the convergence is "fast" once $|x_k - p| < R$.

To see this, we define some jargon for talking about errors. (For more details on error concepts, see the section on error measures and convergence rates.)

```{prf:definition} Error :label: definition-error

The **error** in $\tilde x$ as an approximation to an exact value $x$ is

This will often be abbreviated as $E$. ```

```{prf:definition} Absolute Error :label: definition-absolute-error

The **absolute error** in $\tilde x$ an approximation to an exact value $x$ is the magnitude of the error:
the absolute value $|E| = |\tilde x - x|$.

```
```{prf:remark}
This will later be extended to $x$ and $\tilde x$ being vectors,
by again using the vector norm in place of the absolute value.
```

```{prf:proposition}
:label: proposition-3
Let $E_k := x_k - p$ be the error in the $k$-th iterate. Then $|E_{k+1}| \leq C |E_{k}|$, or $|E_{k+1}|/|E_{k}|\leq C$,
and so
$$|E_k| \leq C^k |x_0 - p|$$
That is, the error decreases at worst in a geometric sequence,
which is exponential decrease with respect to the variable $k$.
```

```{prf:proof}
$E_{k+1} = x_{k+1} - p = g(x_{k}) - g(p)$, using $g(p) = p$.
Thus the contraction property gives
$$|E_{k+1}| = |g(x_k) - g(p)| \leq C |x_k - p| = C |E_k|$$
Applying this again,
$$|E_k| \leq C |E_{k-1}| \leq C \cdot C |E_{k-2}| = C^2 |E_{k-2}|$$
and repeating $k-2$ more times,
$$|E_k|\leq C^k |E_0| = C^k |x_0 - p|.$$
```

```{prf:remark}
We will often use this "recursive" strategy of relating the error in one iterate to that in the previous iterate.
```
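This geometric bound can be checked numerically for our running example (a sketch: $p$ is first computed to near machine precision by iterating many times, and $C = \sin(1)$ is the contraction constant found above):

```python
from math import cos, sin

C = sin(1.0)  # contraction constant for g(x) = cos(x) on [-1, 1]

# First compute the fixed point p very accurately, by iterating a lot:
p = 1.0
for _ in range(200):
    p = cos(p)

# Now check |E_k| <= C^k |E_0| for the first ten iterates from x_0 = 1:
x = 1.0
E_0 = abs(x - p)
for k in range(1, 11):
    x = cos(x)
    print(f"k={k}: |E_k| = {abs(x - p):.3e}, bound C^k |E_0| = {C**k * E_0:.3e}")
```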

```{prf:example} Solving $x = \cos x$ with a naive fixed point iteration
:label: example-3-x-cosx-fpi

We have seen that *one* way to convert the example
$f(x) = x - \cos x = 0$ to a fixed point iteration is $g(x) = \cos x$,
and that this is a contraction on $D = [-1, 1]$.

Here is what this iteration looks like:
```

```
a = 0
b = 1
x = np.linspace(a, b)
iterations = 10

# Start at left
print(f"Solving x = cos(x) starting to the left, at x_0 = {a}")
x_k = a
figure(figsize=(8,8))
title(rf"Solving $x = \cos x$ starting to the left, at $x_0$ = {a}")
plot(x, x, "g")
plot(x, g_1(x), "r")
grid(True)
for k in range(iterations):
    x_k_plus_1 = g_1(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")

# Start at right
print(f"Solving x = cos(x) starting to the right, at x_0 = {b}")
x_k = b
figure(figsize=(8,8))
title(rf"Solving $x = \cos(x)$ starting to the right, at $x_0$ = {b}")
plot(x, x, "g")
plot(x, g_1(x), "r")
grid(True)
for k in range(iterations):
    x_k_plus_1 = g_1(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")
```

In each case, one gets a "box spiral" in to the fixed point.
It always looks like this when $g$ is *decreasing* near the fixed point.

If instead $g$ is *increasing* near the fixed point, the iterates approach monotonically, either from above or below:

```{prf:example} Solving $f(x) = x^2 - 5x + 4 = 0$ in the interval $[0, 3]$
:label: example-4

The roots are 1 and 4; for now we aim at the first of these, so we choose a domain $[0, 3]$ that contains just this root.

Let us get a fixed point form by "partially solving for $x$": solving for the $x$ in the $5 x$ term:

$$ x = g(x) = (x^2 + 4)/5 $$
```

```
def f_2(x): return x**2 - 5*x + 4
def g_2(x): return (x**2 + 4)/5
```
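Before iterating, the derivative-based theorem gives a quick sanity check (a numerical sketch, not a proof): here $g_2'(x) = 2x/5$, so on the subinterval $[0, 2]$ used for the iterations below we have $|g_2'(x)| \leq 4/5 < 1$, making $g_2$ a contraction there:

```python
import numpy as np

# g_2(x) = (x^2 + 4)/5 has derivative g_2'(x) = 2x/5.
# Sampling on [0, 2] confirms |g_2'(x)| <= 4/5 < 1 on that subinterval:
xs = np.linspace(0, 2, 201)
max_slope = max(abs(2*x/5) for x in xs)
print(max_slope)
```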

```
a = 0
b = 3
x = np.linspace(a, b)
figure(figsize=(12,5))
title("$y = f_2(x) = x^2-5x+4$ and $y = 0$")
plot(x, f_2(x))
plot([a, b], [0, 0])
grid(True)
figure(figsize=(12,5))
title("$y = g_2(x) = (x^2 + 4)/5$ and $y=x$")
plot(x, g_2(x))
plot(x, x)
grid(True)
```

```
iterations = 10
# Start at left
a = 0.0
b = 2.0
x = np.linspace(a, b)
```

```
x_k = a
figure(figsize=(8,8))
title(f"Starting to the left, at x_0 = {a}")
grid(True)
plot(x, x, "g")
plot(x, g_2(x), "r")
for k in range(iterations):
    x_k_plus_1 = g_2(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")
```

```
# Start at right
a = 0.0
b = 2.0
x = np.linspace(a, b)
x_k = b
figure(figsize=(8,8))
title(f"Starting to the right, at x_0 = {b}")
grid(True)
plot(x, x, "g")
plot(x, g_2(x), "r")
for k in range(iterations):
    x_k_plus_1 = g_2(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, x_k_plus_1], "b")
    # Connect to the new x_k on the line y = x:
    plot([x_k, x_k_plus_1], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_{k+1} is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k_plus_1}")
```