**References:**

- Sections 1.2 *Fixed-Point Iteration* and 1.4 *Newton's Method* of {cite}`Sauer`

- Sections 2.2 *Fixed-Point Iteration* and 2.3 *Newton's Method and Its Extensions* of {cite}`Burden-Faires`

Newton's method for solving equations has a number of advantages over the bisection method:

- It is usually faster (but not always, and it can even fail completely!)
- It can also compute complex roots, such as the non-real roots of polynomial equations.
- It can even be adapted to solving systems of non-linear equations; that topic will be visited later.

In [1]:

```
# We will often need resources from the modules numpy and pyplot:
import numpy as np
from numpy import sin, cos
from matplotlib.pyplot import figure, plot, title, legend, grid
```

You might have previously seen Newton's method derived using tangent line approximations. That derivation is presented below, but first we approach it another way: as a particularly nice contraction mapping.

To compute a root $r$ of a differentiable function $f$, we design a contraction mapping for which the contraction constant $C$ becomes arbitrarily small when we restrict to iterations in a sufficiently small interval around the root: $|x - r| \leq R$.

That is, the error ratio $|E_{k+1}|/|E_k|$ becomes ever smaller as the iterations get closer to the exact solution; the error thus shrinks faster than any geometric rate $C^k$ of the kind seen in the section on fixed point iteration.

This effect is in turn achieved by making $|g'(x)|$ arbitrarily small for $|x - r| \leq R$ with $R$ small enough, and then using the connection between $g'(x)$ and $C$ given in {prf:ref}`a-derivative-based-fixed-point-theorem`.

This can be achieved by ensuring that $g'(r) = 0$ at a root $r$ of $f$, so long as the root $r$ is *simple*: $f'(r) \neq 0$ (which is generically true, but not always).
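When the root is not simple, this speedup is lost. As a quick cautionary sketch (my own illustration, not an example from the references): for $f(x) = x^2$ the root $r = 0$ has $f'(r) = 0$, and the Newton step simplifies to $x/2$, so each iteration merely halves the error.

```
x = 1.0
for k in range(5):
    x = x - x**2 / (2*x)  # the Newton step for f(x) = x**2 simplifies to x/2
    print(x)
```

The printed values are 0.5, 0.25, 0.125, 0.0625, 0.03125: the error ratio stays fixed at $1/2$ instead of shrinking, i.e., only linear (geometric) convergence.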

To get $g'(r) = 0$, seek $g$ in the above form $g(x) = x - w(x)f(x)$, and choose $w(x)$ appropriately. At the root $r$,

$$g'(r) = 1 - w'(r)f(r) - w(r)f'(r) = 1 - w(r)f'(r) \quad\text{(using } f(r) = 0\text{)},$$

so we ensure $g'(r) = 0$ by requiring $w(r) = 1/f'(r)$ (hence the problem if $f'(r) = 0$).

We do not know $r$, but that does not matter! We can just choose $w(x) = 1/f'(x)$ for all $x$ values. That gives

$$g(x) = x - {f(x)}/{f'(x)}$$

and thus the iteration formula

$$x_{k+1} = x_k - {f(x_k)}/{f'(x_k)}$$

You might recognize this as the formula for Newton's method.
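To see the "ever smaller error ratio" claim in action, here is a minimal sketch for $f(x) = x - \cos x$ (the equation used as a running example below; the starting value $0.1$ and the reference-root computation are my own choices):

```
from math import cos, sin

def newton_step(x):
    # One Newton iteration for f(x) = x - cos(x), with f'(x) = 1 + sin(x)
    return x - (x - cos(x)) / (1 + sin(x))

# First get a highly accurate reference value for the root
r = 0.5
for _ in range(50):
    r = newton_step(r)

# Now watch the error ratios from a rougher starting value
x = 0.1
ratios = []
error_old = abs(x - r)
for k in range(4):
    x = newton_step(x)
    error_new = abs(x - r)
    ratios.append(error_new/error_old)
    print(f"|E_{k+1}|/|E_{k}| = {error_new/error_old:0.3}")
    error_old = error_new
```

The ratios shrink rapidly toward zero rather than leveling off at some constant $C$, which is exactly the behavior promised by the contraction-mapping design above.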

To explore some examples of this, here is a Python function implementing this method.

In [2]:

```
def newtonMethod(f, Df, x0, errorTolerance, maxIterations=20, demoMode=False):
    """Basic usage is:
    (rootApproximation, errorEstimate, iterations) = newtonMethod(f, Df, x0, errorTolerance)
    There is an optional input parameter "demoMode" which controls whether to
    - print intermediate results (for "study" purposes), or to
    - work silently (for "production" use).
    The default is silence.
    """
    if demoMode: print("Solving by Newton's Method.")
    x = x0
    for k in range(maxIterations):
        fx = f(x)
        Dfx = Df(x)
        # Note: a careful, robust code would check for the possibility of division by zero here,
        # but for now I just want a simple presentation of the basic mathematical idea.
        dx = fx/Dfx
        x -= dx  # Aside: this is shorthand for "x = x - dx"
        errorEstimate = abs(dx)
        if demoMode:
            print(f"At iteration {k+1} x = {x} with estimated error {errorEstimate:0.3}, backward error {abs(f(x)):0.3}")
        if errorEstimate <= errorTolerance:
            iterations = k + 1
            return (x, errorEstimate, iterations)
    # If we get here, it did not achieve the accuracy target:
    iterations = maxIterations
    return (x, errorEstimate, iterations)
```

````{prf:remark} A Python module
:label: module-numerical-methods

From now on, all functions like this that implement numerical methods are also collected in the module file `numericalMethods.py`. Thus, you could omit the above `def` and instead import `newtonMethod` with

```
from numericalMethods import newtonMethod
```
````

```{prf:example}
:label: example-newton-x-cosx

Let's start with our favorite equation, $x = \cos x$.
```

```{prf:remark} On Python style
:label: remark-python-style

Since function names in Python (and most programming languages) must be alpha-numeric (with the underscore `_` as a "special guest letter"), I will avoid primes in notation for derivatives as much as possible: from now on, the derivative of $f$ is most often denoted as $Df$ rather than $f'$.
```

In [3]:

```
def f_1(x): return x - cos(x)
def Df_1(x): return 1. + sin(x)
```

In [4]:

```
(root, errorEstimate, iterations) = newtonMethod(f_1, Df_1, x0=0., errorTolerance=1e-8, demoMode=True)
print()
print(f"The root is approximately {root}")
print(f"The estimated absolute error is {errorEstimate:0.3}")
print(f"The backward error is {abs(f_1(root)):0.3}")
print(f"This required {iterations} iterations")
```

```{prf:definition} Backward Error
:label: backward-error

The **backward error** in $\tilde x$ as an approximation to a root of a function $f$ is $f(\tilde x)$. The **absolute backward error** is its absolute value, $|f(\tilde x)|$. However, sometimes the latter is simply called the backward error, as the above code does.
```

This has the advantage that we can actually compute it without knowing the exact solution!

The backward error also has a useful geometrical meaning: if the function $f$ were changed by this much to a nearby function $\tilde f$, then $\tilde x$ could be an exact root of $\tilde f$. Hence, if we only know the values of $f$ to within this backward error (for example due to rounding error in evaluating the function), then $\tilde x$ could well be an exact root, so there is no point in striving for greater accuracy in the approximate root.
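As a minimal sketch of computing it directly (the value of `x_tilde` below is a hypothetical approximation of my own choosing, accurate to roughly eight decimal places):

```
from math import cos

x_tilde = 0.73908513  # a hypothetical approximate root of f(x) = x - cos(x)
backward_error = abs(x_tilde - cos(x_tilde))
print(f"backward error: {backward_error:0.3}")
```

The backward error here is on the order of $10^{-8}$, so if function values were only trusted to that level, there would be no point in refining `x_tilde` further.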

We will see this in the next example.

Since this is a fixed point iteration with $g(x) = x - (x - \cos(x))/(1 + \sin(x))$, let us compare its graph to the ones seen in the section on fixed point iteration. Now $g$ is neither increasing nor decreasing at the fixed point, so the graph has an unusual form.

In [5]:

```
def g(x):
    return x - (x - cos(x))/(1 + sin(x))

a = 0
b = 1
# An array of x values for graphing
x = np.linspace(a, b)
iterations = 4  # Not so many are needed now!
```

In [6]:

```
# Start at left
description = "Starting near the left end of the domain"
print(description)
x_k = 0.1
print(f"x_0 = {x_k}")
figure(figsize=(8,8))
title(description)
grid(True)
plot(x, x, "g")
plot(x, g(x), "r")
for k in range(iterations):
    g_x_k = g(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, g_x_k], "b")
    x_k_plus_1 = g_x_k
    # Connect to the new x_k on the line y = x:
    plot([x_k, g_x_k], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_k+1 is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k}")
```

In [7]:

```
# Start at right
description = "Starting near the right end of the domain"
print(description)
x_k = 0.9
print(f"x_0 = {x_k}")
figure(figsize=(8,8))
title(description)
grid(True)
plot(x, x, "g")
plot(x, g(x), "r")
for k in range(iterations):
    g_x_k = g(x_k)
    # Graph evaluation of g(x_k) from x_k:
    plot([x_k, x_k], [x_k, g_x_k], "b")
    x_k_plus_1 = g_x_k
    # Connect to the new x_k on the line y = x:
    plot([x_k, g_x_k], [x_k_plus_1, x_k_plus_1], "b")
    # Update names: the old x_k+1 is the new x_k
    x_k = x_k_plus_1
    print(f"x_{k+1} = {x_k}")
```

```{prf:example} Pushing to the limits of standard 64-bit computer arithmetic
:label: example-2

Next, demand more accuracy; this time silently. As we will see in a later section, $10^{-16}$ is about the limit of the precision of standard (IEEE 754) computer arithmetic with 64-bit numbers.

So let's try to compute the root as accurately as we can within these limits:
```

In [8]:

```
(root, errorEstimate, iterations) = newtonMethod(f_1, Df_1, x0=0, errorTolerance=1e-16)
print()
print(f"The root is approximately {root}")
print(f"The estimated absolute error is {errorEstimate}")
print(f"The backward error is {abs(f_1(root)):0.4}")
print(f"This required {iterations} iterations")
```

Observations:

- It only took one more iteration to meet the demand for twice as many decimal places of accuracy.
- The result is "exact" as far as the computer arithmetic can tell, as shown by the zero backward error: we have indeed reached the accuracy limits of computer arithmetic.

We will work almost entirely with real values and vectors in $\mathbb{R}^n$, but actually, everything above also works for complex numbers. In particular, Newton's method works for finding roots of functions $f:\mathbb{C} \to \mathbb{C}$; for example when seeking all roots of a polynomial.

```{prf:remark} Notation for complex numbers in Python
:label: python-complex-numbers

Python uses `j` for the square root of -1 (as is also sometimes done in engineering) rather than `i`.

In general, the complex number $a + b i$ is expressed as `a+bj` (note: `j` at the end, and no spaces). As you might expect, imaginary numbers can be written without the $a$, as `bj`.

However, the coefficient `b` is always needed, even when $b=1$: the square roots of -1 are `1j` and `-1j`, not `j` and `-j`; the latter pair still refer to a variable `j` and its negation.
```

In [9]:

```
z = 3+4j
print(z)
print(abs(z))
```

(3+4j)
5.0

In [10]:

```
print(1j)
```

1j

In [11]:

```
print(-1j)
```

(-0-1j)

but:

In [12]:

```
print(j)
```

This fails with a `NameError`: without a numeric coefficient, `j` is just an ordinary variable name, and it has not been given a value yet. Giving `j` a value does not interfere with the complex number notation:

In [13]:

```
j = 100
```

In [14]:

```
print(1j)
```

1j

In [15]:

```
print(j)
```

100

```{prf:example} All roots of a cubic
:label: example-3

As an example, let us seek all three cube roots of 8, by solving $x^3 - 8 = 0$ and trying different initial values $x_0$.
```

In [16]:

```
def f_2(x): return x**3 - 8
def Df_2(x): return 3*x**2
```

*First, $x_0 = 1$*

In [17]:

```
(root1, errorEstimate1, iterations1) = newtonMethod(f_2, Df_2, x0=1., errorTolerance=1e-8, demoMode=True)
print()
print(f"The first root is approximately {root1}")
print(f"The estimated absolute error is {errorEstimate1}")
print(f"The backward error is {abs(f_2(root1)):0.4}")
print(f"This required {iterations1} iterations")
```

*Next, start at $x_0 = i$ (a.k.a. $x_0 = j$):*

In [18]:

```
(root2, errorEstimate2, iterations2) = newtonMethod(f_2, Df_2, x0=1j, errorTolerance=1e-8, demoMode=True)
print()
print(f"The second root is approximately {root2}")
print(f"The estimated absolute error is {errorEstimate2:0.3}")
print(f"The backward error is {abs(f_2(root2)):0.3}")
print(f"This required {iterations2} iterations")
```

This root is in fact $-1 + i \sqrt{3}$.

*Finally, $x_0 = 1 - i$*

In [19]:

```
(root3, errorEstimate3, iterations3) = newtonMethod(f_2, Df_2, x0=1-1j, errorTolerance=1e-8, demoMode=False)
print()
print(f"The third root is approximately {root3}")
print(f"The estimated absolute error is {errorEstimate3}")
print(f"The backward error is {abs(f_2(root3)):0.4}")
print(f"This required {iterations3} iterations")
```

This root is in fact $-1 - i \sqrt{3}$.

The more traditional derivation of Newton's method is based on the very widely useful idea of *linearization*: a differentiable function can be approximated over a small part of its domain by a straight line, namely its tangent line, and it is easy to compute the root of this linear function.

So start with a first approximation $x_0$ to a solution $r$ of $f(x) = 0$.

The tangent line to the graph of this function with center $x_0$, also known as the *linearization of $f$ at $x_0$*, is

$$L_0(x) = f(x_0) + f'(x_0)(x - x_0).$$

(Note that $L_0(x_0) = f(x_0)$ and $L_0'(x_0) = f'(x_0)$.)

Hopefully, the two functions $f$ and $L_0$ are close, so that the root of $L_0$ is close to a root of $f$; close enough to be a better approximation of the root $r$ than $x_0$ is.

Give the name $x_1$ to this root of $L_0$: it solves $L_0(x_1) = f(x_0) + f'(x_0) (x_1 - x_0) = 0$, so

$$x_1 = x_0 - {f(x_0)}/{f'(x_0)}$$

We can then use this new value $x_1$ as the center for a new linearization $L_1(x) = f(x_1) + f'(x_1)(x - x_1)$, and repeat to get a hopefully even better approximate root,

$$x_2 = x_1 - f(x_1)/f'(x_1)$$

And so on: at each step, we get from approximation $x_k$ to a new one $x_{k+1}$ with

$$x_{k+1} = x_k - {f(x_k)}/{f'(x_k)}$$

And indeed this is the same formula seen above for Newton's method.

**Illustration: a few steps of Newton's method for $x - \cos(x) = 0$.**

This approach to Newton's method via linearization and tangent lines suggests another graphical presentation; again we use the example of $f(x) = x - \cos (x)$. This has $Df(x) = 1 + \sin(x)$, so the linearization at center $a$ is

$$L(x) = (a - \cos(a)) + (1 + \sin(a))(x-a)$$

For Newton's method starting at $x_0 = 0$, this gives

$$L_0(x) = -1 + x$$

and its root, the next iterate in Newton's method, is $x_1 = 1$.

Then the linearization at center $x_1$ is

$$ L_1(x) = (1 - \cos(1)) + (1 + \sin(1))(x-1) \approx 0.4596 + 1.8415(x-1), $$

giving $x_2 \approx 1 - 0.4596/1.8415 \approx 0.7504$.

Let's graph a few steps.

In [20]:

```
def L_0(x): return -1 + x
```

In [21]:

```
figure(figsize=(12,6))
title("First iteration, from $x_0 = 0$")
left = -0.1
right = 1.1
x = np.linspace(left, right)
plot(x, f_1(x), label="$x - \cos(x)$")
plot([left, right], [0, 0], "k", label="$y=0$") # The x-axis, in black
x_0 = 0
plot([x_0], [f_1(x_0)], "g*")
plot(x, L_0(x), "y", label="$L_0(x)$")
x_1 = x_0 - f_1(x_0)/Df_1(x_0)
print(f"{x_1=}")
plot([x_1], [0], "r*")
legend()
grid(True);
```

x_1=1.0

In [22]:

```
def L_1(x): return (x_1 - cos(x_1)) + (1 + sin(x_1))*(x - x_1)
figure(figsize=(12,6))
title("Second iteration, from $x_1 = 1$")
# Shrink the domain
left = 0.7
right = 1.05
x = np.linspace(left, right)
plot(x, f_1(x), label="$x - \cos(x)$")
plot([left, right], [0, 0], "k", label="$y=0$") # The x-axis, in black
plot([x_1], [f_1(x_1)], "g*")
plot(x, L_1(x), "y", label="$L_1(x)$")
x_2 = x_1 - f_1(x_1)/Df_1(x_1)
print(f"{x_2=}")
plot([x_2], [0], "r*")
legend()
grid(True);
```

x_2=0.7503638678402439

In [23]:

```
def L_2(x): return (x_2 - cos(x_2)) + (1 + sin(x_2))*(x - x_2)
figure(figsize=(12,6))
title("Third iteration, from $x_2$")
# Shrink the domain some more
left = 0.735
right = 0.755
x = np.linspace(left, right)
plot(x, f_1(x), label="$x - \cos(x)$")
plot([left, right], [0, 0], "k", label="$y=0$") # The x-axis, in black
plot([x_2], [f_1(x_2)], "g*")
plot(x, L_2(x), "y", label="$L_2(x)$")
x_3 = x_2 - f_1(x_2)/Df_1(x_2)
plot([x_3], [0], "r*")
legend()
grid(True);
```

For the bisection method, we have seen in the section on root-finding by interval halving a fairly simple way to get an upper limit on the absolute error in the approximations.

For absolute guarantees of accuracy, things do not go quite as well for Newton's method,
but we can at least get a very "probable" *estimate* of how large the error can be.
This requires some calculus, and more specifically Taylor's theorem,
reviewed in the section on Taylor's theorem.

On the other hand, the example graphs above illustrate that the successive linearizations become ever more accurate as approximations of the function $f$ itself, so that the approximation $x_3$ looks "perfect" on the graph — the speed of Newton's method looks far better than for bisection.

We will return to the question of both the speed and accuracy of Newton's method in the section on the convergence rate of Newton's method.

**Exercise 1**

a) Show that Newton's method applied to

$$ f(x) = x^k - a $$

leads to fixed point iteration with function

$$ g(x) = \frac{(k-1) x + \displaystyle \frac{a}{x^{k-1}}}{k}. $$

b) Then verify mathematically that the iteration $x_{k+1} = g(x_k)$ has super-linear convergence.
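(Before proving anything, the claimed fixed-point form can at least be checked numerically. A minimal sketch of my own, with the assumed values $k = 2$ and $a = 2$, for which $g$ reduces to the classic Babylonian iteration for $\sqrt{2}$:)

```
def g(x, k=2, a=2):
    # The exercise's fixed-point function for f(x) = x**k - a
    return ((k - 1)*x + a/x**(k - 1)) / k

x = 1.0
for _ in range(6):
    x = g(x)
print(x)  # rapidly approaches sqrt(2) = 1.41421356...
```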

**Exercise 2**

a) Create a Python function for Newton's method, with usage

```
(root, errorEstimate, iterations, functionEvaluations) = newtonMethod(f, Df, x_0, errorTolerance, maxIterations)
```

(The last input parameter `maxIterations` could be optional, with a default like `maxIterations=100`.)

b) Based on your function `bisection2`, create a third (and final!) version with usage

```
(root, errorBound, iterations, functionEvaluations) = bisection(f, a, b, errorTolerance, maxIterations)
```

c) Use both of these to solve the equation

$$ f_1(x) = 10 - 2x + \sin(x) = 0 $$

i) with (estimated) absolute error of no more than $10^{-6}$, and then

ii) with (estimated) absolute error of no more than $10^{-15}$.

Note in particular how many iterations and how many function evaluations are needed.

Graph the function, which will help to find a good starting interval $[a, b]$ and initial approximation $x_0$.

d) Repeat, this time finding the unique real root of

$$ f_2(x) = x^3 - 3.3 x^2 + 3.63 x - 1.331 = 0 $$

Again graph the function, to find a good starting interval $[a, b]$ and initial approximation $x_0$.

e) This second case will behave differently than for $f_1$ in part (c): describe the difference. (We will discuss the reasons in class.)