1.7. Root-finding Without Derivatives#
Last revised on August 14, 2024
References:
Section 1.5.1 Secant Method and variants in [Sauer, 2022]
Section 2.3 Newton’s Method and Its Extensions in [Burden et al., 2016]; just the later sub-sections, on The Secant Method and The Method of False Position.
Section 3.3 Secant Method in [Chenney and Kincaid, 2012]
1.7.1. Introduction#
We have already seen one method for solving \(f(x) = 0\) without needing to know any derivatives of \(f\): the Bisection Method, a.k.a. Interval Halving. However, we have also seen that it is far slower than Newton’s Method.
Here we explore methods that are almost the best of both worlds: about as fast as Newton’s method but not needing derivatives.
The first of these is the Secant Method. Later in this course we will see how this has been merged with the Bisection Method and Polynomial Collocation to produce the current state-of-the-art approach, Brent’s method, which was only perfected in the 1960s.
# We will often need resources from the modules numpy and pyplot:
import numpy as np
from numpy import abs, cos
# Since we do a lot of graphics in this section, some more short-hands:
from matplotlib.pyplot import figure, title, plot, xlabel, ylabel, grid, legend
from matplotlib.pyplot import show # This might be needed with MS Windows
# Also, some from the module for this book:
#from numericalMethods import newtonMethod
1.7.2. Using Linear Approximation#
One quirk of the Bisection Method is that it only uses the sign of the values \(f(a)\) and \(f(b)\), not their magnitudes. If one of these is far smaller than the other, one might guess that the root is closer to that end of the interval. This leads to the idea of:
starting with an interval \([a, b]\) known to contain a zero of \(f\),
connecting the two points \((a, f(a))\) and \((b, f(b))\) with a straight line, and
finding the \(x\)-value \(c\) where this line crosses the \(x\)-axis. In other words, approximating the function by a secant line, in place of the tangent line used in Newton’s Method.
1.7.3. A flawed first attempt: the Method of False Position#
The next step requires some care. The first idea (from almost a millennium ago) was to use this new approximation \(c\) as done with bisection: check which of the intervals \([a, c]\) and \([c, b]\) has the sign change and use it as the new interval \([a, b]\); this is called The Method of False Position (or Regula Falsi, since the academic world used Latin in those days).
The secant line between \((a, f(a))\) and \((b, f(b))\) is

\[ y = f(a) + \frac{f(b) - f(a)}{b - a} (x - a) \]

and its zero is at

\[ c = \frac{a \, f(b) - f(a) \, b}{f(b) - f(a)}. \]
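As a quick check of this formula, here is a one-step computation (a minimal sketch, using the example \(f(x) = x - \cos(x)\) on \([-1, 1]\) that is solved in full below); the value of \(c\) it prints matches the first approximation seen in the demonstrations further down.

from numpy import cos

def f(x): return x - cos(x)

a, b = -1.0, 1.0
fa, fb = f(a), f(b)
c = (a * fb - fa * b) / (fb - fa)  # where the secant line crosses the x-axis
print(c)  # 0.5403023058681398: between a and b, and closer to the end where |f| is smaller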
This is easy to implement, and an example will show that it sort of works, but with a weakness that hampers it a bit:
def falsePosition(f, a, b, errorTolerance=1e-15, maxIterations=15, demoMode=False):
    """Solve f(x)=0 in the interval [a, b] by the Method of False Position.
    This code also illustrates a few ideas that I encourage, such as:
    - Avoiding infinite loops, by using for loops and the command "break"
    - Avoiding repeated evaluation of the same quantity [the DRY doctrine: "Don't Repeat Yourself"]
    - Use of descriptive variable names
    - Use of "camelCase" to turn descriptive phrases into valid Python variable names
    - An optional "demonstration mode" to display intermediate results.
    """
    if demoMode: print("Solving by the Method of False Position.")
    fa = f(a)
    fb = f(b)
    for iteration in range(maxIterations):
        if demoMode: print(f"\nIteration {iteration}:")
        c = (a * fb - fa * b)/(fb - fa)
        fc = f(c)
        if fa * fc < 0:
            b = c
            fb = fc  # N.B. When b is updated, so must be fb = f(b)
        else:
            a = c
            fa = fc
        errorBound = b - a
        if demoMode:
            print(f"The root is in interval [{a}, {b}]")
            print(f"The new approximation is {c}, with error bound {errorBound:0.4}, backward error {abs(fc):0.4}")
        if errorBound < errorTolerance:
            break
    # Whether we got here due to accuracy or running out of iterations,
    # return the information we have, including an error bound:
    root = c  # the newest value is probably the most accurate
    return (root, errorBound)
For a more concise presentation, you could omit the above def and instead import this function with
from numericalMethods import falsePosition
def f(x): return x - cos(x)
(root, errorBound) = falsePosition(f, a=-1, b=1, demoMode=True)
print(f"\nThe Method of False Position gave approximate root is {root},")
print(f"with estimate error {errorBound:0.4}, backward error {abs(f(root)):0.4}")
Solving by the Method of False Position.
Iteration 0:
The root is in interval [0.5403023058681398, 1]
The new approximation is 0.5403023058681398, with error bound 0.4597, backward error 0.3173
Iteration 1:
The root is in interval [0.7280103614676171, 1]
The new approximation is 0.7280103614676171, with error bound 0.272, backward error 0.01849
Iteration 2:
The root is in interval [0.7385270062423998, 1]
The new approximation is 0.7385270062423998, with error bound 0.2615, backward error 0.000934
Iteration 3:
The root is in interval [0.7390571666782676, 1]
The new approximation is 0.7390571666782676, with error bound 0.2609, backward error 4.68e-05
Iteration 4:
The root is in interval [0.7390837322783136, 1]
The new approximation is 0.7390837322783136, with error bound 0.2609, backward error 2.345e-06
Iteration 5:
The root is in interval [0.7390850630385933, 1]
The new approximation is 0.7390850630385933, with error bound 0.2609, backward error 1.174e-07
Iteration 6:
The root is in interval [0.7390851296998365, 1]
The new approximation is 0.7390851296998365, with error bound 0.2609, backward error 5.883e-09
Iteration 7:
The root is in interval [0.7390851330390691, 1]
The new approximation is 0.7390851330390691, with error bound 0.2609, backward error 2.947e-10
Iteration 8:
The root is in interval [0.7390851332063397, 1]
The new approximation is 0.7390851332063397, with error bound 0.2609, backward error 1.476e-11
Iteration 9:
The root is in interval [0.7390851332147188, 1]
The new approximation is 0.7390851332147188, with error bound 0.2609, backward error 7.394e-13
Iteration 10:
The root is in interval [0.7390851332151385, 1]
The new approximation is 0.7390851332151385, with error bound 0.2609, backward error 3.708e-14
Iteration 11:
The root is in interval [0.7390851332151596, 1]
The new approximation is 0.7390851332151596, with error bound 0.2609, backward error 1.887e-15
Iteration 12:
The root is in interval [0.7390851332151606, 1]
The new approximation is 0.7390851332151606, with error bound 0.2609, backward error 1.11e-16
Iteration 13:
The root is in interval [0.7390851332151607, 1]
The new approximation is 0.7390851332151607, with error bound 0.2609, backward error 0.0
Iteration 14:
The root is in interval [0.7390851332151607, 1]
The new approximation is 0.7390851332151607, with error bound 0.2609, backward error 0.0
The Method of False Position gave approximate root 0.7390851332151607,
with estimated error 0.2609, backward error 0.0
The good news is that the approximations approach the zero reasonably fast — far faster than bisection — as indicated by the backward errors improving by a factor of more than ten at each iteration.
The bad news is that one end gets “stuck”, so the interval does not shrink on both sides, and the error bound stays large.
This behavior is generic: for a function \(f\) whose convexity does not change on the interval \([a, b]\), the secant line always crosses on the same side of the zero, so one end-point persists; in this case, the curve is concave up, so the secant line always crosses to the left of the root, as seen in the following graphs.
def graphFalsePosition(f, a, b, maxIterations=5):
    """Graph a few iterations of the Method of False Position for solving f(x)=0 in the interval [a, b].
    """
    fa = f(a)
    fb = f(b)
    for iteration in range(maxIterations):
        c = (a * fb - fa * b)/(fb - fa)
        fc = f(c)
        abc = [a, b, c]
        left = np.min(abc)
        right = np.max(abc)
        x = np.linspace(left, right)
        figure(figsize=[12, 5])
        title(f"Iteration {iteration+1}, Method of False Position")
        xlabel("$x$")
        plot(x, f(x))
        plot([left, right], [f(left), f(right)])  # the secant line
        plot([left, right], [0, 0], 'k')  # the x-axis line
        plot(abc, f(abc), 'r*')
        #show()  # The Windows version of JupyterLab might need this command
        if fa * fc < 0:
            b = c
            fb = fc  # N.B. When b is updated, so must be fb = f(b)
        else:
            a = c
            fa = fc
graphFalsePosition(f, a=-1, b=1)
1.7.4. A refinement: always using the two most recent approximations — The Secant Method#
The basic strategy for improvement is to always discard the oldest approximation — at the cost of not always having the zero surrounded! This gives the Secant Method.
For a mathematical description, one typically enumerates the successive approximations as \(x_0\), \(x_1\), etc., so the notation above gets translated with \(a \to x_{k-2}\), \(b \to x_{k-1}\), \(c \to x_{k}\); then the formula becomes the recursive rule

\[ x_k = \frac{x_{k-2} \, f(x_{k-1}) - f(x_{k-2}) \, x_{k-1}}{f(x_{k-1}) - f(x_{k-2})}. \]
Two differences from above:
previously we could assume that \(a<b\), but now we do not know the order of the various \(x_k\) values, and
the root is not necessarily between the two most recent values, so we no longer have that simple error bound. (We will see that the zero is typically surrounded about two-thirds of the time!)
Instead, we use the magnitude of \(b-a\) (which is now \(|x_k - x_{k-1}|\)) as a measure of error, and this is only an estimate of the error. It is the same estimate used for Newton’s Method; as there, it is still useful as a condition for ending the iterations, and indeed tends to be pessimistic, so that we typically do one more iteration than needed — but it is not on its own a complete guarantee of having achieved the desired accuracy.
Pseudo-code for a Secant Method Algorithm#
(Secant Method)
Input function \(f\), interval endpoints \(x_0\) and \(x_1\), an error tolerance \(E_{tol}\), and an iteration limit \(N\)
for k from 2 to N
\(\quad\) \(\displaystyle x_k \leftarrow \frac{x_{k-2} f(x_{k-1}) - f(x_{k-2}) x_{k-1}}{f(x_{k-1}) - f(x_{k-2})}\)
\(\quad\) Evaluate the error estimate \(E_{est} \leftarrow |x_k - x_{k-1}|\)
\(\quad\) if \(E_{est} \leq E_{tol}\)
\(\quad\quad\) End the iterations
\(\quad\) else
\(\quad\quad\) Go around another time
\(\quad\) end
end
Output the final \(x_k\) as the approximate root and \(E_{est}\) as an estimate of its absolute error.
Python Code for this Secant Method Algorithm#
We could write Python code that closely follows this notation, accumulating a list of the values \(x_k\).

However, since we only ever need the two most recent values to compute the new one, we can instead store just these three values, in the same way that we recycled the variables a, b and c. Here I use more descriptive names though:
def secantMethod(f, a, b, errorTolerance=1e-15, maxIterations=15, demoMode=False):
    """Solve f(x)=0 in the interval [a, b] by the Secant Method."""
    if demoMode:
        print("Solving by the Secant Method.")
    # Some more descriptive names
    x_older = a
    x_more_recent = b
    f_x_older = f(x_older)
    f_x_more_recent = f(x_more_recent)
    for iteration in range(maxIterations):
        if demoMode: print(f"\nIteration {iteration}:")
        x_new = (x_older * f_x_more_recent - f_x_older * x_more_recent)/(f_x_more_recent - f_x_older)
        f_x_new = f(x_new)
        (x_older, x_more_recent) = (x_more_recent, x_new)
        (f_x_older, f_x_more_recent) = (f_x_more_recent, f_x_new)
        errorEstimate = abs(x_older - x_more_recent)
        if demoMode:
            print(f"The latest pair of approximations are {x_older} and {x_more_recent},")
            print(f"where the function's values are {f_x_older:0.4} and {f_x_more_recent:0.4} respectively.")
            print(f"The new approximation is {x_new}, with estimated error {errorEstimate:0.4}, backward error {abs(f_x_new):0.4}")
        if errorEstimate < errorTolerance:
            break
    # Whether we got here due to accuracy or running out of iterations,
    # return the information we have, including an error estimate:
    return (x_new, errorEstimate)
Note: As above, you could omit the above def and instead import this function with
from numericalMethods import secantMethod
(root, errorEstimate) = secantMethod(f, a=-1, b=1, demoMode=True)
print(f"\nThe Secant Method gave approximate root is {root},")
print(f"with estimated error {errorEstimate:0.4}, backward error {abs(f(root)):0.4}")
Solving by the Secant Method.
Iteration 0:
The latest pair of approximations are 1 and 0.5403023058681398,
where the function's values are 0.4597 and -0.3173 respectively.
The new approximation is 0.5403023058681398, with estimated error 0.4597, backward error 0.3173
Iteration 1:
The latest pair of approximations are 0.5403023058681398 and 0.7280103614676171,
where the function's values are -0.3173 and -0.01849 respectively.
The new approximation is 0.7280103614676171, with estimated error 0.1877, backward error 0.01849
Iteration 2:
The latest pair of approximations are 0.7280103614676171 and 0.7396270126307336,
where the function's values are -0.01849 and 0.000907 respectively.
The new approximation is 0.7396270126307336, with estimated error 0.01162, backward error 0.000907
Iteration 3:
The latest pair of approximations are 0.7396270126307336 and 0.7390838007832723,
where the function's values are 0.000907 and -2.23e-06 respectively.
The new approximation is 0.7390838007832723, with estimated error 0.0005432, backward error 2.23e-06
Iteration 4:
The latest pair of approximations are 0.7390838007832723 and 0.7390851330557806,
where the function's values are -2.23e-06 and -2.667e-10 respectively.
The new approximation is 0.7390851330557806, with estimated error 1.332e-06, backward error 2.667e-10
Iteration 5:
The latest pair of approximations are 0.7390851330557806 and 0.7390851332151607,
where the function's values are -2.667e-10 and 0.0 respectively.
The new approximation is 0.7390851332151607, with estimated error 1.594e-10, backward error 0.0
Iteration 6:
The latest pair of approximations are 0.7390851332151607 and 0.7390851332151607,
where the function's values are 0.0 and 0.0 respectively.
The new approximation is 0.7390851332151607, with estimated error 0.0, backward error 0.0
The Secant Method gave approximate root 0.7390851332151607,
with estimated error 0.0, backward error 0.0
def graphSecantMethod(f, a, b, maxIterations=5):
    """Graph a few iterations of the Secant Method for solving f(x)=0 in the interval [a, b].
    """
    x_older = a
    x_more_recent = b
    f_x_older = f(x_older)
    f_x_more_recent = f(x_more_recent)
    for iteration in range(maxIterations):
        x_new = (x_older * f_x_more_recent - f_x_older * x_more_recent)/(f_x_more_recent - f_x_older)
        f_x_new = f(x_new)
        latest_three_x_values = [x_older, x_more_recent, x_new]
        left = np.min(latest_three_x_values)
        right = np.max(latest_three_x_values)
        x = np.linspace(left, right)
        figure(figsize=[12, 5])
        title(f"Iteration {iteration+1}, Secant Method")
        xlabel("$x$")
        plot(x, f(x))
        # The secant line through the two most recent points, extended across the plot window
        # (x_new may fall outside the interval they span, unlike in the Method of False Position):
        slope = (f_x_more_recent - f_x_older)/(x_more_recent - x_older)
        plot([left, right], [f_x_older + slope*(left - x_older), f_x_older + slope*(right - x_older)])
        plot([left, right], [0, 0], 'k')  # the x-axis line
        plot(latest_three_x_values, f(latest_three_x_values), 'r*')
        # show()  # The Windows version of JupyterLab might need this command
        (x_older, x_more_recent) = (x_more_recent, x_new)
        (f_x_older, f_x_more_recent) = (f_x_more_recent, f_x_new)
        errorEstimate = abs(x_older - x_more_recent)
graphSecantMethod(f, a=-1, b=1)
1.7.5. Observations#
The secant method converges faster than the method of false position (and far faster than bisection).
The majority of iterations do have the root surrounded (a sign-change in \(f\)), but roughly every third one — here the second and fifth — does not. This pattern of surrounding the root two times out of three is typical.
Comparing the error estimate to the backward error, the error estimate is in fact quite pessimistic (and so fairly trustworthy); it is typically of similar size to the backward error at the previous iteration.
The last point is a quite common occurrence: the available error estimates are often “trailing indicators”, closer to the error in the previous approximation than in the current one. For example, recall that we saw the same thing with Newton’s Method when we used \(|x_k - x_{k-1}|\) to estimate the error \(E_k := x_k - r\) and saw that it is in fact closer to the previous error, \(E_{k-1}\).
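To see this “trailing indicator” effect numerically, here is a minimal sketch (not part of the module numericalMethods) that re-runs the secant iteration for \(f(x) = x - \cos(x)\) and compares each error estimate \(|x_k - x_{k-1}|\) to the actual errors \(|x_k - r|\) and \(|x_{k-1} - r|\), taking as the reference value \(r\) the highly accurate final approximation found above.

from numpy import abs, cos

def f(x): return x - cos(x)

r = 0.7390851332151607  # reference root, from the secant method output above

x_older, x_more_recent = -1.0, 1.0
f_x_older, f_x_more_recent = f(x_older), f(x_more_recent)
for iteration in range(6):
    x_new = (x_older * f_x_more_recent - f_x_older * x_more_recent)/(f_x_more_recent - f_x_older)
    errorEstimate = abs(x_new - x_more_recent)
    print(f"estimate {errorEstimate:0.3e}, actual error {abs(x_new - r):0.3e}, previous error {abs(x_more_recent - r):0.3e}")
    (x_older, x_more_recent) = (x_more_recent, x_new)
    (f_x_older, f_x_more_recent) = (f_x_more_recent, f(x_new))

After the first couple of iterations, each estimate is far closer to the previous error than to the current one, as claimed.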
1.7.6. Exercises#
Exercise 1#
Recall from Definition 1.12 that an iterative method giving \(x_n \to r\) has order of convergence \(p\) if

\[ \lim_{n \to \infty} \frac{|x_{n+1} - r|}{|x_n - r|^p} = C \]

for some non-zero constant \(C\): “quadratic convergence” is order 2.
It can be shown (but not in this course) that when the secant method converges to a simple root \(r\) (one with \(f'(r) \neq 0\)), the order of convergence is \(\phi = \frac{1 + \sqrt{5}}{2} \approx 1.618\), the so-called Golden Ratio.
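(As an aside, one can probe this claim numerically; the following minimal sketch, which is not part of the exercise, estimates \(p\) from three successive errors via \(p \approx \log(E_{k+1}/E_k) / \log(E_k/E_{k-1})\), using the example and reference root from above. With so few iterations before machine precision is reached, the individual estimates oscillate noticeably, but they stay between the orders 1 and 2 rather than settling at either.)

from numpy import abs, cos, log

def f(x): return x - cos(x)

r = 0.7390851332151607  # reference root, from the secant method output above

x = [-1.0, 1.0]  # the two starting values
for k in range(6):  # gather the secant-method iterates x_2, ..., x_7
    x0, x1 = x[-2], x[-1]
    x.append((x0 * f(x1) - f(x0) * x1)/(f(x1) - f(x0)))

errors = [abs(xk - r) for xk in x]
for k in range(3, len(errors) - 1):
    if errors[k+1] > 0:  # stop once rounding gives an exactly zero error
        p = log(errors[k+1]/errors[k]) / log(errors[k]/errors[k-1])
        print(f"k={k}: estimated order of convergence is about {p:0.3f}")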
Consider the “double secant method”, in which each step is two successive steps of the secant method. Show that when this converges to a simple root, the order of convergence is \(\phi^2\).
Noting that \(1 < \phi < 2 < \phi^2\), use this to illustrate that, for methods of super-linear convergence, the actual order of convergence is not very useful for comparing the speed of different methods. (The most useful speed distinction is just linear vs super-linear.)