4.1. Approximating Derivatives by the Method of Undetermined Coefficients#

Last revised on November 12, 2023

We have seen several formulas for approximating a derivative \(Df(x)\) or higher derivative \(D^k f(x)\) in terms of several values of the function \(f\), such as

(4.1)#\[Df(x) \approx D_hf(x) := \frac{f(x+h) - f(x)}{h}\]

and

(4.2)#\[D^2f(x) \approx \delta^2 f(x) := \frac{f(x-h) - 2f(x) + f(x+h)}{h^2}.\]
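As a quick numerical illustration (a sketch assuming NumPy is available, with the test function \(f = \sin\) and the point \(x = 1\) chosen arbitrarily, not taken from the text), one can tabulate the errors of these two approximations as \(h\) shrinks; the first error shrinks roughly in proportion to \(h\), the second to \(h^2\):

```python
import numpy as np

f, Df, D2f = np.sin, np.cos, lambda x: -np.sin(x)
x = 1.0
for h in [0.1, 0.05, 0.025]:
    D_h = (f(x + h) - f(x)) / h                        # equation (4.1)
    delta2 = (f(x - h) - 2*f(x) + f(x + h)) / h**2     # equation (4.2)
    print(h, abs(D_h - Df(x)), abs(delta2 - D2f(x)))
```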

For the first case we can use the Taylor formula for \(n=1\),

\[ f(x+h) = f(x) + Df(x) h + \frac12 {D^2f(\xi_x)} h^2 \quad \text{ where } \xi_x \text{ is between } x \text{ and } x+h \]

(see Equations (1.5) or (1.8) in the section on Taylor’s Theorem); this gives

\[ D_hf(x) = \frac{f(x+h) - f(x)}{h} = Df(x) + \frac{D^2f(\xi_x)}{2} h \]

leading to the error formula

\[ D_hf(x) - Df(x) = \frac12 {D^2f(\xi_x)}\,h = O(h) \quad \text{ or equivalently, } o(1). \]

The approximations in equations (4.1) and (4.2) of \(k\)-th derivatives (\(k=1\) or \(2\) so far) are linear combinations of values of \(f\) at various points, with the denominator scaling with the \(k\)-th power of the node spacing \(h\); this makes sense given the linearity of derivatives and the way that the \(k\)-th derivative scales by \(c^k\) when one rescales \(f(x)\) to \(f(cx)\).

Thus we will make the Ansatz that the \(k\)-th derivative \(D^k f(x)\) can be approximated using values at the \(r-l+1\) equally spaced points

\[\begin{equation*} x + lh, x + (l+1)h, \dots, x + rh \end{equation*}\]

where the integers \(l\) and \(r\) can be negative, positive or zero. The assumed form then is

\[ D^k f(x) = D_h^k f(x) + O(h^p), \quad \text{where} \quad D_h^k f(x) := \frac{C_l f(x + lh) + C_{l+1} f(x + (l+1)h) + \cdots + C_r f(x + rh)}{h^k} \]

(The reason for the power \(k\) in the denominator will be seen soon.)

So we seek to determine the values of the initially undetermined coefficients \(C_i\), by the criterion of giving an error \(O(h^p)\) with the highest possible order \(p\). With \(r-l+1\) coefficients to choose, we generally get \(p = r - l + 1 - k\), but with symmetry \(l = -r\) and \(k\) even we get one better, \(p = r - l + 2 - k\), because the order \(p\) must then be even. Thus we need the number of points \(r-l+1\) to be more than \(k\): for example, at least two for a first derivative as already seen.

Example 4.1 (The basic forward difference approximation)

\[Df(x) = \frac{f(x + h) - f(x)}{h} + O(h)\]

has \(k=1\), \(l=0\), \(r=1\), \(p=1\).

Example 4.2 (A three-point one-sided difference approximation of the first derivative)

This is the case \(k=1\) and can be sought with \(l=0\), \(r=2\), as

\[Df(x) = \frac{C_{0} f(x) + C_1 f(x + h) + C_2 f(x + 2h)}{h} + O(h^p)\]

and the most accurate choice is \(C_0 = -3/2\), \(C_1 = 2\), \(C_2 = -1/2\), again of second order, which is exactly \(p = r - l + 1 - k\), with no “symmetry bonus”:

\[Df(x) \approx \frac{-3 f(x) + 4 f(x + h) - f(x + 2h)}{2 h} + O(h^2).\]
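As a quick sanity check before the Taylor analysis below (again a sketch assuming NumPy, with \(f = \sin\) and \(x = 1\) as arbitrary test choices), the error of this three-point formula should shrink by a factor of about four each time \(h\) is halved:

```python
import numpy as np

f, Df = np.sin, np.cos
x = 1.0
for h in [0.1, 0.05, 0.025]:
    D3 = (-3*f(x) + 4*f(x + h) - f(x + 2*h)) / (2*h)
    print(h, abs(D3 - Df(x)))    # errors should scale roughly like h**2
```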

One can use Taylor’s Theorem to check an approximation like this, and also get information about its accuracy. To do this, insert a Taylor series formula with center \(x\), like

\[f(x+h) = f(x) + Df(x) h + \frac{D^2f(x)}{2} h^2 + \frac{D^3f(x)}{6} h^3 + \cdots\]

If you are not sure how accurate the result is, you might need to be vague initially about how many terms are needed, so I will do it that way and then go back and be more specific once we know more.

A series for \(f(x+2h)\) is also needed:

\[\begin{align*} f(x+2h) &= f(x) + Df(x) (2 h) + \frac{D^2f(x)}{2} (2 h)^2 + \frac{D^3f(x)}{6} (2 h)^3 + \cdots \\ &= f(x) + 2 Df(x) h + \frac{D^2f(x)}{2} 4 h^2 + \frac{D^3f(x)}{6} 8 h^3 + \cdots \\ &= f(x) + 2 Df(x) h + 2 D^2f(x) h^2 + \frac{4 D^3f(x)}{3} h^3 + \cdots \end{align*}\]

Insert these into the above three-point formula, and see how close it is to the exact derivative:

\[\begin{align*} &\frac{-3 f(x) + 4 f(x + h) - f(x + 2h)}{2 h} \\&= \frac{-3 f(x) + 4 [f(x) + Df(x) h + \frac{D^2f(x)}{2} h^2 + \frac{D^3f(x)}{6} h^3 + \cdots] - [f(x) + 2 Df(x) h + 2 D^2f(x) h^2 + \frac{4 D^3f(x)}{3} h^3 + \cdots]}{2 h} \end{align*}\]

Now gather terms with the same power of \(h\) (which is also gathering terms with the same order of derivative):

\[\begin{align*} \frac{-3 f(x) + 4 f(x + h) - f(x + 2h)}{2 h} &= f(x)\frac{-3 + 4 - 1}{2h} + Df(x)\frac{4 - 2}{2} + D^2f(x)\frac{4/2 - 2}{2}h + D^3f(x)\frac{4/6 - 4/3}{2}h^2 + \cdots \\&= Df(x) - \frac{D^3 f(x)}{3} h^2 + \cdots \end{align*}\]

and it is clear that the omitted terms involve higher powers of \(h\): \(h^3\) and up. That is, they are \(O(h^3)\), or more conveniently, \(o(h^2)\).

Thus we have confirmed that the error in this approximation is

\[Df(x) - \frac{-3 f(x) + 4 f(x + h) - f(x + 2h)}{2 h} = \frac{D^3 f(x)}{3} h^2 + o(h^2) = O(h^2).\]
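This Taylor series bookkeeping can also be delegated to a computer algebra system; the following sketch (assuming SymPy is available) expands the difference between the approximation and \(Df(x)\) in powers of \(h\) for a generic smooth \(f\), and should reproduce the leading error term found above:

```python
from sympy import symbols, Function, series

x, h = symbols("x h")
f = Function("f")

approximation = (-3*f(x) + 4*f(x + h) - f(x + 2*h)) / (2*h)
error = approximation - f(x).diff(x)
# Expanding in powers of h should give -h**2 * f'''(x)/3 + O(h**3),
# matching the error formula above (opposite sign, since this is
# approximation minus exact value).
print(series(error, h, 0, 3))
```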

Example 4.3 (A three-point centered difference approximation of \(D^2 f(x)\))

This has \(k=2\), \(l = -1\), \(r = 1\) and so

\[D^2f(x) \approx \frac{C_{-1} f(x - h) + C_{0} f(x) + C_1 f(x + h)}{h^2}\]

and it can be found (as discussed below) that the coefficients \(C_{-1} = C_1 = 1\), \(C_0 = -2\) give the highest order error: \(p=2\); one better than \(p = r - l + 1 - k = 1\) due to symmetry:

\[D^2f(x) = \frac{f(x - h) -2 f(x) + f(x + h)}{h^2} + O(h^2).\]
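The “symmetry bonus” can also be observed numerically. For instance (a sketch assuming NumPy, with the arbitrary test choice \(f(x) = e^x\) at \(x=0\), and using the standard fact, not derived here, that the leading error term of this centered formula is \(\frac{D^4f(x)}{12} h^2\)), the scaled error should approach \(1/12 \approx 0.0833\):

```python
import numpy as np

f, D2f = np.exp, np.exp    # for f(x) = e^x, all derivatives are e^x
x = 0.0
for h in [0.1, 0.05, 0.025]:
    delta2 = (f(x - h) - 2*f(x) + f(x + h)) / h**2
    # Scaled error: should approach D^4 f(x)/12 = 1/12 as h -> 0
    print(h, (delta2 - D2f(x)) / h**2)
```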

4.1.1. Method 1: use Taylor polynomials in \(h\) of degree p+k-1#

(so with error terms \(O(h^{p+k})\).)

Each of the terms \(f(x + ih)\) in the above formula for the approximation \(D_h^k f(x)\) of the \(k\)-th derivative \(D^k f(x)\) can be expanded with the Taylor formula of degree \(p+k-1\) in \(h\), with remainder term \(O(h^{p+k})\),

\[f(x + ih) = f(x) + (ih)Df(x) + (ih)^2/2 \, D^2f(x) + \cdots + (ih)^j/j! \, D^jf(x) + \cdots + (ih)^{p+k-1}/(p+k-1)! \, D^{p+k-1}f(x) + O(h^{p+k})\]

Then these can be rearranged, putting the terms with the same derivative \(D^j f(x)\) together; each such group has the same factor \(h^j\) in the numerator, and so the same factor \(h^{j-k}\) overall:

\[\begin{split}\begin{split} D_h^k f(x) &= (C_l + \cdots + C_r)f(x)h^{-k}\\ &+ (l C_l + \cdots + r C_r)Df(x)h^{1-k}\\ &+ \frac{l^2 C_l + \cdots + r^2 C_r}{2}D^2f(x)h^{2-k}\\ & \vdots\\ &+ \frac{l^j C_l + \cdots + r^j C_r}{j!}D^jf(x)h^{j-k}\\ & \vdots\\ &+ \frac{l^{p+k-1} C_l + \cdots + r^{p+k-1} C_r}{(p+k-1)!}D^{p+k-1}f(x)h^{p-1}\\ &+ O(h^p) \end{split}\end{split}\]

The final “small” term \(O(h^{p})\) comes from the remainder terms \(O(h^{p+k})\) in each Taylor formula, each divided by \(h^k\).

We want this whole thing to be approximately \(D^k f(x)\), and the strategy is to match the coefficients of the derivatives:

  • Matching the coefficients of \(D^k f(x)\),

\[\frac{l^k C_l + \cdots + r^k C_r}{k!}\, D^k f(x)h^{k-k} = \frac{l^k C_l + \cdots + r^k C_r}{k!}\, D^k f(x) = D^k f(x)\]

so

\[l^k C_l + \cdots + r^k C_r = k!\]
  • On the other hand, there should be no term with factor \(f(x)h^{-k}\), so

\[C_l + \cdots + C_r = 0\]
  • More generally, for any \(j\) other than \(k\) the coefficients should vanish, so

\[l^j C_l + \cdots + r^j C_r = 0, \quad 0 \leq j \leq p+k-1 \text{ except for } j = k\]

This last line gives \(p+k-1\) linear equations in the \(p+k\) coefficients \(C_l, \dots, C_r\), and the earlier equation for \(j=k\) brings the total to \(p+k\) equations, as needed for the existence of a unique solution.

(4.3)#\[\begin{align} C_l + \cdots + C_r &= 0\\ l^j C_l + \cdots + r^jC_r &= 0, \quad 1 \leq j \leq p+k-1,\ j \neq k\\ l^k C_l + \cdots + r^kC_r &= k!\\ \end{align}\]

And indeed it can be verified that the resulting matrix for this system of equations is non-singular, and so there is a unique solution for the coefficients \(C_l \dots C_r\).
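As a sketch of how this could be automated (assuming NumPy is available; the function name `difference_coefficients` is just illustrative), one can assemble the matrix of the system (4.3) for given \(k\), \(l\), \(r\) and solve it numerically:

```python
from math import factorial
import numpy as np

def difference_coefficients(k, l, r):
    """Coefficients C_l, ..., C_r of the k-th derivative approximation
    using the nodes x + l*h, ..., x + r*h, from the system (4.3)."""
    nodes = np.arange(l, r + 1)
    n = len(nodes)                                # number of equations and unknowns
    A = np.vander(nodes, n, increasing=True).T    # row j is [l**j, (l+1)**j, ..., r**j]
    b = np.zeros(n)
    b[k] = factorial(k)                           # only the j = k equation has a nonzero right-hand side
    return np.linalg.solve(A, b)

print(difference_coefficients(1, 0, 1))    # [-1.  1.]         Example 4.1
print(difference_coefficients(1, 0, 2))    # [-1.5  2.  -0.5]  Example 4.2
print(difference_coefficients(2, -1, 1))   # [ 1. -2.  1.]     Example 4.3
```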

Exercise A#

A) Derive the formula in Example 4.2.

Do this by setting up the three equations as above for the coefficients \(C_0\), \(C_1\) and \(C_2\), and solving them. Do this “by hand”, to get exact fractions as the answers; use the two Taylor series formulas, but now take advantage of what we saw above: that the error starts at the terms in \(D^3f(x)\). So use the forms

\[f(x+h) = f(x) + Df(x) h + \frac{D^2f(x)}{2} h^2 + \frac{D^3f(x)}{6} h^3 + O(h^4)\]

and

\[f(x+2h) = f(x) + 2 Df(x) h + 2 D^2f(x) h^2 + \frac{4 D^3f(x)}{3} h^3 + O(h^4)\]

B) Verify the result in Example 4.3.

Again, do this by hand, and exploit the symmetry; note that the result is one order more accurate than the generic prediction \(p = r - l + 1 - k\), thanks to that symmetry.

4.1.2. Degree of Precision and testing with monomials#

This concept relates to a simpler way of determining the coefficients.

The degree of precision of an approximation formula (of a derivative or integral) is the highest degree \(d\) such that the formula is exact for all polynomials of degree up to \(d\). For example it can be checked that in the examples above, the degrees of precision are 1, 2, and 3 respectively. All three conform to a general pattern:

Theorem 4.1

The degree of precision is \(d = p + k - 1\), so in the typical case with no “symmetry bonus”, \(d = r - l\).

This is confirmed by the above derivation: for \(f\) any polynomial of degree \(p+k-1\) or less, the Taylor polynomials of degree at most \(p+k-1\) used there have no error.

Thus, for example, the minimal symmetric approximation of a fourth derivative, which must have even order \(p=2\), will have degree of precision 5.
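For instance (a sketch assuming NumPy; the well-known five-point stencil with coefficients \(1, -4, 6, -4, 1\) is used here as the minimal symmetric case \(k=4\), \(l=-2\), \(r=2\)), one can test exactness on the monomials at \(x=0\), which suffices as noted in the next subsection, and see the first failure at degree 6:

```python
from math import factorial
import numpy as np

nodes = np.arange(-2, 3)                      # k = 4, l = -2, r = 2
C = np.array([1.0, -4.0, 6.0, -4.0, 1.0])
k = 4

# Exactness for f(x) = x**j at x = 0 amounts to sum_i C_i * i**j = k! if j == k, else 0.
for j in range(7):
    lhs = np.sum(C * nodes.astype(float)**j)
    rhs = factorial(k) if j == k else 0.0
    print(j, "exact" if np.isclose(lhs, rhs) else "not exact")
# Exact for j = 0, ..., 5 but not for j = 6: degree of precision 5, as claimed.
```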

4.1.3. Method 2: use monomials of degree up to p+k-1#

From the above degree of precision result, one can determine the coefficients by requiring degree of precision \(p+k-1\), and for this it is enough to require exactness for each of the simple monomial functions \(1\), \(x\), \(x^2\), and so on up to \(x^{p+k-1}\).

Also, this only needs to be tested at \(x=0\), since “translating” the variables does not affect the result.

This is probably the simplest method in practice.

Example 4.4

Let us revisit Example 4.2. The goal is to get exactness in

\[\frac{C_{0} f(x) + C_1 f(x + h) + C_2 f(x + 2h)}{h} = Df(x)\]

for the monomials \(f(x) = 1\), \(f(x) = x\), and so on, to the highest power possible, and this only needs to be checked at \(x=0\).

First, \(f(x) = 1\), so \(Df(0) = 0\):

\[\frac{C_{0} \times 1 + C_1 \times 1 + C_2 \times 1}{h} = 0,\]

so

\[C_{0} + C_1 + C_2 = 0\]

Next, \(f(x) = x\), so \(Df(0) = 1\):

\[\frac{C_{0} f(0) + C_1 f(h) + C_2 f(2h)}{h} = \frac{C_{0} \times 0 + C_1 \times h + C_2 \times 2h}{h} = C_1 + 2C_2 = 1\]

so

\[C_1 + 2 C_2 = 1\]

We need at least three equations for the three unknown coefficients, so continue with \(f(x) = x^2\), \(Df(0) = 0\):

\[\frac{C_{0} f(0) + C_1 f(h) + C_2 f(2h)}{h} = \frac{C_{0} \times 0 + C_1 \times h^2 + C_2 \times (2h)^2}{h} = (C_1 + 4 C_2)h = 0\]

so

\[C_1 + 4 C_2 = 0\]

We can solve these by elimination; for example:

  • The last equation gives \(C_1 = -4C_2\)

  • The previous one then gives \(-4C_2 + 2C_2 = 1\), so \(C_2 = -1/2\) and thus \(C_1 = -4C_2 = 2\).

  • The first equation then gives \(C_0 = -C_1 - C_2 = -3/2\), all as claimed above.

So far the degree of precision has been shown to be at least 2. In some cases it is better, so let us check by looking at \(f(x) = x^3\):

\(Df(0) = 0\), whereas

\[\frac{-3 f(0) + 4 f(h) - f(2h)}{2 h} = \frac{-3 \times 0 + 4 h^3 - (2h)^3}{2 h} = \frac{-4 h^3}{2 h} = -2 h^2 \neq 0\]

So, no luck this time (that typically requires some symmetry), but this calculation does indicate in a relatively simple way that the error is \(O(h^2)\).
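The same elimination can be done symbolically; here is a sketch assuming SymPy is available (the symbol names are just illustrative), which imposes exactness at \(x=0\) for the monomials \(1\), \(x\), \(x^2\) and returns the coefficients as exact fractions:

```python
from sympy import symbols, linsolve

C0, C1, C2 = symbols("C0 C1 C2")

# Exactness of (C0 f(0) + C1 f(h) + C2 f(2h)) / h = Df(0) for the monomials:
equations = [
    C0 + C1 + C2,        # f(x) = 1:   Df(0) = 0
    C1 + 2*C2 - 1,       # f(x) = x:   Df(0) = 1
    C1 + 4*C2,           # f(x) = x^2: Df(0) = 0
]
print(linsolve(equations, C0, C1, C2))   # expect (-3/2, 2, -1/2)
```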

Remark 4.1

To verify more rigorously the order of accuracy of a formula devised by this method, one can use the “checking” procedure with Taylor polynomials and their error terms, as done in Example 4.2 above.

Exercise B: like Exercise A, but using Method 2#

A) Verify the result in Example 4.1, this time by Method 2.

That is, impose the condition of giving the exact value for the derivative at \(x=0\) for the monomial \(f(x) = 1\), then the same for \(f(x) = x\), and so on until there are enough equations to determine a unique solution for the coefficients.

B) Verify the result in Example 4.3, by Method 2.