{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(section:error-measures-convergence-rates)=\n",
    "# Measures of Error and Order of Convergence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Last revised on August 26, 2025**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**References:**\n",
    "\n",
    "- {cite}`Sauer` Section 1.3.1, *Forward and backward error*, on *measures of error*.\n",
    "\n",
    "- {cite}`Burden-Faires` Section 2.4, *Error Analysis for Iterative Methods*, on *order of convergence.*\n",
    "\n",
    "- {cite}`Dionne` Section 2.7, *Order of Convergence*.\n",
    "\n",
    "- {cite}`Chenney-Kincaid` Sections 2.2, *Absolute and Relative Erros: Loss of Significance* and 1.2 *Orders of Convergence and Additional Basic Concepts.*.\n",
    "\n",
    "- {cite}`Dahlquist-Bjorck-v1` Section 2.1, *Basic Concepts in Error Estimation*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These notes cover a number of topics:\n",
    "- Measures of error: absolute, relative, forward, backward, etc.\n",
    "- Measuring the rate of convergence of a sequence of approximations.\n",
    "- Big-O and little-o notation for describing how small a quantity (usually an error) is."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Error measures\n",
    "\n",
    "Several of these have been mentioned before, but they are worth gathering here.\n",
    "\n",
    "Consider a quantity $\\tilde{x}$ considered as an approximation of an exact value $x$.\n",
    "(This can be a number or a vector.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Error\n",
    ":label: error-redux\n",
    "\n",
    "The **error** in $\\tilde{x}$ is $\\tilde{x} - x$ (or  $x - \\tilde{x}$;\n",
    "different sources use both versions and either is fine so long as you are **consistent**.)\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Absolute Error\n",
    ":label: absolute-error\n",
    "The **absolute error** is the absolute value of the error: $|\\tilde{x} - x|$.\n",
    "For vector quantities this means the norm $\\| \\tilde{x} - x \\|$,\n",
    "and it can be any norm, so long as we again choose one and use it consistently.\n",
    "Two favorites are the *Euclidean norm* $\\| x \\| = \\sqrt{\\sum |x_i|}$, denoted $\\| x \\|_2$,\n",
    "and the *maximum norm* (also mysteriously known at the *infinity norm*):\n",
    "\n",
    "$$\\| x \\|_\\text{max} = \\| x \\|_\\infty = \\max_i |x_i|.$$\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For real-valued quantities, the absolute error is related to the number of **correct decimal places**:\n",
    "$p$ decimal places of accuracy corresponds roughly to absolute error no more than $0.5 \\times 10^{-p}$,\n",
    "and when working with computer arithmetic, $p$ corrrect **bits** corresponds to absolute error no more than $2^{-(p+1)}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Relative Error\n",
    ":label: relative-error\n",
    "The **relative error** is the ratio of the absolute error to the size of the exact quantity:\n",
    "$\\displaystyle\\frac{| \\tilde{x} - x |}{| x |}$ (again possibly with vectors and vector norms).\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is often more relevant than absolute error for inherently positive quantities,\n",
    "but is obviously unwise where $x = 0$ is a possibility.\n",
    "\n",
    "For real-valued quantities, this is related to the number of **significant digits**:\n",
    "accuracy to $p$ significant digits corresponds roughly to relative error no more than $0.5 \\times 10^{-p}$.\n",
    "\n",
    "When working with computer arithmetic, $p$ significant **bits** corresponds to relative error no more than $2^{-(p+1)}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Backward error (and forward error)\n",
    "\n",
    "An obvious problem is that we usually do not know the exact solution $x$, so cannot evaluate any of these; instead we typically seek **upper bounds** on the absolute or relative error.\n",
    "Thus, when talking of approximate solutions to an equation $f(x) = 0$ the concept of **backward error** introduced in the section on [Newton's method](section:newtons-method) can be very useful,\n",
    "for example as a step in getting bounds on the size of the error.\n",
    "To recap:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Backward Error\n",
    ":label: backward-error-redux\n",
    "\n",
    "The **backward error** in $\\tilde{x}$ as an approxiate solution to the equation $f(x) = 0$ is $f(\\tilde{x})$;\n",
    "the amount by which function $f$ would have to be changed in order for $\\tilde{x}$ to be an exact root.\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the case of solving simultaneous linear equations in matrix-vector form $A x = b$,\n",
    "this is $b - A \\tilde{x}$, also known as the **residual**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Absolute Backward Error\n",
    ":label: absolute-backward-error\n",
    "\n",
    "The absolute backward error is — as you might have guessed — the absolute value of the backward error: $|f(\\tilde{x})|$.\n",
    "This is sometimes also called simply the backward error. (The usual disclaimer about vector quantities applies.)\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the case of solving simultaneous linear equations in matrix-vector form $A x = b$,\n",
    "this is $\\| b - A \\tilde{x} \\|$,\n",
    "also known as the **residual norm**;\n",
    "we will see more of this in the section on [Error bounds for linear algebra, etc.](section:linear-algebra-error-bounds).\n",
    "\n",
    "see also [stuff about matrix norms](matrix-norms).\n",
    "\n",
    "we will see more of this in the section {doc}`linear-algebra-5-error-bounds-condition-numbers-python`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{prf:remark}\n",
    ":label: error-measures-convergence-rates-remark-1\n",
    "\n",
    "- One obvious advantage of the backward error concept is that you can actually evaluate it without knowing the exact solution $x$.\n",
    "\n",
    "- Also, one significance of backward error is that if the values of $f(x)$ are only known to be accurate within an absolute error of $E$ then any approximation with absolute backward error less than $E$ could in fact be exact, so there is no point in seeking greater accuracy.\n",
    "\n",
    "- The names *forward error* and *absolute forward error* are sometimes used as synonyms for *error* etc. as defined above, when they need to be distinguished from backward errors.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Order of convergence of a sequence of approximations\n",
    "\n",
    "````{prf:definition} Linear Convergence\n",
    ":label: linear-convergence\n",
    "\n",
    "We have seen that for the sequence of approximations $x_k$ to a quantity $x$ given by the fixed point iteration\n",
    "$x_{k+1} = g(x_k)$, the absolute errors $E_k := |x_k - x|$ typically have\n",
    "\n",
    "$$\\frac{E_{k+1}}{E_k} \\to C = |g'(x)|.$$\n",
    "\n",
    "so that eventually the errors diminsh in a roughly geometric fashion: $E_k \\approx K C^k$.\n",
    "This is called **linear convergence**.\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Why \"linear\" rather than \"geometric\"?\n",
    "Because there is an approximately linear relationship between consecutive error values,\n",
    "\n",
    "$$E_{n+1} \\approx C E_n.$$\n",
    "\n",
    "This is a very common behavior for iterative numerical methods, but we will also see that a few methods do even better; for example, when Newton's method converges to a simple root $r$ of $f$ (one with $f'(r) \\neq 0$)\n",
    "\n",
    "$$E_{k+1} \\approx C E_k^2$$\n",
    "\n",
    "which is called **quadratic convergence**.\n",
    "More generally:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Convergence of Order $p$\n",
    ":label: convergence-of-order-p\n",
    "\n",
    "This is when\n",
    "\n",
    "$$E_{k+1} \\approx C E_k^p, \\text{ or more precisely, } \\lim_{k \\to \\infty} \\frac{E_{k+1}}{E_k^p} \\text{ is finite.}$$\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have already observed experimentally the intermediate result that \"$C = 0$\" for Newton's method in this case; that is,\n",
    "\n",
    "$$\\frac{E_{k+1}}{E_k} \\to 0.$$ (super-linear)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "````{prf:definition} Super-linear Convergence\n",
    ":label: super-linear\n",
    "\n",
    "When the successive errors behave as in Equation {eq}`super-linear` the convergence is **super-linear**.\n",
    "This includes any situation with order of convergence $p > 1$.\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For most practical purposes, if you have established super-linear convergence,\n",
    "you can be happy, and not worry much about refinements like the particular order $p$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Big-O and little-o notation\n",
    "---------------------------\n",
    "\n",
    "Consider the error formula for approximation of a function $f$ with the Taylor polynomial of degree $n$, center $a$:\n",
    "\n",
    "$$\n",
    "| f(a+h) - T_n(h) | \\leq \\frac{M_{n+1}}{(n+1)!} |h|^{n+1}\n",
    "\\text{ where } M_{n+1} = \\max |f^{(n+1)}(x)|.\n",
    "$$\n",
    "\n",
    "Since the coefficient of $h^{n+1}$ is typicaly not known in practice, it is wise to focus on the power law part,\n",
    "and for this the \"big-O\" and little-o\" notation is convenient."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If a function $E(h)$ goes to zero at least as fast as $h^p$, we say that it is *of order $h^p$*, written $O(h^p)$.\n",
    "\n",
    "More precisely, $E(h)$ is no bigger than a multiple of $h^p$ for $h$ small enough;\n",
    "that is, there is a constant $C$ such that for some positive number $\\delta$\n",
    "\n",
    "$$\\frac{|E(h)|}{|h|^p} \\leq C \\text{ for } |h| < \\delta.$$\n",
    "\n",
    "Another way to say this is in terms of the *lim-sup*, if you have seen that jargon:\n",
    "\n",
    "$$\\limsup_{h \\to 0} \\frac{|E(h)|}{|h|^p} \\text{ is finite.}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This can be used to rephrase the above Taylor's theorem error bound as \n",
    "\n",
    "$$f(x) - T_n(x) = O(|x-a|^{n+1})$$\n",
    "\n",
    "or\n",
    "\n",
    "$$f(a+h) - T_n(h) = O(h^{n+1}),$$\n",
    "\n",
    "and for the case of the linearization,\n",
    "\n",
    "$$\n",
    "f(a+h) - L(x) = f(a+h) - (f(a) + f'(a) h) = O(h^2).\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Little-o notation, for \"negligibly small terms\"\n",
    "\n",
    "Sometimes it is enough to say that some error term is small enough to be neglected, at least when $h$ is close enough to zero.\n",
    "For example, with a Taylor series we might be able to neglect the powers of $x-a$ or of $h$ higher than $p$.\n",
    "\n",
    "We will thus say that a quantity $E(h)$ is *small of order $h^p$*, written $o(h^p)$\n",
    "when\n",
    "\n",
    "$$\\lim_{h \\to 0} \\frac{|E(h)|}{|h|^p} = 0.$$\n",
    "\n",
    "Note the addition of the word **small** compared to the above description of the big-O case!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With this, the Taylor's theorem error bound can be stated as\n",
    "\n",
    "$$f(x) - T_n(x) = o(|x-a|^n),$$\n",
    "\n",
    "or\n",
    "\n",
    "$$f(a+h) - T_n(h) = o(h^n),$$\n",
    "\n",
    "and for the case of the linearization,\n",
    "\n",
    "$$f(a+h) - L(x) = f(a+h) - (f(a) + f'(a) h) = o(h).$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "```{exercise-start}\n",
    ":label: error-measures-convergence-rates-exercise-1\n",
    "```\n",
    "For the function $f(x) = 1 - \\cos x$ or $f(x) = x - \\sin x$.\n",
    "\n",
    "a) Find the multiplicity of the root $r = 0$.\n",
    "\n",
    "b) Evaluate the forward and backward errors of the approximate root $\\tilde{r} = 0.001$.\n",
    "```{exercise-end}\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes for {ref}`error-measures-convergence-rates-exercise-1`: the multiplicity of a root**\n",
    "\n",
    "Recall the the multiplicity $m$ of a root $r$ of a polynomial is the number of factors $x-r$ that it has: it has a factor $(x-r)^m$ but not $(x-r)^{m+1}$.\n",
    "\n",
    "This can be extended to roots of other functions: one way is to note that at a root of multiplicity $m$, the first $m-1$ derivatives vanish: $f^{(1)}(r) = \\dots f^{(m-1)}(r) = 0$,\n",
    "but $f^{(m)}(r) \\neq 0$.\n",
    "\n",
    "This is used as the more general definition of the multiplicity.\n",
    "\n",
    "**Aside:**\n",
    "This is equivalent to having $f(x) = (x-r)^m g(x)$ where $g(x)$ is continuous at $r$,\n",
    "and to the Taylor series for $f$ with center $r$ having the form\n",
    "\n",
    "$$f(x) = c_m (x-r)^m + c_{m+1} (x-r)^{m+1} + \\cdots.$$\n",
    "\n",
    "Note that we can say that Newton's method has problems when the root being computed is of multiplicity greater than 1, and the above exercise examines one of these problems."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}