# 3.6. Least-squares Fitting to Data: Appendix on The Geometrical Approach#

**References:**

Chapter 4

*Least Squares*of [Sauer, 2022], Sections 1 and 2.Section 8.1

*Discrete Least Squares Approximation*of [Burden*et al.*, 2016].

## 3.6.1. Introduction#

We have seen that one common and important approach to approximating data

by a polynomial \(y = p(x)= c_0 + \cdots c_n x^n\) of degree at most \(n\) is to minimize the “average” of the errors

in the sense of the *root-mean-square error* \(E_{RMS} = \displaystyle \sqrt{\sum_{i=1}^N e_i^2}\).
Equivalently, we will avoid the square root and just minimize the sum of the squares of the errors:

## 3.6.2. Linear least squares: minimizing RMS error using calculus#

One way to derive the needed formulas is by seeking the critical point og th abev function via teh \(n+1\) equations

Fortunately these gives a systems of linear equations, and it has a unique solution, thus giving the desired global minimum.

However, there is another “geometrical” approach, that is also relevant as an introduction to strategies also used for other minimization problems, for example with application to the numerical solutions of boundary value problems for differential equations.

## 3.6.3. Linear least squares: minimizing RMS error by minimizing “Euclidean” distance with geometry#

For approximation by a polynomial \(y = p(x)= c_0 + \cdots c_n x^n\), we can think of the data \(y_i, 1 \leq i \leq N\) as giving a point in \(N\)-dimensional space \(\mathbb(R)^N\), and the approximations as giving another point with coordinates \(\tilde{y}_i := p(x_i)\).

Then the least squares problem is to minimize the Euclidean distance \(\| y - \tilde{y} \|_2\).

One way to think of this is that we attempt unsuccessfully to solve the collocation equations \(p(x_i) = y_i\) as an
*over-determined* sytem of \(N\) equations in \(n+1\) unknowns \(A c = y\), where

so that \(Ac\) evaluates the polynomial at all the \(x_i\) values.

Now we introduce a key geometrical idea: the possible values of \(\tilde{y} = Ac\) lie in an \((n+1)\)-dimensional sub-space or “hyperplane” within \(\mathbb{R}^N\), and the point in this hyper-plane closest to \(y \in \mathbb{R}^N\) is the perpendular projection of the latter point onto this hyper-plane: that is, the error vector \(e = y - \tilde{y}\) is perpendicular to every vector in the subspace of vectors \(Ac'\). Thus, \(e \perp A c'\) for every \(c' \in \mathbb{R}^{n+1}\).

Writing this in terms of inner products,

Recall that \((x, Ay) = (A^Tx, y)\) where \(A^T\) is the transpose of \(A\): the mirror image with \(a^T_{i,j} = a_{j, i}\).

Using this gives

and so the vector at left must be zero: \(A^T (y - \tilde{y})= 0\).

Inserting \(\tilde{y} = A c\) gives \(A^T y = A^T A c\), so

where \(M := A^T A\)

Since here \(A\) is \(N \times (n+1)\), \(A^T\) is \((n+1) \times N\), and the product \(M\) is an \((n+1) \times (n+1)\) square matrix.

Further calculation shows that in fact

and the right-hand side is

so these equations are the same ones \(M c = p\) given by the previous calculus derivation.