quantdev.blog - Interpolation and Approximation

Introduction

In this blog post, I would like to implement some interpolation algorithms using modern C++. It’s important we understand how and why these algorithms work. Because there are nuances, exceptions and corner-cases, we need to understand. It’s a great learning experience! You never fully understand something like the \(QR\)-algorithm (which is actually incredibly complicated to get the details right), unless you spend some time implementing it. A lot of the skills you pick up will translate when you write numerical codes.

Even if you have excellent implementations of any algorithm, you still need to understand some of the internals in order to make good use of your library. For example, any linear algebra library can compute an inverse matrix really fast and as accurately as mathematics allows. But, you have to know a little theory in order to know that you should still avoid using that if at all possible. (Matrix decomposition and solving without explicitly computing the inverse is better, generally speaking).

The Interpolation Problem

Polynomials are used as the basic means of approximation and are ubiquitous in nearly all areas of computational science.

Let \(a=x_1 < x_2 < \ldots < x_n = b\) be a grid of distinct points. Let \(\mathcal{P}_n\) be the vector space of of polynomials in one variable with degree \(\leq n - 1\). We are interested to find a polynomial \(p \in \mathcal{P}_n\) such that:

\[ \begin{align*} p(x_i) = f(x_i), \quad i = 1 : n \end{align*} \tag{1}\]

Theorem 1 (Uniqueness of the interpolating polynomial) If \(x_1,\ldots,x_n\) are distinct real numbers, then for arbitrary values \(y_1,\ldots,y_n\), there is a unique polynomial \(p\in \mathcal{P}_{n}\) of degree at most \(n-1\) such that:

\[p(x_i) = y_i, \quad i = 1 : n\]

Proof.

Suppose that there were two such polynomials \(p\) and \(q\). Then, the polynomial \(p-q\) would have the property \((p-q)(x_i)=0\) for \(1 \leq i \leq n\). Since, the degree of \((p-q)\) can be at most \(n-1\), this polynomial can have atmost \((n-1)\) zeroes, if is not the \(0\) polynomial. Since the \(x_i\)’s are distinct, it follows that:

\[ (p-q)(x) = (x - x_1)(x - x_2)\ldots(x-x_n) = \prod_{i=1}^n (x-x_i) \]

and it has atleast \(n\) zeroes. Hence, \((p-q)(x)\equiv 0\) - it must be identically equal to zero. So, \(p(x) = q(x)\) for all \(x\). This closes the proof. \(\blacksquare\)

Bases for polynomial interpolation

A set of polynomials \(\{p_1(x), p_2(x), \ldots, p_n(x)\}\) such that the polynomial \(p \in \mathcal{P}_n\) can be expressed as a linear combination :

\[ p(x) = \sum_{j=1} c_j p_j(x) \]

is called a basis for \(\mathcal{P}_n\). The column vector \(c=(c_1,c_2,\ldots,c_n)^T\) can be viewed as the coordinate vector of \(p\) in the polynomial space \(\mathcal{P}_n\). The inerpolation problem leads to a system of equations:

\[ \begin{align*} c_1 p_1(x_i) + c_2 p_2(x_i) + \ldots + c_n p_n(x_i) = f(x_i), \quad i=1:n \end{align*} \tag{2}\]

If we introduce the matrix :

\[ \begin{align*} P_n = [p_j(x_i)]_{i,j=1}^n \end{align*} \tag{3}\]

and the column vector \(f=(f(x_1),\ldots,f(x_n))^T\), then the linear system becomes:

\[ \begin{align*} P_n c = f \end{align*} \tag{4}\]

Mathematically, the choice of a basis (for a finite-dimensional space) makes no difference. Computationally, when working with rounded values of coefficients, the choice of basis can make a great difference. If the purpose is to compute derivatives or integrals of the interpolation polynomial, the power basis or the shifted power basis, where \(p_j(x) = (x - c)^{j-1}\) that is:

\[ p(x)= \sum_{j=1}^n c_j(x)(x - c)^{j-1} \]

is convenient although not always the best. If a shifted power basis is to be used for polynomial approximation on an interval \([a,b]\), it is often the best to choose \(c = (a + b)/2\), equal to the midpoint of the interval.

For the power basis \(p_j(x) = x^{j-1}\), the coefficients of the interpolation polynomial are given by the solution of the linear system \(V_n^T c = f\), where \(V_n\) is the Vandermonde matrix

\[ V_n = [x_j^{i-1}]_{i,j=1}^n = \begin{bmatrix} 1 & 1 & \ldots & 1\\ x_1 & x_2 & \ldots & x_n \\ \vdots & \vdots & \ldots & \vdots\\ x_1^{n-1} & x_2^{n-1} & \ldots & x_n^{n-1} \end{bmatrix} \tag{5}\]

Piecewise Polynomial Interpolation

Interpolating a given function by a single polynomial over its entire range can be an ill-conditioned problem, as illustrated by Runge’s phenomenon. On the other hand, polynomials of low degree can give good approximations locally in a small interval. I would like to discuss approximation schemes for piecewise polynomial interpolation with different degrees of global continuity.

With the use of piecewise polynomials, there is no reason to fear equidistant data, as opposed to the situation with higher-degree polynomials. In computer graphics and computer-aided design(CAD), curves and surfaces have to be represented mathematically, so that they can be manipulated and visualized easily. In 1962, Bezier and de Casteljau, working for French car companies Renault and Citroen, independently developed Bezier curves for fitting curves and surfaces. Similar work, using bicubic splines, was done in USA at general motors by Garret Birkhoff and Henry Garabedian.

Today, Bezier curves and spline functions are used extensively in all aircraft and automotive industries. Spline functions can also be used in the numerical treatment of boundary value problems for differential equations. Bezier curves have found use in computer graphics and typography. Trutype font glyphs are made of quadratic bezier curves.

Bernstein Polynomials and Bezier Curves

Parametric curves are often used find the functional form of a curve given geometrically by a set of points \(p_i \in \mathbf{R}^d\), \(i=0:n\).

Let \(c(t) \in \mathbf{R}^d\), \(t\in[0,1]\), be a parameteric curve. In the simplest case, \(n=1\), we take \(c(t)\) to be linear:

\[ c(t) = (1-t)p_0 + tp_1 \]

and connecting the two points \(p_0\) and \(p_1\), so that \(p(0)=c_0\) and \(p_1 = c(1)\). It is the parametric equation for a straight-line.

For \(n > 1\), this will not give a smooth curve and is therefore of limited interest.

We can generalize this approach and take \(c(t)\) to be a polynomial of degree \(n\):

\[ c(t) =\sum_{i=0}^{n-1} p_i B_i(t),\quad t\in[0,1] \]

where \(B_i(t)\), \(i=0 : n\) are the Bernstein polynomials defined by :

\[ B_i^{n}(t) = {n \choose i} t^{i} (1-t)^{n-i}, \quad i=0:n \]

Using the binomial theorem, we have:

\[ 1 = ((1-t) + t)^n = \sum_{i=0}^n {n \choose i}t^i (1-t)^{n-i} = \sum_{i=0}^n B_i^{n}(t) \]

Thus, the Bernstein polynomials of degree \(n\) are non-negative on \([0,1]\) and give a partition of unity.

For \(n=3\), the four cubic Bernstein polynmials are:

\[ \begin{align*} B_0^3 &= (1-t)^3\\ B_1^3 &= 3(1-t)^2 t\\ B_2^3 &= 3(1-t)t^2\\ B_3^3 &= t^3 \end{align*} \]

using Plots
using LaTeXStrings

B₀(t) = (1-t)^3
B₁(t) = 3*(1-t)^2*t
B₂(t) = 3*(1-t)*t^2
B₃(t) = t^3

plot([B₀, B₁, B₂, B₃], 0.0, 1.0, label=[L"B_0" L"B_1" L"B_2" L"B_3"])

Some important properties of the Bernstein polynomials are given in the following theorem.

Theorem 2 (Berstein polynomial properties) The Bernstein polynomials \(B_i^{n}(t)\) have the following properties:

Non-negativity: \(B_i^{n}(t) > 0\), \(t\in(0,1)\)
Symmetry: \(B_i^{n}(t)=B_{n-i}^{n}(1-t)\)
\(B_i^{n}(t)\) has a root \(t=0\) of multiplicity \(i\) and a root \(t=1\) of multiplicity \(n-i\)
The Bernstein polynomials \(B_i^{n}(t)\) have a unique maximum value at \(t=i/n\) on \([0,1]\)
The Bernstein polynomials satisfy the following recursion formula: \[ \begin{align*} B_i^n(t) = (1-t)B_i^{n-1}(t) + tB_{i-1}^{n-1}(t),\quad i=0:n \end{align*} \tag{6}\]

The Bernstein polynomials of degree \(n\) form a basis for the space of polynomials of degree \(\leq n\).

Proof.

Non-negativity: For \(t\in[0,1]\), \(0<1-t<1\), so \(B_i^{n}(t) \geq 0\).

Symmetry: Since \({n \choose k} = {n \choose n-k}\), we have:

\[ B_k^{n}(t) = {n \choose k}t^k (1-t)^{n-k} = {n \choose (n-k)} (1-t)^{n-k} t^k = B_{n-k}^n(1-t) \]

Roots. By definition, \(B_k^{n}(t) = {n \choose k}t^k (1-t)^{n-k}\) so it has a root \(t=0\) with multiplicity \(k\) and a root \(t=1\) with multiplicity \((n-k)\).

Moreover, differentiating \(B_k^{n}(t)\) with respect to \(t\), setting the first derivative equal to \(0\), we have:

\[ \begin{align*} \frac{d}{dt}(B_k^{n}(t)) &= {n \choose k} kt^{k-1} (1-t)^{n-k} - (n-k)t^k(1-t)^{n-k-1} = 0 \\ 0 &= k(1-t) - (n-k)t \\ 0 &= k - kt - nt + kt \\ nt &= k \\ t &= \frac{k}{n} \end{align*} \]

Consider the combinatorial identity:

\[ {n \choose k} = {n-1 \choose k} + {n - 1\choose k - 1} \]

Assume that we would like to assemble team of size \(k\) from a population of size \(n\). There are \({n \choose k}\) distinguishable teams. Another way to count is as follows. Label one member of the population as president. Then, there are \({n - 1 \choose k}\) distinguishable teams that always include the president and \({n - 1 \choose k - 1}\) distinct teams excluding the president. The sum must equal \({n \choose k}\).

We can use this to prove the recursion formula:

\[ \begin{align*} {n \choose k}t^k(1-t)^{n-k} &= {n-1 \choose k}t^k(1-t)^{n-k} + {n - 1\choose k - 1}t^k(1-t)^{n-k}\\ &= (1-t) {n-1 \choose k}t^k(1-t)^{n-1-k} + t{n - 1\choose k - 1}t^{k-1}(1-t)^{(n-1)-(k-1)}\\ &= (1-t)B_{k}^{n-1}(t) + tB_{k-1}^{n-1}(t) \end{align*} \]

To show the linear independence, we observe that if:

\[ \begin{align*} \sum_{i=0}^n a_i B_i^{n}(t) &\equiv 0 \end{align*} \tag{7}\]

for all \(t \in [0,1]\). Then, expanding and substituting \(t=1\) in the above expression, we have:

\[ \begin{align*} \sum_{i=0}^n a_i B_i^{n}(i) &= 0\\ a_0 (1-t)^n + a_1 {n \choose 1} (1-t)^{n-1}t + \ldots + a_n t^n &= 0\\ a_n &= 0 \end{align*} \]

Substituting \(a_n = 0\) in Equation 7, we get:

\[ \begin{align*} a_0 (1-t)^n + a_1 {n \choose 1} (1-t)^{n-1} t + \ldots + a_{n-1}(1-t)t^{n-1} &= 0\\ a_0 (1-t)^{n-1} + a_1 {n \choose 1} (1-t)^{n-2} t + \ldots + a_{n-1}t^{n-1} &= 0 \end{align*} \]

Again, subbing \(t=1\), we find that \(a_{n-1}=0\). By repeatedly dividing by \((1-t)\) and using the same argument, we find that:

\[ a_0 = a_1 = \ldots = a_n = 0 \]

This closes the proof. \(\blacksquare\)

The unique parametric Bezier curve corresponding to a given set of \(n+1\) control points \(p_i\), \(i=0:n\), equals:

\[ \begin{align*} c(t) = \sum_{i=0}^n p_i B_i^{n}(t), \quad t\in[0,1] \end{align*} \tag{8}\]

where \(B_i^{n}(t)\) are Bernstein polynomials of degree \(n\). By property 3, in Theorem 2, the Bezier curve interpolates the first and last control points \(p_0\) and \(p_n\). Often, a curve is constructed by smoothly patching together several Bezier curves of low order.

Intuition

The case \(n=1\)

Imagine a particle travelling in a straight line joining the points \(p_0=(x_0,y_0)\) and \(p_1=(x_1,y_1)\), where the time \(t\in[0,1]\). Let’s compute the position \(c(t)=(x(t),y(t))\) of the particle at time \(t\). The point \(c(t)\) divides the straight-line \(p_0p_1\) in the proportion \(t:(1-t)\). By the section formula, the position vector of \(c(t)\) is:

\[ \begin{align*} (x(t),y(t)) &= ((1-t)x_0 + tx_1, (1-t)y_0 + ty_1)\\ &= (1-t)(x_0,y_0) + t(x_1,y_1)\\ &= (1-t)p_0 + tp_1 \end{align*} \]

This is the parametric equation of motion of the particle. It is the Bezier curve for \(n=1\). The points \(p_0\) and \(p_1\) control what the straight-line path looks like, so they are called control-points.

using Plots
using LaTeXStrings

function bezier_1(t, a, b)
    @. (1-t)*a + t*b
end

tvec = range(0.0, 1.0, 101)
pts = [
    0.0 1.0;
    1.0 0.0;
]

x = bezier_1(tvec, pts[1,1], pts[2,1])
y = bezier_1(tvec, pts[1,2], pts[2,2])

@gif for (xVal, yVal) in zip(x,y)
    plot(x, y, line=(:path,:dash,:gray))
    scatter!([xVal], [yVal], marker=(:circle,3,:green,:green))
end

[ Info: Saved animation to C:\Data\dev\repo\quantinsights.github.io\posts\interpolation-and-approximation\tmp.gif

The case \(n=2\)

Assume that, we have 3 points \(p_0=(0.0, 1.0)\), \(p_1=(1.3, 1.3)\) and \(p_2=(1.0, 0.0)\). Imagine a red particle \(p_R\) moving from \(p_0\) towards \(p_1\) on a straight-line, a blue particle \(p_B\) moving from \(p_1\) towards \(p_2\). Suppose, a third green particle \(p_G\) moves along the straight-line joining the blue and green particles instantaneously:

Show the code

function bezier_2(t, a, b, c)
    @. (1-t)^2 *a + 2*(1-t)*t*b + t^2 * c
end

pts = [
    0.0 1.0;
    1.3 1.3;
    1.0 0.0;
]

xRed = bezier_1(tvec, pts[1,1], pts[2,1])
yRed = bezier_1(tvec, pts[1,2], pts[2,2])

xBlue = bezier_1(tvec, pts[2,1], pts[3,1])
yBlue = bezier_1(tvec, pts[2,2], pts[3,2])

xGreen = bezier_2(tvec, pts[1,1], pts[2,1], pts[3,1])
yGreen = bezier_2(tvec, pts[1,2], pts[2,2], pts[3,2])

@gif for ((xr, yr),(xb,yb),(xg,yg)) in zip(zip(xRed,yRed),zip(xBlue,yBlue),zip(xGreen,yGreen))
        plot(xRed, yRed, line=(:path,:dash,:gray))
        scatter!([xr], [yr], marker=(:circle,3,:red,:red))
        plot!(xBlue, yBlue, line=(:path,:dash,:gray))
        scatter!([xb], [yb], marker=(:circle,3,:blue,:blue))
        x = bezier_1(tvec, xr, xb)
        y = bezier_1(tvec, yr, yb)
        plot!(x, y, line=(:path,:dash,:gray))
        plot!(xGreen, yGreen, line=(:path,:solid,:gray))
        scatter!([xg], [yg], marker=(:circle,3,:green,:green))
end

[ Info: Saved animation to C:\Data\dev\repo\quantinsights.github.io\posts\interpolation-and-approximation\tmp.gif

How might we compute the trajectory of the green particle? By a double-application of the section formula, we have:

\[ \begin{align*} p_G(t) &= (1-t)\cdot p_R(t) + t\cdot p_B(t)\\ &= (1-t)((1-t)p_0 + tp_1) + t((1-t)p_1 + t p_2)\\ &= (1-t)^2 p_0 +2t(1-t)p_1 + t^2 p_2 \end{align*} \]

Thus, the trajectory of the green particle is a quadratic Bezier curve with \(n+1=3\) control points. The quadratic Bezier curve interpolates between the points \(p_0\) and \(p_2\), whereas \(p_1\) is an off-curve point.

Computer graphics(CG) aficionados reserve the term control point for an off-curve point such as \(p_1\) and refer to the on-curve points, as anchor points. In CG editors such as Adobe Illustrator, not all control points are known in advance. The shape of the quadratic curve is controlled by moving around the control points using the Pen tool (Bezier tool) until the curve has the desired shape. Thus, using the Bernstein basis to represent degree 2 polynomials is advantageous. Moving \(p_1\) has a direct and intuitive effect on the curve.

The Bezier polygon is the closed piecewise linear curve connecting the control points \(p_i\) and \(p_{i+1}\), \(i=0:n-1\) and finally \(p_n\) and back to \(p_0\). This polygon provides a rough idea of the shape of the curve. From the definition(Equation 8) of the Bezier curve, it follows that for all \(t\in[0,1]\), the curve \(c(t)\) is a convex combination of the control points. Therefore, \(c(t)\) lies within the convex hull of the control points.

Theorem 3 The Bezier curve \(c(t)\) is tangent to \(p_1- p_0\) and \(p_n - p_{n-1}\) for \(t=0\) and \(t=1\) respectively.

Proof.

Spline Functions

The mathematical concept of spline functions was introduced in 1946 by Schoenberg. The importance of the B-spline basis for approximation was also first appreciated by Schoenberg. Today, B-Splines enable the mathematical representation of surfaces far beyond hand-techniques. In aircraft design computations, they may involve more than \(50,000\) data points.

Linear and Cubic Splines

We start by formally defining a spline function of order \(k \geq 1\).

Definition 1 A spline function \(S(x)\) of order \(k \geq 1\) (degree \(k-1 \geq 0\)), on a grid

\[ \Delta = \{a=x_0 < x_1 < \ldots < x_n = b\} \]

of distinct knots is a real function \(s\) with the following properties:

For \(x \in [x_i, x_{i+1}]\), \(i=0:m-1\), \(S(x)\) is a polynomial of degree \(<k\).