Tridiagonal Systems

Numerical Methods
Author

Quasar

Published

November 15, 2024

Introduction

The special case of a system of linear equations that is tridiagonal, that is, has non-zero elements only on the main diagonal and the diagonals immediately above and below it, is one that occurs frequently. Also common are systems that are band-diagonal, with non-zero elements only along a few diagonals adjacent to the main diagonal (above and below).

For tridiagonal systems, \(LU\)-decomposition, forward substitution, and back substitution each take only \(O(n)\) operations, and the whole solution can be coded very concisely. In this blog post, I am going to explore solving tridiagonal matrix systems. I closely follow Daniel Duffy’s exposition in Chapter 13 of his excellent book Financial Instrument Pricing using C++.

Let \(A\) be an \(m \times n\) general banded matrix with \(kl\) subdiagonals and \(ku\) superdiagonals. Then, \(a_{ij}=0\) when \(i - j > kl\) or \(j - i > ku\). All non-zero elements are positioned on the main diagonal, the \(kl\) subdiagonals below it, and the \(ku\) superdiagonals above it.

  • A diagonal matrix is an \(n \times n\) band matrix with \(kl = ku = 0\).

  • A Toeplitz matrix is an \(n \times n\) matrix \(T_n=[t_{k,j};k,j=0,1,\ldots,n-1]\) where \(t_{k,j}=t_{k-j}\), so that its entries are constant along each diagonal. That is, a matrix of the form: \[ T_n = \begin{bmatrix} t_0 & t_{-1} & t_{-2} & \ldots & t_{-(n-1)}\\ t_1 & t_0 & t_{-1} & \ldots & t_{-(n-2)}\\ t_2 & t_1 & t_{0} & \ldots & t_{-(n-3)}\\ \vdots & & & \ddots & \\ t_{n-1} & t_{n-2} & t_{n-3} & \ldots & t_{0} \end{bmatrix} \]

  • A tridiagonal (Jacobi) matrix is an \(n \times n\) band matrix of width three (\(kl = ku = 1\)); see the Julia sketch after this list. \[ \begin{bmatrix} b_0 & c_0 & 0 & \ldots \\ a_1 & b_1 & c_1 & \ldots \\ 0 & a_2 & b_2 & \ldots \\ & & & \ldots \\ & & & \ldots & a_{n-2} & b_{n-2} & c_{n-2}\\ & & & \ldots & 0 & a_{n-1} & b_{n-1} \end{bmatrix} \]
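These banded structures map directly onto types in Julia’s LinearAlgebra standard library. A minimal sketch (the numerical values are arbitrary, purely for illustration):

using LinearAlgebra

# Diagonal matrix: kl = ku = 0
D = Diagonal([1.0, 2.0, 3.0, 4.0, 5.0])

# Tridiagonal (Jacobi) matrix: kl = ku = 1
a = fill(1.0, 4)    # subdiagonal, length n-1
b = fill(-2.0, 5)   # main diagonal, length n
c = fill(1.0, 4)    # superdiagonal, length n-1
T = Tridiagonal(a, b, c)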

Consider a two-point boundary value problem on the interval \((0,1)\) with Dirichlet boundary conditions:

\[ \begin{align*} \frac{d^2 u}{d x^2} &= f(x), \quad 0 < x < 1\\ u(0) &= \phi, \\ u(1) &= \psi \end{align*} \tag{1}\]

We approximate the solution \(u\) on a discrete mesh of points \(\{x_j\}\), \(j=0,\ldots,N\), where \(N\) is a positive integer, \(h = 1/N\) is the mesh spacing, and \(x_j = jh\). At each interior mesh point, the second derivative in Equation 1 can be approximated by the second-order central difference \((U_{j-1} - 2U_j + U_{j+1})/h^2\). The corresponding discrete scheme is:

\[ \begin{matrix} U_0 &- 2U_1 &+ U_2 & & & & & & & &= h^2 f_1 \\ & U_1 &- 2U_2 &+ U_3 & & & & & & &= h^2 f_2 \\ & & U_2 &- 2U_3 &+ U_4 & & & & & &= h^2 f_3 \\ & & & & & \ldots & & & & & \vdots \\ & & & & & \ldots & U_{N-3} &- 2U_{N-2} &+ U_{N-1} & &= h^2 f_{N-2}\\ & & & & & \ldots & & U_{N-2} &-2 U_{N-1} &+ U_N &= h^2 f_{N-1}\\ \end{matrix} \]

Since \(U_0 = \phi\) and \(U_N = \psi\), we have \(N-1\) equations in \(N-1\) unknowns. These can be arranged in matrix form as:

\[ \begin{bmatrix} -2 & 1\\ 1 &-2 & 1 & & \ldots & & & \\ & 1 &-2 & 1 & \ldots & & & \\ & & & & \ldots & & & \\ & & & & \ldots & 1 & -2 & 1 \\ & & & & \ldots & & 1 & -2 \end{bmatrix}\begin{bmatrix} U_1 \\ U_2 \\ \vdots\\ U_{N-2} \\ U_{N-1} \end{bmatrix} = \begin{bmatrix} h^2 f_1 - \phi\\ h^2 f_2 \\ \vdots\\ h^2 f_{N-2} \\ h^2 f_{N-1} - \psi \end{bmatrix} \]

or compactly as \(AU=F\).
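To make this concrete, here is a minimal Julia sketch of the assembled system for an illustrative forcing with a known solution, \(f(x) = -\pi^2 \sin(\pi x)\) with \(\phi = \psi = 0\), so that \(u(x) = \sin(\pi x)\) (the value \(N = 100\) is arbitrary):

using LinearAlgebra

N = 100
h = 1/N
x = [j*h for j in 1:N-1]                  # interior mesh points
f = -π^2 .* sin.(π .* x)                  # forcing chosen so u(x) = sin(πx)
ϕ, ψ = 0.0, 0.0

# Assemble A and F as in the matrix equation above
A = Tridiagonal(fill(1.0, N-2), fill(-2.0, N-1), fill(1.0, N-2))
F = h^2 .* f
F[1] -= ϕ
F[end] -= ψ

U = A \ F                                 # banded direct solve
maximum(abs.(U .- sin.(π .* x)))          # discretization error, O(h²)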

Thomas Algorithm

The Thomas algorithm is an efficient way of solving tridiagonal matrix systems. It is based on \(LU\)-decomposition, in which the matrix system \(Ax=r\) is written as \(LUx=r\), where \(L\) is a lower triangular matrix and \(U\) is an upper triangular matrix. The system can be solved efficiently by setting \(Ux=\rho\) and then solving first \(L\rho=r\) for \(\rho\) and then \(Ux=\rho\) for \(x\). The Thomas algorithm consists of two steps. In step 1, decomposing the matrix \(A = LU\) and solving \(L\rho=r\) are accomplished in a single downward sweep, taking us straight from \(Ax=r\) to \(Ux=\rho\). In step 2, the equation \(Ux = \rho\) is solved for \(x\) in an upward sweep.
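To see why this is cheap for a tridiagonal \(A\), note that the factors are themselves bidiagonal. As a sketch for a \(4 \times 4\) system (the general pattern is identical):

\[ \begin{bmatrix} b_1 & c_1 & & \\ a_2 & b_2 & c_2 & \\ & a_3 & b_3 & c_3 \\ & & a_4 & b_4 \end{bmatrix} = \begin{bmatrix} 1 & & & \\ l_2 & 1 & & \\ & l_3 & 1 & \\ & & l_4 & 1 \end{bmatrix} \begin{bmatrix} u_1 & c_1 & & \\ & u_2 & c_2 & \\ & & u_3 & c_3 \\ & & & u_4 \end{bmatrix} \]

Matching entries gives \(u_1 = b_1\), \(l_j = a_j/u_{j-1}\) and \(u_j = b_j - l_j c_{j-1}\), so the factorization costs only \(O(n)\) operations.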

Stage 1

In the first stage, the matrix equation \(Ax=r\) is converted to the form \(Ux=\rho\). Initially, the matrix equation looks like this:

\[ \begin{bmatrix} {\color{blue}b_1} & {\color{blue}c_1} & 0 & 0 & 0 & 0\\ {\color{blue}a_2} & {\color{blue}b_2} & {\color{blue}c_2} & 0 & 0 & 0\\ 0 & {\color{blue}a_3} & {\color{blue}b_3} & {\color{blue}c_3} & 0 & 0\\ 0 & 0 & {\color{blue}a_4} & {\color{blue}b_4} & {\color{blue}c_4} & 0\\ 0 & 0 & 0 & {\color{blue}a_5} & {\color{blue}b_5} & {\color{blue}c_5}\\ 0 & 0 & 0 & 0 & a_6 & b_6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} {\color{blue}r_1} \\ {\color{blue}r_2} \\ {\color{blue}r_3} \\ {\color{blue}r_4} \\ {\color{blue}r_5} \\ {\color{blue}r_6} \end{bmatrix} \]

Row \(1\):

\[ {\color{blue}b_1} x_1 + {\color{blue}c_1} x_2 = {\color{blue}r_1} \]

Dividing throughout by \(\color{blue}b_1\),

\[ x_1 + {\color{blue}\frac{c_1}{b_1}} x_2 = {\color{blue}\frac{r_1}{b_1}} \]

Rewrite:

\[ x_1 + {\color{red}\gamma_1} x_2 = {\color{red}\rho_1}, \quad {\color{red}\gamma_1} = {\color{blue}\frac{c_1}{b_1}}, \quad {\color{red}\rho_1} = {\color{blue}\frac{r_1}{b_1}} \]

\[ \begin{bmatrix} {\color{red}1} & {\color{red}\gamma_1} & 0 & 0 & 0 & 0\\ {\color{blue}a_2} & {\color{blue}b_2} & {\color{blue}c_2} & 0 & 0 & 0\\ 0 & {\color{blue}a_3} & {\color{blue}b_3} & {\color{blue}c_3} & 0 & 0\\ 0 & 0 & {\color{blue}a_4} & {\color{blue}b_4} & {\color{blue}c_4} & 0\\ 0 & 0 & 0 & {\color{blue}a_5} & {\color{blue}b_5} & {\color{blue}c_5}\\ 0 & 0 & 0 & 0 & a_6 & b_6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} {\color{red}\rho_1} \\ {\color{blue}r_2} \\ {\color{blue}r_3} \\ {\color{blue}r_4} \\ {\color{blue}r_5} \\ {\color{blue}r_6} \end{bmatrix} \]

Row \(2\):

\[ {\color{blue}a_2} x_1 + {\color{blue}b_2} x_2 + {\color{blue}c_2} x_3 = {\color{blue}r_2} \]

Use \(a_2\) times row \(1\) of the matrix to eliminate the first term

\[ a_2(x_1 + {\color{red}\gamma_1}x_2 = {\color{red}\rho_1}) \]

\[ \begin{array}{c|cccc} \text{Row 2} & a_2 x_1 &+ b_2 x_2 &+ c_2 x_3 &= r_2\\ a_2 \times \text{Row 1} & a_2 x_1 &+ a_2 \gamma_1 x_2 & &= a_2\rho_1\\ \hline \text{New Row 2} & & (b_2 - a_2 \gamma_1) x_2 &+ c_2 x_3 &= r_2 - a_2 \rho_1 \end{array} \]

Dividing throughout by \((b_2 - a_2 \gamma_1)\), we get:

\[ x_2 + \frac{c_2}{b_2 - a_2 \gamma_1}x_3 = \frac{(r_2 - a_2 \rho_1)}{(b_2 - a_2 \gamma_1)} \]

We can rewrite this as:

\[ x_2 + \gamma_2 x_3 = \rho_2, \quad \gamma_2 = \frac{c_2}{b_2 - a_2 \gamma_1}, \quad \rho_2 = \frac{(r_2 - a_2 \rho_1)}{(b_2 - a_2 \gamma_1)} \]

\[ \begin{bmatrix} 1 & \gamma_1 & 0 & 0 & 0 & 0\\ 0 & 1 & \gamma_2 & 0 & 0 & 0\\ 0 & a_3 & b_3 & c_3 & 0 & 0\\ 0 & 0 & a_4 & b_4 & c_4 & 0\\ 0 & 0 & 0 & a_5 & b_5 & c_5\\ 0 & 0 & 0 & 0 & a_6 & b_6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} \rho_1 \\ \rho_2 \\ r_3 \\ r_4 \\ r_5 \\ r_6 \end{bmatrix} \]

Row \(3\):

\[ a_3 x_2 + b_3 x_3 + c_3 x_4 = r_3 \]

Use \(a_3\) times row \(2\) of the matrix to eliminate the first term:

\[ \begin{array}{c|cccc} \text{Row 3} & a_3 x_2 &+ b_3 x_3 &+ c_3 x_4 &= r_3\\ a_3 \times \text{Row 2} & a_3 x_2 &+ a_3 \gamma_2 x_3 & &= a_3\rho_2\\ \hline \text{New Row 3} & & (b_3 - a_3 \gamma_2) x_3 &+ c_3 x_4 &= r_3 - a_3 \rho_2 \end{array} \]

Dividing throughout by \((b_3 - a_3 \gamma_2)\), we have:

\[ x_3 + \frac{c_3}{b_3 - a_3 \gamma_2} x_4 = \frac{r_3 - a_3\rho_2}{b_3 - a_3 \gamma_2} \]

We can rewrite this as:

\[ x_3 + \gamma_3 x_4 = \rho_3, \quad \gamma_3 = \frac{c_3}{b_3 - a_3 \gamma_2}, \quad \rho_3=\frac{r_3 - a_3 \rho_2}{b_3 - a_3 \gamma_2} \]

Continuing in this fashion, we get:

\[ \begin{bmatrix} 1 & \gamma_1 & 0 & 0 & 0 & 0\\ 0 & 1 & \gamma_2 & 0 & 0 & 0\\ 0 & 0 & 1 & \gamma_3 & 0 & 0\\ 0 & 0 & 0 & 1 & \gamma_4 & 0\\ 0 & 0 & 0 & 0 & 1 & \gamma_5\\ 0 & 0 & 0 & 0 & a_6 & b_6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} \rho_1 \\ \rho_2 \\ \rho_3 \\ \rho_4 \\ \rho_5 \\ r_6 \end{bmatrix} \]

Row \(6\):

\[ a_6 x_5 + b_6 x_6 = r_6 \]

Use \(a_6\) times row 5 of the matrix:

\[a_6(x_5 + \gamma_5 x_6 = \rho_5)\]

\[ \begin{array}{c|ccc} \text{Row 6} & a_6 x_5 &+ b_6 x_6 &= r_6\\ a_6 \times \text{Row 5} & a_6 x_5 &+ a_6 \gamma_5 x_6 &= a_6\rho_5\\ \hline \text{New Row 6} & & (b_6 - a_6 \gamma_5) x_6 &= r_6 - a_6 \rho_5 \end{array} \]

Dividing throughout by \((b_6 - a_6 \gamma_5)\), we can rewrite:

\[ x_6 = \rho_6, \quad \rho_6 = \frac{r_6 - a_6 \rho_5}{b_6 - a_6 \gamma_5} \]

\[ \begin{bmatrix} 1 & \gamma_1 & 0 & 0 & 0 & 0\\ 0 & 1 & \gamma_2 & 0 & 0 & 0\\ 0 & 0 & 1 & \gamma_3 & 0 & 0\\ 0 & 0 & 0 & 1 & \gamma_4 & 0\\ 0 & 0 & 0 & 0 & 1 & \gamma_5\\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} \rho_1 \\ \rho_2 \\ \rho_3 \\ \rho_4 \\ \rho_5 \\ \rho_6 \end{bmatrix} \]

These steps may be summarized as computing the following sequences:

\[ \gamma_1 = \frac{c_1}{b_1}, \quad \rho_1 = \frac{r_1}{b_1} \]

and \[\gamma_j = \frac{c_j}{b_j - a_j \gamma_{j-1}}, \quad \rho_j = \frac{r_j - a_j \rho_{j-1}}{b_j - a_j \gamma_{j-1}}\]

for \(j=2:6\).

At this point, the matrix has been reduced to upper triangular (indeed, upper bidiagonal) form, so our equations are of the form \(Ux = \rho\).

Stage 2

The system is now in a form that is trivial to solve for \(x\). We start with the last row and work our way up. The final equation is already solved:

\[ x_6 = \rho_6 \]

\[ \begin{bmatrix} 1 & \gamma_1 & 0 & 0 & 0 & 0\\ 0 & 1 & \gamma_2 & 0 & 0 & 0\\ 0 & 0 & 1 & \gamma_3 & 0 & 0\\ 0 & 0 & 0 & 1 & \gamma_4 & 0\\ 0 & 0 & 0 & 0 & 1 & \gamma_5\\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} \rho_1 \\ \rho_2 \\ \rho_3 \\ \rho_4 \\ \rho_5 \\ \rho_6 \end{bmatrix} \]

Row \(5\): \[ x_5 + \gamma_5 x_6 = \rho_5 \]

Rearrange to get:

\[ x_5 = \rho_5 - \gamma_5 x_6 \]

Row \(4\):

\[ x_4 + \gamma_4 x_5 = \rho_4 \]

Rearrange to get:

\[ x_4 = \rho_4 - \gamma_4 x_5 \]

Continuing in this fashion, we find that \(x_6 = \rho_6\) and

\[ x_j = \rho_j - \gamma_j x_{j+1} \]

for \(j=5:-1:1\), i.e., working upwards from the last row.

Computational Solution

Let’s quickly code up the algorithm in Julia.

function thomasAlgorithm(a, b, c, r)
    # a: subdiagonal (a[1] unused), b: main diagonal,
    # c: superdiagonal (c[N] unused), r: right-hand side
    N = length(a)

    # Stage 1: forward sweep, computing γ and ρ
    γ = Array{Float64,1}(undef,N)
    ρ = Array{Float64,1}(undef,N)
    u = Array{Float64,1}(undef,N)

    γ[1] = c[1]/b[1]
    ρ[1] = r[1]/b[1]

    for j=2:N
        γ[j] = c[j]/(b[j] - a[j] * γ[j-1])
        ρ[j] = (r[j] - a[j] * ρ[j-1])/(b[j] - a[j] * γ[j-1])
    end

    # Stage 2: back substitution
    u[N] =  ρ[N]

    for j=reverse(1:N-1)
        u[j] = ρ[j] - γ[j] * u[j+1]
    end

    return u
end

# Test Case

a = Array{Float64,1}([0, 2, 2, 2])
b = Array{Float64,1}([3, 3, 3, 3])
c = Array{Float64,1}([2, 2, 2, 0])
r = Array{Float64,1}([12, 17, 14, 7])
u = thomasAlgorithm(a, b, c, r)

print(u)
[2.0, 3.0, 1.9999999999999996, 1.0000000000000004]
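As a sanity check, the same system can be handed to Julia’s built-in tridiagonal solver, which should agree with the Thomas algorithm up to rounding:

using LinearAlgebra

A = Tridiagonal(a[2:end], b, c[1:end-1])  # drop the unused padding entries a[1] and c[end]
A \ r                                     # ≈ [2.0, 3.0, 2.0, 1.0]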

Here is an implementation in modern C++:

#include <iostream>
#include <vector>
#include <functional>
#include <concepts>

template <typename T>
concept Real = std::integral<T> || std::floating_point<T>;

template <Real T>
using Function = std::function<void(std::vector<T>, std::vector<T>, std::vector<T>, std::vector<T>, std::vector<T>&)>;

template <typename T>
requires Real<T>
void thomasAlgorithm(
      std::vector<T> a
    , std::vector<T> b
    , std::vector<T> c
    , std::vector<T> r
    , std::vector<T>& x
    ){
    //Stage-1
    int N = a.size();
    std::vector<T> gamma(N);
    std::vector<T> rho(N);
    x = std::vector<T>(N);

    gamma[0] = c[0]/b[0]; 
    rho[0] = r[0]/b[0];

    for(int j{1}; j < N; ++j)
    {
        gamma[j] = c[j]/(b[j] - a[j] * gamma[j-1]);
        rho[j] = (r[j] - a[j] * rho[j-1])/(b[j] - a[j] * gamma[j-1]);
    }

    //Stage-2
    x[N-1] = rho[N-1];
    for(int j{N-2}; j >= 0; j--)
    {
        x[j] = rho[j] - gamma[j] * x[j+1];
    }
}

template <typename T>
requires Real<T>
class LUTridiagonalSolver{
    private:
        std::vector<T> m_a;
        std::vector<T> m_b;
        std::vector<T> m_c;
        std::vector<T> m_r;
        std::vector<T> m_x;
        Function<T> m_LUTridiagonalSolverStrategy;
    
    public:
        LUTridiagonalSolver() = default;
        LUTridiagonalSolver(
              std::vector<T> a
            , std::vector<T> b
            , std::vector<T> c
            , std::vector<T> r
            , Function<T> solver) 
            : m_a {std::move(a)}
            , m_b {std::move(b)}
            , m_c {std::move(c)}
            , m_r {std::move(r)}
            , m_LUTridiagonalSolverStrategy {solver} 
            {}

        std::vector<T> solve(){
            m_LUTridiagonalSolverStrategy(m_a, m_b, m_c, m_r, m_x);
            return m_x;
        }

        LUTridiagonalSolver(const LUTridiagonalSolver& ) = delete;
        LUTridiagonalSolver& operator=(const LUTridiagonalSolver&) = delete;
        ~LUTridiagonalSolver(){}
};

int main()
{
    std::vector<double> a{0, 2, 2, 2};
    std::vector<double> b{3, 3, 3, 3};
    std::vector<double> c{2, 2, 2, 0};
    std::vector<double> r{12, 17, 14, 7};

    LUTridiagonalSolver<double> solver(a, b, c, r, thomasAlgorithm<double>);
    std::vector<double> u = solver.solve();

    // Print the solution; expect approximately 2 3 2 1
    for (double v : u) std::cout << v << ' ';
    std::cout << '\n';
    return 0;
}

Compiler Explorer

Deriving the one-dimensional Heat equation

Consider a slender homogeneous rod, lying along the \(x\)-axis and insulated so that no heat can escape across its longitudinal surface. In addition, we make the simplifying assumption that the temperature in the rod is constant on each cross-section perpendicular to the \(x\)-axis, so that the flow of heat in the rod takes place only in the \(x\)-direction.

Consider a small segment of the rod at position \(x\) of length \(\Delta x\).

The thermal energy in this segment at time \(t\) is:

\[ E(x,x+\Delta x, t) \approx u(x,t) s \rho \Delta x \]

where \(s\) is the specific heat, i.e., the amount of heat required to raise one unit of mass by one unit of temperature, and \(\rho\) is the mass density.

Fourier’s law of heat conduction quantifies the idea that heat flows from warmer to colder regions and states that the (rightward) heat flux density \(\phi(x,t)\) (the flow of heat energy per unit area per unit time, SI units \(J/s/m^2\)) at any point is:

\[ \phi(x,t) = -K_0 u_x (x, t) \]

where \(K_0\) is the thermal conductivity of the rod. The negative sign reflects that heat flows from higher-temperature regions to lower-temperature regions.

Appealing to the law of conservation of energy:

\[ \begin{align*} \underbrace{\frac{\partial}{\partial t}(u(x,t) s \rho \Delta x)}_{\text{Rate of change of thermal energy}} = \underbrace{(-K_0 u_x(x, t))}_{\text{Flux in}} - \underbrace{(- K_0 u_x(x + \Delta x,t))}_{\text{Flux out}} \end{align*} \tag{2}\]

Dividing throughout by \(s \rho \Delta x\), we have:

\[ \begin{align*} u_t(x,t) \approx \frac{K_0}{s \rho } \frac{u_x(x+\Delta x, t) - u_x(x,t)}{\Delta x} \end{align*} \]

Letting \(\Delta x \to 0\) improves the approximation and leads to the heat equation:

\[ u_t=c^2 u_{xx} \]

where \(c^2 = \frac{K_0}{\rho s}\) is called the thermal diffusivity.
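As a quick consistency check on the derivation, the separable functions \(u(x,t) = e^{-c^2 k^2 t}\sin(kx)\) satisfy the equation directly:

\[ u_t = -c^2 k^2 e^{-c^2 k^2 t}\sin(kx), \qquad u_{xx} = -k^2 e^{-c^2 k^2 t}\sin(kx), \qquad \text{so } u_t = c^2 u_{xx} \]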

The Crank-Nicolson and Theta methods

Consider the initial boundary value problem for the \(1\)d-heat equation:

\[ \begin{align*} u_t &= a^2 u_{xx}, \quad & 0 < x < L, t>0\\ u(x,0) &= f(x), \quad 0 \leq x \leq L \\ u(0,t) &= A, \\ u(L,t) &= B \end{align*} \]

In this case, we can assume without loss of generality that \(L = 1\). Here, \(a\), \(A\) and \(B\) are constants.

We find a solution to this system in the case when \(A = B = 0\) and \(a = 1\) by the method of separation of variables. For the triangular initial profile \(f(x) = 2x\) on \([0,1/2]\) and \(f(x) = 2(1-x)\) on \([1/2,1]\), whose Fourier sine coefficients produce the series below, the analytical solution is:

\[ u(x,t) = \frac{8}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\sin\left(\frac{n\pi}{2}\right)\sin(n\pi x)\exp(-n^2 \pi^2 t) \]

and we are going to use this solution as a benchmark against which the numerical solutions can be compared.
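For later comparisons, a minimal Julia sketch of this benchmark truncates the series at some cutoff (the choice of 200 terms is arbitrary; the factor \(\exp(-n^2 \pi^2 t)\) makes the tail negligible for \(t > 0\)):

function exact_solution(x, t; n_max = 200)
    s = 0.0
    for n in 1:n_max
        s += sin(n*π/2) * sin(n*π*x) * exp(-n^2 * π^2 * t) / n^2
    end
    return 8/π^2 * s
end

exact_solution(0.5, 0.1)   # midpoint temperature at t = 0.1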

We can discretize a parabolic PDE in the space dimension (using centered difference schemes) while keeping the time variable continuous. We examine the following initial boundary value problem for the \(1\)d-heat equation on the unit interval with zero Dirichlet boundary conditions.

The problem is:

\[ \begin{align*} u_t &= u_{xx}, \quad & 0 < x < 1, t>0\\ u(x,0) &= f(x), \quad 0 \leq x \leq 1 \\ u(0,t) &= u(1,t) = 0 \end{align*} \tag{3}\]

We partition the space interval \((0,1)\) into \(J\) subintervals and we approximate Equation 3 by the semi-discrete scheme:

\[ \begin{align*} \frac{dU_j}{dt} &= \frac{1}{h^2}(U_{j+1} - 2U_j + U_{j-1}), \quad 1 \leq j \leq J-1 \\ U_0(t) &= U_J(t) = 0, \quad t > 0 \\ U_j(0) &= f(x_j) \end{align*} \]

where \(h = 1/J\) is the constant mesh size. The \(U_j\)’s are functions of time \(t\). So, we define the following vectors:

\[ \begin{align*} U(t) &= (U_1(t),U_2(t),\ldots,U_{J-1}(t))^T \\ U^0 &= (f(x_1),f(x_2),\ldots,f(x_{J-1}))^T \end{align*} \]

Then, we can rewrite the semi-discrete scheme as a system of ordinary differential equations:

\[ \begin{align*} \frac{dU}{dt} &= AU\\ U(0) &= U^0 \end{align*} \tag{4}\]

where the matrix \(A\) is given by:

\[ A = \frac{1}{h^2}\begin{bmatrix} -2 & 1 & 0 & \ldots \\ 1 &-2 & 1 & \ldots \\ 0 & 1 & -2 & \ldots \\ & & & \ldots \\ & & & \ldots & 1 & -2 & 1\\ & & & \ldots & 0 & 1 & -2\\ \end{bmatrix} \]
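In Julia, this matrix is again a one-liner with the Tridiagonal type (the choice \(J = 10\) is illustrative):

using LinearAlgebra

J = 10
h = 1/J
A = (1/h^2) * Tridiagonal(fill(1.0, J-2), fill(-2.0, J-1), fill(1.0, J-2))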

There are many discretization schemes. I plan to explore various finite difference schemes and their application to derivatives pricing in future posts. For now, I will concentrate on the one-step explicit and implicit methods to discretize the system of ODEs (Equation 4) as:

\[ \begin{align*} \frac{U^{n+1} - U^n}{\Delta t} &= \theta AU^{n+1} + (1-\theta)AU^{n}, \quad 0 \leq n \leq N-1, \ 0 \leq \theta \leq 1 \\ U^{0} &= U(0) \end{align*} \tag{5}\]

In this case, \(\Delta t\) is the constant mesh size in time.

We can rewrite Equation 5 in the equivalent form:

\[ \begin{align*} U^{n+1} - U^{n} &= \theta \Delta t A U^{n+1} + (1- \theta)\Delta t AU^{n} \\ [I - \theta \Delta t A]U^{n+1} &= [I + (1 - \theta)\Delta t A]U^n \end{align*} \tag{6}\]

or formally as:

\[ \begin{align*} U^{n+1} = [I - \theta \Delta t A]^{-1} [I + (1 - \theta)\Delta t A] U^n \end{align*} \tag{7}\]

where \(I\) is the identity matrix.

Some special cases of \(\theta\) are:

\[ \begin{align*} \theta &= 1, \quad \text{Implicit Euler Scheme}\\ \theta &= 0, \quad \text{Explicit Euler Scheme}\\ \theta &= 1/2,\quad \text{Crank-Nicolson Scheme} \end{align*} \tag{8}\]

When the schemes are implicit, we can solve the system of equations (Equation 5) at each time level \(n+1\) using the Thomas algorithm. No matrix inversion is needed in the case of explicit schemes. The formulation (Equation 4) is called the method of lines and it corresponds to semi-discretization of the PDE in the space direction while keeping the time variable continuous (I will explore MOL in future posts).

We can write the scheme (Equation 5) in component form:

\[ \begin{align*} \frac{U^{n+1}_j - U^{n}_j}{\Delta t} = \theta(U^{n+1}_{j+1}-2U^{n+1}_j + U^{n+1}_{j-1})/h^2 + (1-\theta)(U^{n}_{j+1}-2U^{n}_j + U^{n}_{j-1})/h^2 \end{align*} \tag{9}\]

or equivalently:

\[ \begin{align*} U^{n+1}_j - U^{n}_j &= \lambda\theta(U^{n+1}_{j+1}-2U^{n+1}_j + U^{n+1}_{j-1}) \\ &+ \lambda(1-\theta)(U^{n}_{j+1}-2U^{n}_j + U^{n}_{j-1}) \end{align*} \]

where \(\lambda = \Delta t/h^2\).

Finally:

\[ \begin{align*} -\lambda \theta U^{n+1}_{j+1} + (1+2\lambda \theta)U_j^{n+1} - \lambda \theta U^{n+1}_{j-1} \\= \lambda (1-\theta)U^{n}_{j+1}+(1-2\lambda(1-\theta))U^{n}_j + \lambda(1-\theta)U^{n}_{j-1} \end{align*} \tag{10}\]

The system (Equation 10) is tridiagonal, so we can apply the Thomas algorithm to solve it at each time level. In the case of the explicit Euler scheme \((\theta = 0)\), no tridiagonal solve is needed, because the solution at time level \(n+1\) can be computed explicitly:

\[ U^{n+1}_j = \lambda U^{n}_{j+1} + (1-2\lambda)U^{n}_j +\lambda U^{n}_{j-1} \]
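Putting the pieces together, here is a minimal sketch of a single \(\theta\)-scheme time step, reusing the thomasAlgorithm function from the Julia code above (the helper name theta_step is my own; U holds the interior values \(U_1,\ldots,U_{J-1}\)):

function theta_step(U, λ, θ)
    M = length(U)                        # number of interior unknowns, J-1

    # Tridiagonal coefficients of the implicit side of Equation 10
    a = fill(-λ*θ, M); a[1] = 0.0        # subdiagonal (a[1] unused)
    b = fill(1 + 2λ*θ, M)                # main diagonal
    c = fill(-λ*θ, M); c[end] = 0.0      # superdiagonal (c[end] unused)

    # Explicit side, with the boundary values U_0 = U_J = 0 padded in
    Up = [0.0; U; 0.0]
    r = [λ*(1-θ)*Up[j+2] + (1 - 2λ*(1-θ))*Up[j+1] + λ*(1-θ)*Up[j] for j in 1:M]

    # θ = 0 needs no solve; otherwise invoke the Thomas algorithm
    return θ == 0 ? r : thomasAlgorithm(a, b, c, r)
end

For example, one Crank-Nicolson step is theta_step(U, λ, 0.5) with \(\lambda = \Delta t/h^2\).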

Computational Solution