12 Partial Differential Equations
12.1 Partial differential equations: Overview
We will now consider approximation methods for partial differential equations (PDEs). An example for a PDE – familiar from the first-year Calculus course – is the heat equation: \[ \frac{\partial^2}{\partial x^2} u(x,t) = \kappa \frac{\partial}{\partial t} u(x,t) \tag{12.1}\] with some constant \(\kappa > 0\). PDEs like the above are defined on a certain region in the \(x\)-\(t\)-plane (or analogously in more than 2 dimensions), and are always complemented by some kind of boundary conditions on the boundary of that region.
We will try to generalize our methods for boundary value problems of ODEs to the case of PDEs. The BVP methods we have discussed are:
the Shooting method,
the Finite Difference method,
the Rayleigh-Ritz method.
Unfortunately, there is no obvious generalization of the Shooting method to PDEs; the concept of transforming a boundary value problem into an initial value problem does not work in this context.
However, the Finite Difference method can be generalized to PDEs. To that end, we would first need to define a suitable notion of mesh points. Instead of dividing an interval into equally spaced subintervals, we now need to divide a region in two variables (say, a rectangle) with an equally spaced grid of mesh points. Then, as before, one can replace the value of \(u\) at mesh points with an approximation value, and the derivatives of \(u\) with finite difference quotients, and finally solve a (linear) equation system to obtain the approximation values numerically. We will discuss this more in detail in later sections.
The Finite Difference method has limitations when the region in question is not as simple as a rectangle, but is irregularly shaped. In this case, it may not be possible to divide it reasonably with an equally spaced grid, or the mesh points at the edge of the grid may not be located exactly on the boundary (which poses problems when interpreting the boundary values). In these cases, generalizations of the Rayleigh-Ritz method can successfully be used, the so-called Finite Element methods. Note that in the Rayleigh-Ritz method, there was a large freedom of choosing the basis functions \(\phi_j\), and even when restricting to piecewise linear functions, the mesh points \(x_i\) did not need to be equally spaced. Likewise, in the multi-dimensional generalizations, one can exploit this freedom to adapt the choice of mesh points to a (possibly irregular) boundary. Finite Element methods are the most advanced numerical methods for solving PDEs, and we will not discuss them in detail here; see, e.g., (Burden and Faires 2010 Ch. 12.4) for an introduction.
12.2 Elliptic PDEs
In this section, we will generalize the Finite Difference method to a very specific PDE, namely the Poisson equation. This equation for a function \(u\) of two variables \(x\) and \(y\) has the form
\[ \begin{split} &\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x,y) \\ &\text{on } R := \{ (x,y) : a \leq x \leq b, c \leq y \leq d \},\\ &u(x,y) = g(x,y) \text{ for } x \in \partial R. \end{split} \tag{12.2}\] Here \(f\) and \(g\) are given functions of two variables, and \(\partial R\) denotes the boundary of \(R\), that is, the four line segments \[ \begin{split} &x=a, c \leq y \leq d; \quad x=b, c \leq y \leq d; \quad\\ &a \leq x \leq b, y=c; \quad a \leq x \leq b, y=d. \end{split} \tag{12.3}\]
The equation Eq. 12.2 is, in the case \(f=0\), also known as the Laplace equation. It is a typical example of the larger class of elliptic differential equations, and the methods we will discuss here apply to other elliptic equations as well.
We will try to set up a Finite Difference method to approximate the solution of Eq. 12.2, following our recipe from Section 11.3. To that end, we first have to specify what our mesh points are. Instead of partitioning the interval \([a,b]\), they will now need to partition the rectangle \(R\). To this end, we choose two numbers of steps, \(N\) and \(M\), associated with the \(x\) and \(y\) direction respectively, and two corresponding step sizes \[ h := \frac{b-a}{N+1}, \quad k := \frac{d-c}{M+1}. \] We then define \(N \times M\) mesh points \((x_i, y_j)\) as \[ (x_i, y_j) = (a + i h, c + j k), \quad 1 \leq i \leq N, \; 1 \leq j \leq M. \] These lie on a rectangular grid within the rectangle \(R\).
Our next step is to approximate the relevant derivatives of \(u\) at the mesh points. We use the centred difference formula Eq. 11.43 twice, once in \(x\) direction (at fixed \(y\)) and once in \(y\) direction (at fixed \(x\)). This gives \[ \begin{aligned} \frac{\partial^2 u}{\partial x^2} (x_i,y_j) = \frac{ u(x_{i+1},y_j) - 2 u(x_i,y_j) + u(x_{i-1},y_j) }{h^2} - \frac{h^2}{12} \frac{\partial^4 u}{\partial x^4}(\xi_i,y_j), \\ \frac{\partial^2 u}{\partial y^2} (x_i,y_j) = \frac{ u(x_{i},y_{j+1}) - 2 u(x_i,y_j) + u(x_{i},y_{j-1}) }{k^2} - \frac{k^2}{12} \frac{\partial^4 u}{\partial y^4}(x_i,\eta_j), \end{aligned} \tag{12.4}\] where \(\xi_i\) and \(\eta_j\) are some unknown points in the intervals.
Again following our recipe, we will approximate the solution at mesh points \(u(x_i,y_j)\) with approximation values \(w_{i,j}\). In the PDE Eq. 12.2, we then replace \(u(x_i,y_j)\) with \(w_{i,j}\) and the derivatives of \(u\) with the expressions Eq. 12.4, but leaving away the remainder terms of order \(O(h^2)+O(k^2)\). This yields the following equations for the \(w_{i,j}\): \[ \frac{ w_{i+1,j} - 2 w_{i,j} + w_{i-1,j} }{h^2} + \frac{ w_{i,j+1} - 2 w_{i,j} + w_{i,j-1} }{k^2} = f(x_i,y_j) \] for \(1 \leq i \leq N, \; 1 \leq j \leq M\). After multiplying with \(-h^2\), \[ 2 \Big(\big(\tfrac{h}{k}\big)^2+1\Big) w_{i,j} - w_{i+1,j} - w_{i-1,j} - \big(\tfrac{h}{k}\big)^2 w_{i,j+1} - \big(\tfrac{h}{k}\big)^2 w_{i,j-1} = -h^2 f(x_i,y_j). \tag{12.5}\]
The boundary conditions are then expressed by setting the values of \(w_{i,j}\) for \(i=0\), \(j=0\), \(i=N+1\) or \(j=M+1\) to the required boundary values: \[ \begin{aligned} w_{0,j} &= g(x_0,y_j), &w_{N+1,j} &= g(x_{N+1},y_j) & \text{for } 1 \leq j \leq M, \\ w_{i,0} &= g(x_i,y_0), &w_{i,M+1} &= g(x_{j},y_{M+1}) & \text{for } 1 \leq i \leq N. \end{aligned} \tag{12.6}\] Together, Eq. 12.5 and Eq. 12.6 form a linear equation system that can be solved to obtain approximation values \(w_{i,j}\). To that end, it may be useful to renumber the “double indices” \(i,j\) with a single index, setting, e.g., \(\ell := N(i-1)+j\). The index \(k\) then runs from \(1\) to \(NM\), and we may rewrite the equation system as \(\mathbf{A}\mathbf{w}= \mathbf{b}\) with a \(NM \times NM\) matrix \(\mathbf{A}\) and a vector \(\mathbf{b}\in\mathbb{R}^{NM}\).
As in the one-dimensional case, the matrix \(\mathbf{A}\) is sparse, that is, many of its entries are known to be zero. (It is not tridiagonal, however.) This makes the system \(\mathbf{A}\mathbf{w}= \mathbf{b}\) fast to solve. Nevertheless, it is evident that the number of mesh points (and hence of vector dimensions) can become rather large quite easily, which sets practical limits to the accuracy of this and other approximation methods for PDEs.
12.3 Parabolic PDEs
As a second class of PDEs to be treated with the Finite Difference method, we consider parabolic PDEs. A typical example – and the only one we will consider – is the one-dimensional heat equation (also known as diffusion equation). This equation for a function \(u\) of two variables \(x\) and \(t\) has the form1 \[ \begin{aligned} &\alpha^2 \frac{\partial^2 u}{\partial x^2} = \frac{\partial u}{\partial t} \quad \text{on } R := \{ (x,t) : 0 \leq x \leq L, 0 \leq t \},\\ &u(x,0) = g(x) \text{ for } 0 < x < L,\\ &u(0,t) = u(L,t) = 0 \text{ for all } t > 0. \end{aligned} \tag{12.7}\] The boundary of the region \(R\) consists of three pieces here. One also refers to the condition \(u(x,0) = g(x)\) as initial condition; as we will see, it has some similarities to initial conditions for ODEs.
In applications, \(u(x,t)\) might have the interpretation of a local temperature – say, in a homogeneous wall of width \(L\) – depending on the spatial position \(x\) and on time \(t\). The initial condition is the temperature in the wall at time 0, and the remaining boundary conditions represent the temperature of the environment, depending on time \(t\).
Again, we will set up a Finite Difference method to approximate the solution of Eq. 12.7, following our recipe from Section 11.3. As a first step, we restrict the region \(R\) to a rectangle, introducing the condition \(0 \leq t \leq T\) (with some fixed \(T>0\)). Then, as for the Laplace equation, we introduce two numbers of steps, \(N\) and \(M\), associated with the \(x\) and \(t\) direction respectively, and two corresponding step sizes \[ h := \frac{L}{N+1}, \quad k := \frac{T}{M} \] (note the slightly different convention for \(k\) from before). Once more, we define mesh points \((x_i, t_j)\) as \[ (x_i, t_j) = (i h, j k), \quad 0 \leq i \leq N+1, \; 0 \leq j \leq M. \]
The next step is to approximate the derivatives of \(u\), and here the very specific properties of the equation Eq. 12.7 show up. In the variable \(x\), we once more use the centred difference formula Eq. 11.43, leading to \[ \frac{\partial^2u}{\partial x^2}u(x_{i},t_j) = \frac{ u(x_{i+1},t_j) - 2 u(x_i,t_j) + u(x_{i-1},t_j) }{h^2} + O(h^2). \tag{12.8}\]
However, for the derivative by \(t\), we cannot follow the same approach. The problem here is that the centred difference formula for the first derivative would involve \(u(x_i,t_{j+1})\) and \(u(x_i,t_{j-1})\), but we have only one boundary (or initial) condition to fix the value if the mesh point is on the boundary of \(R\), and this would lead to an under-determined linear equation system later. As a way out, we use the (much simpler) forward difference formula, \[ \frac{\partial u}{\partial t}u(x_{i},t_j) = \frac{ u(x_{i},t_{j+1}) - u(x_i,t_j) }{k} + O(k). \tag{12.9}\]
Further following our recipe, we insert the finite difference formulas Eq. 12.8 and Eq. 12.9 into the PDE Eq. 12.7, then replace \(u(x_i,y_j)\) with approximation values \(w_{i,j}\) and leave away the remainder terms of order \(O(h^2)+O(k)\). This yields the following equations for the \(w_{i,j}\): \[ \alpha^2 \frac{ w_{i+1,j} - 2 w_{i,j} + w_{i-1,j} }{h^2} = \frac{ w_{i,j+1} - w_{i,j} }{k} , \quad 1 \leq i \leq N, \; 0 \leq j < M. \] or, setting \(\lambda := \alpha^2 k / h^2\), \[ w_{i,j+1} = (1-2\lambda) w_{i,j} + \lambda (w_{i+1,j}+w_{i-1,j}) \tag{12.10}\]
where \(w_{0,j}=w_{N+1,j}=0\) for all \(j\) (boundary condition) and \(w_{i,0} = g(x_i)\) for all \(i\) (initial condition).
An interesting point is that the linear equation system Eq. 12.10 is extremely easy to solve: Namely, inserting the known values \(w_{i,0}\) into the right-hand side gives us \(w_{i,1}\) for all \(i\); again inserting these into the right hand side gives us \(w_{i,2}\) for all \(i\); and so forth. We can rewrite this procedure in matrix form: Setting \[ \begin{aligned} \mathbf{w}_j &:= (w_{1,j},\ldots, w_{N,j}),\\ \mathbf{A}_+ &:= \begin{pmatrix} (1-2\lambda) & \lambda& 0 & & \cdots & 0 \\ \lambda & (1-2\lambda) & \lambda& 0 & \cdots & 0 \\ 0 & \lambda & (1-2\lambda) & \lambda& \cdots & 0 \\ 0 & \ddots & \ddots & \ddots & \ddots & \lambda \\ 0 & \cdots & & 0 & \lambda & (1-2\lambda) \end{pmatrix} \quad \text{(an $N \times N$ matrix)}, \end{aligned} \] we obtain the relation \[ \mathbf{w}_{j+1} = \mathbf{A}_+ \mathbf{w}_j . \] Thus, starting with the initial value \(\mathbf{w}_0\), the approximation can be obtained by an iterated matrix multiplication. This approximation method is called the Forward Difference method.
The Forward Difference method is very simple to apply, but it has a major disadvantage: it becomes unstable if the the step size \(k\) is not chosen very small. This phenomenon is closely related to the one we saw for stiff equations in Section 10.9. We will skip a more in-depth analysis here; see (Burden and Faires 2010, sec. 12.2).
We do, however, want to present a solution to the stability problem here, which is similar to the one found for stiff equations: we use an “implicit method” for approximation, the Backward Difference method. To that end, instead of the forward difference formula Eq. 12.9, we use the backward difference formula
\[ \frac{\partial u}{\partial t} (x_i,t_j) = \frac{ u(x_{i},t_{j}) - u(x_i,t_{j-1}) }{k} + O(k). \tag{12.11}\] Leaving all other construction steps the same, we arrive at another method of order \(O(h^2+k)\) where the equation system Eq. 12.10 is now replaced by
\[ w_{i,j-1} = (1+2\lambda) w_{i,j} - \lambda (w_{i+1,j}+w_{i-1,j}). \tag{12.12}\]
Again, we rewrite this in matrix form: with \[ \begin{aligned} \mathbf{w}_j &:= (w_{1,j},\ldots, w_{N,j}),\\ \mathbf{A}_- &:= \begin{pmatrix} (1+2\lambda) & -\lambda& 0 & & \cdots & 0 \\ -\lambda & (1+2\lambda) & -\lambda& 0 & \cdots & 0 \\ 0 & -\lambda & (1+2\lambda) & -\lambda& \cdots & 0 \\ 0 & \ddots & \ddots & \ddots & \ddots & -\lambda\\ 0 & \cdots & & 0 & -\lambda & (1+2\lambda) , \end{pmatrix}, \end{aligned} \] we can rewrite Eq. 12.12 as \[ \mathbf{w}_{j-1} = \mathbf{A}_- \mathbf{w}_j . \] This does no longer explicitly give \(\mathbf{w}_j\) from \(\mathbf{w}_{j-1}\). However, this is only a matter of matrix inversion (or, equivalently, solving linear equation systems) as we clearly have \[ \mathbf{w}_{j} = \mathbf{A}_-^{-1} \mathbf{w}_{j-1} . \] Starting from \(\mathbf{w}_0\), this again allows us to compute all approximation values by iterative matrix multiplication (or iteratively solving linear equation systems). Since the matrix \(\mathbf{A}\) is sparse (tridiagonal), this can be done very efficiently. Thus the Backward Difference method is only slightly more complex than the Forward Difference method. It does, however, not suffer from stability problems.
We can reach a unified view of the two schemes by considering the \(N\times N\) tridiagonal matrix \[ B=\begin{pmatrix} 2 & -1 & 0 & & \cdots & 0 \\ -1 & 2 & -1& 0 & \cdots & 0 \\ 0 & -1 & 2 & -1& \cdots & 0 \\ 0 & \ddots & \ddots & \ddots & \ddots & -1\\ 0 & \cdots & & 0 & -1 & 2 \end{pmatrix} \] Applied to a vector \(y\in\mathbb{R}^N\), it gives \[ -2y_1+y_2, \dots, y_{i+1}-2y_i+y_{i+1},\dots, y_{N-1}-2y_N \] Apart from a factor of \(1/h^2\), this is exactly the symmetric difference formula for the second derivative, applied the sequence of values \(0,y_1,y_2,\dots,y_N,0\).
Consider Euler’s method, with step length \(k\): \[ \mathbf{w}_{j+1}=\mathbf{w}_j+k(\text{derivative}) \] In our problem, the time derivative is \(\alpha^2\) times the second space derivative, which can be calculated using the matrix \(B\). Once the scale factors are taken into account, Euler’s method leads us to \[ \mathbf{w}_{j+1}=\mathbf{w}_j+\lambda B\mathbf{w}_j=(I+\lambda B)\mathbf{w}_j=A_+\mathbf{w}_j \] which is exactly the forward difference method above. The backward difference method is given by \[ \mathbf{w}_{j+1}=\mathbf{w}_j+\lambda B\mathbf{w}_{j+1} \] in which we think of the one-sided difference quotient as an approximation for the derivative at the right-hand endpoint, not the left-hand endpoint. This rearranges to \[ \mathbf{w}_j=(I-\lambda B)\mathbf{w}_{j+1}=A_-\mathbf{w}_{j+1}, \] which is exactly the backward difference method above. As a single-step method for ODEs, this is known as the backward Euler or implicit Euler method.
Finally, analogously to the implicit trapezoidal method, we can use the average of the steps from the forward and backward difference method to give the Crank-Nicolson method. \[ \mathbf{w}_{j+1}=\mathbf{w}_j+\frac{1}{2}\left(\lambda B\mathbf{w}_j+\lambda B\mathbf{w}_{j+1}\right) \] Rearranging this gives us \[ (I-\lambda B/2)\mathbf{w}_{j+1}=(I+\lambda B/2)\mathbf{w}_j \] Like the backward difference method, this gives us a tridiagonal system of equations to solve to get from \(\mathbf{w}_j\) to \(\mathbf{w}_{j+1}\), and can be used to find \(\mathbf{w}_j\) for any \(j\).
It turns out that the error terms for the forward and backward difference methods have the form \(Ck+O(k^2)\) and \(-Ck+O(k^2)\). Taking the average cancels the \(\pm Ck\) terms and leaves an error of order \(O(k^2)\); in combination with the space variable, we have \(O(h^2)+O(k^2)\) for the whole method, as compared with \(O(h^2)+O(k)\) for the forward and backward difference methods. Like the implicit trapezoidal method, the Crank-Nicolson method is absolutely stable.
In the form presented here, the equation is actually explicitly solvable in terms of a Fourier series. One might ask therefore why numerical approximation methods are necessary. The answer is that the boundary conditions \(u(0, t) = u(L, t) = 0\) chosen here are rather simplistic, which was done to simplify the discussion. But the numerical method can be generalized to more intricate boundary conditions where an explicit solution is no longer feasible.↩︎