2022-10-24 Galerkin methods#

Last time#

  • Algebraic multigrid

  • Smoothed aggregation

  • Multigrid in PETSc

  • Projects

Today#

  • Questions about projects

  • Galerkin methods

    • Weak forms

    • \(L^2\) Projection

    • Galerkin “optimality”

using Plots
default(linewidth=3)
using LinearAlgebra
using SparseArrays

function my_spy(A)
    cmax = norm(vec(A), Inf)
    s = max(1, ceil(120 / size(A, 1)))
    spy(A, marker=(:square, s), c=:diverging_rainbow_bgymr_45_85_c67_n256, clims=(-cmax, cmax))
end
my_spy (generic function with 1 method)

Searching for projects#

GitHub codesearch#

  • Check Insights -> Contributors

    • Sustained versus short-term activity

    • Number of stakeholders

    • git shortlog -se --since=2016 | sort -n

  • Community signals

    • Code of Conduct

    • Contributing guidelines, pull request templates

    • Active review and mentoring in pull requests

  • Continuous integration

    • Actions tab, also report in pull requests

    • Pipelines for repositories on GitLab

Journals#

Journal of Open Source Software#

  • Papers on mature software with many users

  • Immature research software with a few users, but best practices and intent to grow

Geoscientific Model Development#

  • Blends software and methods work

Integration by parts#

One dimension#

it’s the product rule backwards

(42)#\[\begin{align} \int_a^b d(uv) &= \int_a^b u dv + \int_a^b v du \\ (uv)_a^b &= \int_a^b u dv + \int_a^b v du \\ \int_a^b u dv &= (uv)_a^b - \int_a^b v du \end{align}\]

you can move the derivative to the other term; it’ll cost you a minus sign and a boundary term

Multiple dimensions#

(43)#\[\begin{align} \int_\Omega v \nabla\cdot \mathbf f = -\int_\Omega \nabla v \cdot \mathbf f + \int_{\partial \Omega} v \mathbf f \cdot \mathbf n \end{align}\]

Strong form#

\[ -\nabla\cdot(\kappa \nabla u) = 0 \]

Weak form#

  • multiply by a test function and integrate by parts

(44)#\[\begin{align} -\int_\Omega v \nabla\cdot(\kappa \nabla u) = 0, \forall v \\ \int_\Omega \nabla v \cdot \kappa \nabla u - \int_{\partial\Omega} v \underbrace{\kappa \nabla u \cdot \mathbf n}_{\text{boundary condition}} = 0 \end{align}\]

Variational: from minimization to weak forms#

Minimization#

Find \(u\in \mathcal V\) such that

(45)#\[\begin{align} u &= \arg\min \int_\Omega \frac 1 p \lVert \nabla u \rVert^p \\ &= \arg\min \int_\Omega \frac 1 p (\nabla u \cdot \nabla u)^{p/2} \end{align}\]

Variations#

If \(f(x)\) is a minimum, then \(df = f'(x) dx = 0\) for all \(dx\).

So if \(u \in \mathcal V\) is a minimizer, then

(46)#\[\begin{align} \int_\Omega \underbrace{\lVert \nabla u \rVert^{p-2}}_{\kappa(u)} \nabla u \cdot \nabla du = 0 \end{align}\]

for all \(du \in \mathcal V\).

  • It’s common to use \(v\) in place of \(du\).

  • Our discretization will reflect this structure

\[ v_h^T F(u_h) \sim \int_\Omega \nabla v \cdot \kappa(u) \cdot \nabla u \]

Finite elements#

Problem statement#

Find \(u \in \mathcal V\) such that

(47)#\[\begin{align} \int_\Omega \nabla v \cdot \kappa \nabla u - \int_{\partial\Omega} v \underbrace{\kappa \nabla u \cdot \mathbf n}_{\text{boundary condition}} = 0, \forall v \in \mathcal V \end{align}\]

What do we need?#

  • A way to integrate

  • A space \(\mathcal V\)

  • Boundary conditions

Ciarlet (1978) definition#

A finite element is a triple \((K, P, N)\) where

  • \(K\) is a bounded domain with piecewise smooth boundary

  • \(P = \{p_j\}_1^k\) is a finite-dimensional function space over \(K\) (“prime basis”)

  • \(N = \{n_i\}_1^k\) is a basis for the dual space (“nodes”)

It is common for nodes to be pointwise evaluation

\[ \langle n_i, p_j \rangle = p_j(x_i) \]

First: choose a single element \(K = \Omega\)#

Quadrature#

Our elements will be a line segment and we’ll define quadrature on the standard line segment, \(\hat K = (-1, 1)\). A quadrature is a set of points \(q_i\) and weights \(w_i\) such that

\[ \int_{-1}^1 f(x) \approx \sum_{i=1}^n w_i f(q_i) \]

default(linewidth=4)
function quad_trapezoid(n)
    q = LinRange(-1, 1, n)
    w = 2 * ones(n) / (n - 1)
    w[[1,end]] ./= 2
    q, w
end

function plot_quad_accuracy(f, F, quad)
    ns = 2 .^ (2:7)
    exact = F(1) - F(-1)
    function err(n)
        q, w = quad(n)
        w' * f.(q) - exact
    end
    errors = [abs.(err(n)) for n in ns]
    plot(ns, errors, marker=:auto, label=quad)
    plot!(ns, ns.^(-2), xscale=:log10, yscale=:log10)    
end
plot_quad_accuracy (generic function with 1 method)
plot_quad_accuracy(exp, exp, quad_trapezoid)
plot_quad_accuracy(x -> 3*x^2, x -> x^3, quad_trapezoid)
../_images/2022-10-24-galerkin_18_0.svg

Integration via polynomial interpolation#

  • Pick some points \(x_i\)

  • Evaluate the function there \(f_i = f(x_i)\)

  • Find the polynomial that interpolates the data

  • Integrate the polynomial

What order of accuracy can we expect?#

  • What degree polynomials can be integrate exactly?

Doing better: Gauss quadrature#

Suppose a polynomial on the interval \([-1,1]\) can be written as

\[ P_n(x) q(x) + r(x) \]

where \(P_n(x)\) is the \(n\)th Legendre polnomials and both \(q(x)\) and \(r(x)\) are polynomials of maximum degree \(n-1\).

  • Why is \(\int_{-1}^1 P_n(x) q(x) = 0\)?

  • Can every polynomials of degree \(2n-1\) be written in the above form?

  • How many roots does \(P_n(x)\) have on the interval?

  • Can we choose points \(\{x_i\}\) such that the first term is 0?

If \(P_n(x_i) = 0\) for each \(x_i\), then we need only integrate \(r(x)\), which is done exactly by integrating its interpolating polynomial. How do we find these roots \(x_i\)?

Gauss-Legendre in code#

  1. Solve for the points, compute the weights

  • Use a Newton solver to find the roots. You can use the recurrence to write a recurrence for the derivatives.

  • Create a Vandermonde matrix and extract the first row of the inverse or (using more structure) the derivatives at the quadrature points.

  1. Use duality of polynomial roots and matrix eigenvalues.

  • A fascinating mathematical voyage (topic of graduate linear algebra class).

function gauss_legendre(n)
    """Gauss-Legendre integration using Golub-Welsch algorithm"""
    beta = @. .5 / sqrt(1 - (2 * (1:n-1))^(-2))
    T = diagm(-1 => beta, 1 => beta)
    D, V = eigen(T)
    w = V[1,:].^2 * 2
    q = D
    q, w
end
gauss_legendre (generic function with 1 method)
plot_quad_accuracy(exp, exp, gauss_legendre)
plot_quad_accuracy(x -> 11*x^10, x -> x^11, gauss_legendre)
../_images/2022-10-24-galerkin_23_0.svg

Legendre polynomials#

function vander_legendre(x, k=nothing)
    if isnothing(k)
        k = length(x) # Square by default
    end
    m = length(x)
    Q = ones(m, k)
    Q[:, 2] = x
    for n in 1:k-2
        Q[:, n+2] = ((2*n + 1) * x .* Q[:, n+1] - n * Q[:, n]) / (n + 1)
    end
    Q
end
vander_legendre (generic function with 2 methods)
x = LinRange(-1, 1, 50)
P = vander_legendre(x, 6)
plot(x, P)
../_images/2022-10-24-galerkin_26_0.svg

Prime to nodal basis#

A nodal basis \(\{ \phi_j(x) \}\) is one that satisfies \( n_i(\phi_j) = \delta_{ij} . \)

We write \(\phi_j\) in the prime basis by solving with the generalized Vandermonde matrix \(V_{ij} = \langle n_i, p_j \rangle\),

\[ \phi_j(x) = \sum_k p_k(x) (V^{-1})_{k,j} . \]

k = 10
xn = LinRange(-1, 1, k)
V = vander_legendre(xn)
xx = LinRange(-1, 1, 50)
Pxx = vander_legendre(xx, k)
Pn = Pxx / V
plot(xx, Pn)
scatter!(xn, one.(xn))
../_images/2022-10-24-galerkin_28_0.svg
using FastGaussQuadrature
xn, _ = gausslobatto(k)
V = vander_legendre(xn)
Pn = Pxx / V
plot(xx, Pn)
scatter!(xn, one.(xn))
../_images/2022-10-24-galerkin_29_0.svg
function vander_legendre_deriv(x, k=nothing)
    if isnothing(k)
        k = length(x) # Square by default
    end
    m = length(x)
    Q = ones(m, k)
    dQ = zeros(m, k)
    Q[:, 2] = x
    dQ[:, 2] .= 1
    for n in 1:k-2
        Q[:, n+2] = ((2*n + 1) * x .* Q[:, n+1] - n * Q[:, n]) / (n + 1)
        dQ[:, n+2] = (2*n + 1) * Q[:,n+1] + dQ[:,n]
    end
    Q, dQ
end
vander_legendre_deriv (generic function with 2 methods)

Galerkin method: \(L^2\) projection#

A nice test problem that doesn’t require derivatives or boundary conditions: Find \(u \in \mathcal V_h\) such that

\[ \int_{-1}^1 v(x) \big[ u(x) - f(x) \big] = 0, \quad \forall v \in \mathcal V_h\]

function febasis(P, Q)
    x, _ = gausslobatto(P)
    q, w = gausslegendre(Q)
    Pk, _ = vander_legendre_deriv(x)
    Bp, Dp = vander_legendre_deriv(q, P)
    B = Bp / Pk
    D = Dp / Pk
    x, q, w, B, D
end

function L2_galerkin(P, Q, f)
    x, q, w, B, _ = febasis(P, Q)
    A = B' * diagm(w) * B
    rhs = B' * diagm(w) * f.(q)
    u = A \ rhs
    x, u
end
L2_galerkin (generic function with 1 method)
f(x) = tanh(5*x)
P, Q = 8, 8
x, u = L2_galerkin(P, Q, f)
error = u - f.(x)
plot(x, u, marker=:auto, legend=:bottomright)
plot!(f, title="Galerkin error: $(norm(error))")
../_images/2022-10-24-galerkin_33_0.svg

Embed the solution in the continuous space#

In thel last figure, the Galerkin solution was represented by a piecewise linear function, emphasizing its nodal values. Let’s make a nicer plot by multiplying by the corresponding nodal basis function.

P, Q = 10, 12
x, u = L2_galerkin(P, Q, f)
error = u - f.(x)
_, xx, _, BB, _ = febasis(P, 100)
plot(f, label="exact", title="Galerkin error: $(norm(error))",
  xlim=(-1, 1), legend=:bottomright)
plot!(xx, BB * u, label="Galerkin solution")
scatter!(x, u, marker=:circle, color=:black, label="Nodal values")
../_images/2022-10-24-galerkin_35_0.svg
ns = 2 .^ (2:8)
function L2_error(n)
    x, u = L2_galerkin(n, n, f)
    norm(u - f.(x))
end
plot(ns, L2_error.(ns), marker=:circle, yscale=:log10,
  title="Convergence", xlabel="number of points", ylabel="error")
../_images/2022-10-24-galerkin_36_0.svg

Galerkin error estimates#

We introduce the notation

\[ a(v, u) = \int_\Omega \nabla v(x) \cdot \nabla u(x) \]
and note that \(a\) is

  • bilinear (linear in each of its arguments)

  • symmetric: \(a(u, v) = a(v,u)\)

  • positive definite: \(a(u, u) > 0\) when \(u \ne 0\) thus defines an inner product on the function space \(V\).

We also introduce the \(L^2\) inner product

\[ \langle u, v \rangle = \int_\Omega u(x) v(x) \]
so that our continuous weak form is to find \(u \in V\) such that
\[ a(v, u) = \langle v, f \rangle, \quad \forall v\in V. \]
Our Galerkin discretization is to find \(u_h \in V_h \subset V\) such that
\[ a(v_h, u_h) = \langle v_h, f \rangle, \quad \forall v_h \in V_h . \]
Since \(V_h \subset V\), we can subtract these two, yielding
\[ a(v_h, u_h - u) = 0, \quad \forall v_h \in V_h .\]
This says that the error in the discrete solution \(u_h - u\) is \(a\)-orthogonal to all test functions \(v_h\).

Galerkin optimality via energy norms#

We can also define the “energy norm” or \(a\)-norm,

\[ \lVert u \rVert_a = \sqrt{a(u,u)} . \]
This norm satisfies the Cauchy-Schwarz inequality,
\[ \lvert a(u,v) \rvert \le \lVert u \rVert_a \lVert v \rVert_a . \]
Now,

(48)#\[\begin{align} \lVert u_h - u \rVert_a^2 &= a(u_h - u, u_h - u) \\ &= a(u_h - v_h, u_h - u) + a(v_h - u, u_h - u) \\ &= a(v_h - u, u_h - u) \\ &\le \lVert v_h - u \rVert_a \lVert u_h - u \rVert_a . \end{align}\]

In other words,

\[\lVert u_h - u \rVert_a \le \lVert v_h - u \rVert_a, \quad \forall v_h \in V_h .\]
So the solution \(u_h\) computed by the Galerkin discretization is optimal over the subspace \(V_h\) as measured in the \(a\)-norm.

Observations#

  • The Galerkin method computes the exact solution any time it resides in the subspace \(V_h\).

  • The Galerkin method is automatically symmetric any time the weak form is symmetric.

  • The Galerkin method can be spectrally accurate, similar to the Chebyshev finite difference methods.

  • For a nonlinear problem, discretization and differentiation will commute.