2023-10-18 Krylov and preconditioning#
Last time#
- Classical iterative methods
- Concept of preconditioning
Today#
- Krylov methods (focus on GMRES)
- PETSc experiments
- Simple preconditioners
```julia
using Plots
default(linewidth=3)
using LinearAlgebra
using SparseArrays

# Plot the sparsity pattern of A with a symmetric color scale.
function my_spy(A)
    cmax = norm(vec(A), Inf)
    s = max(1, ceil(120 / size(A, 1)))   # marker size shrinks as the matrix grows
    spy(A, marker=(:square, s), c=:diverging_rainbow_bgymr_45_85_c67_n256, clims=(-cmax, cmax))
end
```
```julia
# Assemble the 5-point advection-diffusion operator on an n-by-n interior grid
# with homogeneous Dirichlet boundary conditions.
function advdiff_matrix(n; kappa=1, wind=[0, 0])
    h = 2 / (n + 1)
    rows = Vector{Int64}()
    cols = Vector{Int64}()
    vals = Vector{Float64}()
    idx((i, j),) = (i-1)*n + j                       # lexicographic ordering
    in_domain((i, j),) = 1 <= i <= n && 1 <= j <= n
    # Stencil weights for neighbors ordered (i-1,j), (i,j-1), (i,j), (i+1,j), (i,j+1)
    stencil_advect = [-wind[1], -wind[2], 0, wind[1], wind[2]] / h
    stencil_diffuse = [-1, -1, 4, -1, -1] * kappa / h^2
    stencil = stencil_advect + stencil_diffuse
    for i in 1:n
        for j in 1:n
            neighbors = [(i-1, j), (i, j-1), (i, j), (i+1, j), (i, j+1)]
            mask = in_domain.(neighbors)             # drop neighbors outside the domain (Dirichlet)
            append!(rows, idx.(repeat([(i,j)], 5))[mask])
            append!(cols, idx.(neighbors)[mask])
            append!(vals, stencil[mask])
        end
    end
    sparse(rows, cols, vals)
end
```
advdiff_matrix (generic function with 1 method)
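A quick look at the assembled operator (a small example using the functions defined above):

```julia
# Assemble a small advection-diffusion matrix and inspect its sparsity pattern.
A = advdiff_matrix(10, kappa=1, wind=[2, 1])
my_spy(A)
```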
Krylov subspaces#
All matrix iterations work with approximations in a Krylov subspace, which has the form
\[ K_n = \begin{bmatrix} b & Ab & A^2 b & \dotsb & A^{n-1} b \end{bmatrix} . \]
This matrix is horribly ill-conditioned and cannot stably be computed as written. Instead, we seek an orthogonal basis \(Q_n\) that spans the same space as \(K_n\). We could write this as a factorization
\[ K_n = Q_n R_n \]
where the first column is \(q_0 = b / \lVert b \rVert\). The \(R_n\) is unnecessary and hopelessly ill-conditioned, so a slightly different procedure is used.
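To see the ill-conditioning concretely, here is a minimal sketch (using `advdiff_matrix` from above) that builds \(K_n\) column by column: each column is one more application of \(A\), so the columns all converge toward the dominant eigenvector and the conditioning degrades rapidly.

```julia
# Build K_n = [b  Ab  A^2b ...] explicitly and measure its conditioning.
A = advdiff_matrix(10)              # 100x100 diffusion operator
m = size(A, 1)
n = 10
K = zeros(m, n)
K[:, 1] = randn(m)                  # the vector b
for k in 2:n
    K[:, k] = A * K[:, k-1]
end
sv = svdvals(K)
@show sv[1] / sv[end]               # condition number of K_n; huge already for n = 10
Q = Matrix(qr(K).Q)                 # a well-conditioned basis for the same subspace
```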
Arnoldi iteration#
The Arnoldi iteration applies orthogonal similarity transformations to reduce \(A\) to Hessenberg form, starting from a vector \(q_0 = b\),
\[ A = Q H Q^* . \]
Let's multiply on the right by \(Q\) and examine the first \(n\) columns,
\[ A Q_n = Q_{n+1} H_n \]
where \(H_n\) is an \((n+1) \times n\) Hessenberg matrix.
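Below is a minimal Arnoldi sketch using modified Gram-Schmidt (an illustration of the idea, not the variant production libraries use, and with no breakdown handling; the name `arnoldi` is ours):

```julia
# Arnoldi iteration: returns Q (m x (n+1)) with orthonormal columns and
# H ((n+1) x n) upper Hessenberg such that A * Q[:, 1:n] = Q * H.
function arnoldi(A, b, n)
    m = size(A, 1)
    Q = zeros(m, n + 1)
    H = zeros(n + 1, n)
    Q[:, 1] = b / norm(b)
    for j in 1:n
        v = A * Q[:, j]
        for i in 1:j                 # modified Gram-Schmidt against previous columns
            H[i, j] = Q[:, i]' * v
            v -= H[i, j] * Q[:, i]
        end
        H[j+1, j] = norm(v)          # no breakdown handling in this sketch
        Q[:, j+1] = v / H[j+1, j]
    end
    Q, H
end

A = advdiff_matrix(10)
b = randn(size(A, 1))
Q, H = arnoldi(A, b, 20)
@show norm(A * Q[:, 1:20] - Q * H)   # ~ machine roundoff
```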
Aside: Conditioning of Arnoldi process#
The Arnoldi process is well-conditioned because \(Q\) is orthogonal and
\[ \lVert A x \rVert \le \lVert A \rVert \]
for all \(x\) of norm 1. It must also be true for any \(x = Q_n y\) where \(\lVert y \rVert = 1\), thus
\[ \lVert H_n \rVert = \lVert Q_{n+1}^* A Q_n \rVert \le \lVert A \rVert . \]
GMRES#
GMRES (Generalized Minimum Residual) minimizes the residual over the Krylov subspace,
\[ \min_{x \in \operatorname{span} Q_n} \lVert A x - b \rVert = \min_{y \in \mathbb{R}^n} \big\lVert Q_{n+1} H_n y - b \big\rVert = \min_{y} \big\lVert H_n y - \lVert b \rVert e_0 \big\rVert \]
using \(A Q_n = Q_{n+1} H_n\), \(b = \lVert b \rVert q_0\), and the orthogonality of \(Q_{n+1}\). The result is a small \((n+1) \times n\) least-squares problem for the coefficients \(y\).
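On top of the `arnoldi` sketch above, an unrestarted GMRES is just this least-squares solve (illustrative only; `gmres_naive` is our name, and \(e_0\) becomes index 1 in Julia):

```julia
# Naive unrestarted GMRES: minimize ||H_n y - ||b|| e_0|| and form x_n = Q_n y.
function gmres_naive(A, b, n)
    Q, H = arnoldi(A, b, n)
    rhs = zeros(n + 1)
    rhs[1] = norm(b)                 # ||b|| e_0 (Julia indexes from 1)
    y = H \ rhs                      # small (n+1) x n least-squares problem
    Q[:, 1:n] * y
end

A = advdiff_matrix(10)
b = randn(size(A, 1))
x = gmres_naive(A, b, 40)
@show norm(A * x - b) / norm(b)      # relative residual after 40 iterations
```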
Notes#
- The solution \(x_n\) constructed by GMRES at iteration \(n\) is not explicitly available. If a solution is needed, it must be constructed by solving the \((n+1)\times n\) least-squares problem and forming the solution as a linear combination of the \(n\) columns of \(Q_n\). The leading cost is \(2mn\) where \(m \gg n\).
- The residual vector \(r_n = A x_n - b\) is not explicitly available in GMRES either. To compute it, first build the solution \(x_n\) as above.
- GMRES minimizes the 2-norm of the residual, \(\lVert r_n \rVert\), which is equivalent to minimizing the \(A^* A\) norm of the error, \(\lVert x_n - x_* \rVert_{A^* A}\) (see the expansion below).
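To see the norm equivalence in the last note, write \(b = A x_*\) and expand:
\[ \lVert r_n \rVert^2 = \lVert A (x_n - x_*) \rVert^2 = (x_n - x_*)^* A^* A (x_n - x_*) = \lVert x_n - x_* \rVert_{A^* A}^2 . \]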
More notes on GMRES#
- GMRES needs to store the full \(Q_n\), which is unaffordable for large \(n\) (many iterations). The standard solution is to choose a "restart" \(k\), discard \(Q_n\), and start over with initial guess \(x_k\) after each \(k\) iterations. This algorithm is called GMRES(k). PETSc's default solver is GMRES(30) and the restart can be controlled using the run-time option `-ksp_gmres_restart`.
- Most implementations of GMRES use classical Gram-Schmidt because it is much faster in parallel (one reduction per iteration instead of \(n\) reductions per iteration). The PETSc option `-ksp_gmres_modifiedgramschmidt` can be used when you suspect that classical Gram-Schmidt may be causing instability.
- There is a very similar (and older) algorithm called GCR that maintains \(x_n\) and \(r_n\). This is useful, for example, if a convergence tolerance needs to inspect individual entries. GCR requires \(2n\) vectors instead of \(n\) vectors, and can tolerate a nonlinear preconditioner. FGMRES is a newer algorithm with similar properties to GCR.
Experiments in PETSc#
- PETSc = Portable Extensible Toolkit for Scientific computing
  - `./configure`, `make`
  - Depends on BLAS and LAPACK (default system or package manager)
    - or `./configure --download-f2cblaslapack --download-blis`
  - Depends on MPI for parallelism (package manager)
    - or `./configure --download-mpich` (or `--download-openmpi`)
- `docker run -it --rm jedbrown/petsc`
- Lots of examples (mostly C and Fortran, some Python)
- Experimental bindings in Rust, Julia
- We'll use `src/snes/tutorials/ex15.c`
  - "p-Bratu": combines p-Laplacian with Bratu nonlinearity
    \[ -\nabla\cdot\big(\kappa(\nabla u) \nabla u\big) - \lambda e^u = f \]
  - Newton solver with Krylov on each iteration
Simple preconditioners#
Jacobi `-pc_type jacobi`#
\[ P^{-1} = D^{-1} \]
where \(D\) is the diagonal of \(A\).
Gauss-Seidel `-pc_type sor`#
\[ P^{-1} = (L+D)^{-1} \]
where \(L\) is the (strictly) lower triangular part of \(A\). The upper triangular part may be used instead, or a symmetric form,
\[ P^{-1} = (D+U)^{-1} D (D+L)^{-1} \]
where \(U\) is the (strictly) upper triangular part of \(A\).
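As a concrete (if crude) illustration, here is a preconditioned Richardson iteration \(x \leftarrow x + P^{-1}(b - A x)\) with Jacobi and Gauss-Seidel preconditioners, a sketch using `advdiff_matrix` from above (`richardson` is our name):

```julia
# Preconditioned Richardson iteration: x <- x + P^{-1} (b - A x).
# apply_Pinv applies the preconditioner inverse to a residual vector.
function richardson(A, b, apply_Pinv; iters=100)
    x = zero(b)
    for _ in 1:iters
        x += apply_Pinv(b - A * x)
    end
    norm(b - A * x) / norm(b)        # relative residual after iters steps
end

A = advdiff_matrix(20)
b = randn(size(A, 1))
D = Diagonal(A)                      # Jacobi: P = D
LD = LowerTriangular(A)              # Gauss-Seidel: P = L + D (triangular solve)
@show richardson(A, b, r -> D \ r)
@show richardson(A, b, r -> LD \ r)
```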
Over-relaxation#
`-pc_sor_omega 1.` is the default (Gauss-Seidel).
Run p-Laplacian example#
```console
$ cd src/snes/tutorials
$ make ex15
$ ./ex15 -da_refine 2 -dm_view
$ ./ex15 -ksp_monitor -pc_type jacobi
$ ./ex15 -snes_view
```
Experiments#
- How does iteration count vary under grid refinement?
- How sensitive is it to parameters?
  - p-Laplacian: `-p` \(> 1\) and `-epsilon` \(> 0\)
  - Bratu: `-lambda` \(< 6.8\)
- How sensitive is it to `-ksp_gmres_restart`?
- `-ksp_monitor_true_residual`
- `-ksp_view_eigenvalues`
Symmetric problems#
Lanczos iteration: like GMRES for symmetric problems#
If \(A\) is symmetric, then \(H = Q^T A Q\) is also symmetric. Since \(H\) is Hessenberg, this means \(H\) is tridiagonal. Instead of storing \(Q_n\), it is sufficient to store only the last two columns since the iteration satisfies a 3-term recurrence. The analog of GMRES for the symmetric case is called MINRES and is also useful for solving symmetric indefinite problems.
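A quick numerical check of the tridiagonal claim, reusing the `arnoldi` sketch from earlier (the wind defaults to zero, so the operator is symmetric):

```julia
A = advdiff_matrix(10)                   # pure diffusion: A is symmetric
Q, H = arnoldi(A, randn(size(A, 1)), 8)
@show norm(triu(H, 2))                   # everything above the first superdiagonal is ~ 0
```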
Conjugate Gradients: changing the norm#
Instead of minimizing the \(A^T A\) norm of the error, the Conjugate Gradient method minimizes the \(A\) norm of the error. For \(A\) to induce a norm, it must be symmetric positive definite. Jonathan Shewchuk's guide to CG, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, is an excellent resource.
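For completeness, a textbook unpreconditioned CG sketch (following the standard algorithm; the name `cg` is ours):

```julia
# Conjugate gradient for symmetric positive definite A (textbook sketch).
function cg(A, b; tol=1e-8, maxiter=500)
    x = zero(b)
    r = copy(b)                      # residual b - A x for x = 0
    p = copy(r)                      # initial search direction
    rho = r' * r
    for _ in 1:maxiter
        Ap = A * p
        alpha = rho / (p' * Ap)      # step length minimizing the A-norm error along p
        x += alpha * p
        r -= alpha * Ap
        rho_new = r' * r
        sqrt(rho_new) <= tol * norm(b) && break
        p = r + (rho_new / rho) * p  # new direction, A-conjugate to the previous ones
        rho = rho_new
    end
    x
end

A = advdiff_matrix(20)               # symmetric positive definite (pure diffusion)
b = randn(size(A, 1))
x = cg(A, b)
@show norm(A * x - b) / norm(b)
```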