# 2023-10-18 Krylov and preconditioning#

## Last time#

Classical iterative methods

Concept of preconditioning

## Today#

Krylov methods (focus on GMRES)

PETSc experiments

Simple preconditioners

```
using Plots
default(linewidth=3)
using LinearAlgebra
using SparseArrays
function my_spy(A)
cmax = norm(vec(A), Inf)
s = max(1, ceil(120 / size(A, 1)))
spy(A, marker=(:square, s), c=:diverging_rainbow_bgymr_45_85_c67_n256, clims=(-cmax, cmax))
end
function advdiff_matrix(n; kappa=1, wind=[0, 0])
h = 2 / (n + 1)
rows = Vector{Int64}()
cols = Vector{Int64}()
vals = Vector{Float64}()
idx((i, j),) = (i-1)*n + j
in_domain((i, j),) = 1 <= i <= n && 1 <= j <= n
stencil_advect = [-wind[1], -wind[2], 0, wind[1], wind[2]] / h
stencil_diffuse = [-1, -1, 4, -1, -1] * kappa / h^2
stencil = stencil_advect + stencil_diffuse
for i in 1:n
for j in 1:n
neighbors = [(i-1, j), (i, j-1), (i, j), (i+1, j), (i, j+1)]
mask = in_domain.(neighbors)
append!(rows, idx.(repeat([(i,j)], 5))[mask])
append!(cols, idx.(neighbors)[mask])
append!(vals, stencil[mask])
end
end
sparse(rows, cols, vals)
end
```

```
advdiff_matrix (generic function with 1 method)
```

# Krylov subspaces#

All matrix iterations work with approximations in a *Krylov subspace*, which has the form

This matrix is horribly ill-conditioned and cannot stably be computed as written. Instead, we seek an orthogonal basis \(Q_n\) that spans the same space as \(K_n\). We could write this as a factorization

where the first column \(q_0 = b / \lVert b \rVert\). The \(R_n\) is unnecessary and hopelessly ill-conditioned, so a slightly different procedure is used.

# Arnoldi iteration#

The Arnoldi iteration applies orthogonal similarity transformations to reduce \(A\) to Hessenberg form, starting from a vector \(q_0 = b\),

Let’s multiply on the right by \(Q\) and examine the first \(n\) columns,

where \(H_n\) is an \((n+1) \times n\) Hessenberg matrix.

# Aside: Conditioning of Arnoldi process#

The Arnoldi process is well-conditioned because \(Q\) is orthogonal and

for all \(x\) of norm 1. It must also be true for any \(x = Q_n y\) where \(\lVert y\rVert = 1\), thus

# GMRES#

GMRES (Generalized Minimum Residual) minimizes

## Notes#

The solution \(x_n\) constructed by GMRES at iteration \(n\) is not explicitly available. If a solution is needed, it must be constructed by solving the \((n+1)\times n\) least squares problem and forming the solution as a linear combination of the \(n\) vectors \(Q_n\). The leading cost is \(2mn\) where \(m \gg n\).

The residual vector \(r_n = A x_n - b\) is not explicitly available in GMRES. To compute it, first build the solution \(x_n\) as above.

GMRES minimizes the 2-norm of the residual \(\lVert r_n \rVert\) which is equivalent to the \(A^* A\) norm of the error \(\lVert x_n - x_* \rVert_{A^* A}\).

# More notes on GMRES#

GMRES needs to store the full \(Q_n\), which is unaffordable for large \(n\) (many iterations). The standard solution is to choose a “restart” \(k\) and to discard \(Q_n\) and start over with an initial guess \(x_k\) after each \(k\) iterations. This algorithm is called GMRES(k). PETSc’s default solver is GMRES(30) and the restart can be controlled using the run-time option

`-ksp_gmres_restart`

.Most implementations of GMRES use classical Gram-Schmidt because it is much faster in parallel (one reduction per iteration instead of \(n\) reductions per iteration). The PETSc option

`-ksp_gmres_modifiedgramschmidt`

can be used when you suspect that classical Gram-Schmidt may be causing instability.There is a very similar (and older) algorithm called GCR that maintains \(x_n\) and \(r_n\). This is useful, for example, if a convergence tolerance needs to inspect individual entries. GCR requires \(2n\) vectors instead of \(n\) vectors, and can tolerate a nonlinear preconditioner. FGMRES is a newer algorithm with similar properties to GCR.

# Experiments in PETSc#

PETSc = Portable Extensible Toolkit for Scientific computing

`./configure`

,`make`

Depends on BLAS and LAPACK (default system or package manager)

or

`./configure --download-f2cblaslapack --download-blis`

Depends on MPI for parallelism (package manager)

or

`./configure --download-mpich`

(or`--download-openmpi`

)

`docker run -it --rm jedbrown/petsc`

Lots of examples (mostly C and Fortran, some Python)

Experimental bindings in Rust, Julia

We’ll use

`src/snes/tutorials/ex15.c`

“p-Bratu”: combines p-Laplacian with Bratu nonlinearity

\[ -\nabla\cdot\big(\kappa(\nabla u) \nabla u\big) - \lambda e^u = f \]Newton solver with Krylov on each iteration

# Simple preconditioners#

## Jacobi `-pc_type jacobi`

#

where \(D\) is the diagonal of \(A\).

## Gauss-Seidel `-pc_type sor`

#

where \(L\) is the (strictly) lower triangular part of \(A\). The upper triangular part may be used instead, or a symmetric form

### Over-relaxation#

`-pc_sor_omega 1.`

is default (Gauss-Seidel)

## Run p-Laplacian example#

```
$ cd src/snes/tutorials
$ make ex15
$ ./ex15 -da_refine 2 -dm_view
$ ./ex15 -ksp_monitor -pc_type jacobi
$ ./ex15 -snes_view
```

## Experiments#

How does iteration count vary under grid refinement?

How sensitive is it to parameters

p-Laplacian

`-p`

\(> 1\) and`-epsilon`

\(> 0\)Bratu

`-lambda`

\(< 6.8\)

How sensitive to

`-ksp_gmres_restart`

?`-ksp_monitor_true_residual`

`-ksp_view_eigenvalues`

# Symmetric problems#

## Lanczos iteration: like GMRES for symmetric problems#

If \(A\) is symmetric, then \(H = Q^T A Q\) is also symmetric. Since \(H\) is Hessenberg, this means \(H\) is tridiagonal. Instead of storing \(Q_n\), it is sufficient to store only the last two columns since the iteration satisfies a 3-term recurrence. The analog of GMRES for the symmetric case is called MINRES and is also useful for solving symmetric indefinite problems.

## Conjugate Gradients: changing the norm#

Instead of minimizing the \(A^T A\) norm of the error, the Conjugate Gradient method minimizes the \(A\) norm of the error. For \(A\) to induce a norm, it must be symmetric positive definite. Jeremy Shewchuck’s guide to CG is an excellent resource.