2021-09-20 Nonlinear solvers
Last time
Upwinding for robustness to (cell) Péclet number and oscillations
Effective use of sparse/structured matrices
Today
Intro to rootfinding and Newton’s method
Bratu and p-Laplacian problems
using Plots
using LinearAlgebra
using SparseArrays
function vander(x, k=nothing)
    # Vandermonde matrix: columns are 1, x, x^2, ..., x^(k-1)
    if k === nothing
        k = length(x)
    end
    V = ones(length(x), k)
    for j in 2:k
        V[:, j] = V[:, j-1] .* x
    end
    V
end
function fdstencil(source, target, k)
    "kth derivative stencil from source to target"
    x = source .- target
    V = vander(x)
    rhs = zero(x)'
    rhs[k+1] = factorial(k)
    rhs / V
end
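As a quick sanity check (our own example, not part of the original notes), on three uniformly spaced points the second-derivative stencil should reduce to the classical \([1, -2, 1] / h^2\):
h = 0.1
fdstencil([-h, 0, h], 0, 2) # ≈ [100.0 -200.0 100.0] = [1, -2, 1] / h^2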
function advdiff_sparse(n, kappa, wind, forcing)
    x = LinRange(-1, 1, n)
    xstag = (x[1:end-1] + x[2:end]) / 2 # staggered (cell-interface) points
    rhs = forcing.(x)
    kappa_stag = kappa.(xstag)
    rows = [1, n]
    cols = [1, n]
    vals = [1., 1.] # identity rows for the Dirichlet boundary conditions
    rhs[[1, n]] .= 0 # boundary condition
    for i in 2:n-1
        flux_L = -kappa_stag[i-1] * fdstencil(x[i-1:i], xstag[i-1], 1) +
            wind * (wind > 0 ? [1 0] : [0 1]) # upwinded advective flux
        flux_R = -kappa_stag[i] * fdstencil(x[i:i+1], xstag[i], 1) +
            wind * (wind > 0 ? [1 0] : [0 1])
        weights = fdstencil(xstag[i-1:i], x[i], 1) # divergence of the flux
        append!(rows, [i, i, i])
        append!(cols, i-1:i+1)
        append!(vals, weights[1] * [flux_L..., 0] + weights[2] * [0, flux_R...])
    end
    L = sparse(rows, cols, vals)
    x, L, rhs
end
advdiff_sparse (generic function with 1 method)
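A usage sketch (our own example): constant diffusion \(\kappa = 1\), wind \(w = 10\), unit forcing, and homogeneous Dirichlet boundary conditions.
x, L, rhs = advdiff_sparse(50, x -> 1.0, 10, x -> 1.0)
plot(x, L \ rhs, label="u")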
Nonlinear problems and rootfinding
We want to solve problems like the Bratu problem
\[ -u'' - \lambda e^u = 0, \qquad u(-1) = 0, \quad u(1) = 1, \]
in which the unknown \(u\) appears nonlinearly.
For linear problems, we found \(u\) such that \(L u = b\) using linear algebra.
For nonlinear problems, we’ll need to find \(u\) such that \(F(u) = 0\).
An algorithm: Bisection
Bisection is a rootfinding technique that starts with an interval \([a,b]\) containing a root and does not require derivatives. Suppose \(f\) is continuous. What is a sufficient condition for \(f\) to have a root on \([a,b]\)?
hasroot(f, a, b) = f(a) * f(b) < 0
f(x) = cos(x) - x
plot([f, x->0], xlims=(-4,4))
hasroot(f, 0, 1)
true
Bisection
function bisect(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return mid
    elseif hasroot(f, a, mid)
        return bisect(f, a, mid, tol)
    else
        return bisect(f, mid, b, tol)
    end
end
x0 = bisect(f, -1, 3, 1e-5)
x0, f(x0)
(0.7390861511230469, -1.7035832658995886e-6)
How fast does it converge?
function bisect_hist(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return [mid]
    elseif hasroot(f, a, mid)
        return prepend!(bisect_hist(f, a, mid, tol), [mid])
    else
        return prepend!(bisect_hist(f, mid, b, tol), [mid])
    end
end
bisect_hist (generic function with 1 method)
bisect_hist(f, -1, 3, 1e-4)
17-element Vector{Float64}:
1.0
0.0
0.5
0.75
0.625
0.6875
0.71875
0.734375
0.7421875
0.73828125
0.740234375
0.7392578125
0.73876953125
0.739013671875
0.7391357421875
0.73907470703125
0.739105224609375
Let’s plot the error
We plot \(\lvert x_k - r \rvert\), where \(r\) is the true root, \(f(r) = 0\).
r = bisect(f, -1, 3, 1e-14) # what are we trusting?
hist = bisect_hist(f, -1, 3, 1e-10)
scatter( abs.(hist .- r), yscale=:log10)
ks = 1:length(hist)
plot!(ks, 10 * 0.5 .^ ks)
Evidently the error \(e_k = x_k - x_*\) after \(k\) bisections satisfies the bound
\[ \lvert e_k \rvert \le c \, 2^{-k} . \]
Convergence classes
A convergent rootfinding algorithm produces a sequence of approximations \(x_k\) such that
\[ \lim_{k \to \infty} x_k \to x_* \]
where \(f(x_*) = 0\). For analysis, define the errors \(e_k = x_k - x_*\). An iterative algorithm is q-linearly convergent if
\[ \lim_{k \to \infty} \lvert e_{k+1} \rvert / \lvert e_k \rvert = \rho < 1 \]
(the “q” stands for quotient). A smaller convergence factor \(\rho\) represents faster convergence. A slightly weaker condition, r-linear convergence, holds if
\[ \lvert e_k \rvert \le \epsilon_k \]
for all sufficiently large \(k\), where the sequence \(\{\epsilon_k\}\) converges q-linearly to zero.
Poll: Is bisection (A) q-linearly convergent, (B) r-linearly convergent, or (C) neither?
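One way to answer empirically (a quick check of our own, reusing the hist and r computed above): bisection errors do not decay monotonically, so the quotient \(\lvert e_{k+1} \rvert / \lvert e_k \rvert\) is not uniformly below 1, even though the errors are bounded by the q-linearly convergent sequence \(c \, 2^{-k}\).
e = abs.(hist .- r) # errors e_k from the history computed above
ratios = e[2:end] ./ e[1:end-1] # quotients |e_{k+1}| / |e_k|
extrema(ratios) # some ratios exceed 1 ⇒ not q-linear, yet still r-linear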
Remarks on bisection
Specifying an interval is often inconvenient
An interval in which the function changes sign guarantees convergence (robustness)
No derivative information is required
Roots of even degree are problematic
A bound on the solution error is directly available
The convergence rate is modest – one iteration per bit of accuracy
No good generalization to higher dimensions
Newton-Raphson Method
Much of numerical analysis reduces to Taylor series, the approximation
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \dotsb \]
centered on some reference point \(x_0\).
In numerical computation, it is exceedingly rare to look beyond the first-order approximation
\[ \tilde f_{x_0}(x) = f(x_0) + f'(x_0)(x - x_0) . \]
Since \(\tilde f_{x_0}\) is linear, we can explicitly solve \(\tilde f_{x_0}(x) = 0\):
\[ x = x_0 - \frac{f(x_0)}{f'(x_0)} . \]
Iterating this update is Newton’s method (also called Newton-Raphson).
An implementation
function newton(f, fp, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = fp(x)
        if verbose
            println("[$k] x=$x f(x)=$fx f'(x)=$fpx")
        end
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - fx / fpx # Newton update
    end
    x, f(x), 100 # did not converge within the iteration budget
end
eps = 1e-1
newton(x -> eps*(cos(x) - x), x -> eps*(-sin(x) - 1), 1; tol=1e-15, verbose=true)
[1] x=1 f(x)=-0.04596976941318603 f'(x)=-0.18414709848078967
[2] x=0.7503638678402439 f(x)=-0.0018923073822117442 f'(x)=-0.16819049529414878
[3] x=0.7391128909113617 f(x)=-4.645589899077152e-6 f'(x)=-0.16736325442243014
[4] x=0.739085133385284 f(x)=-2.8472058044570758e-11 f'(x)=-0.16736120293089507
[5] x=0.7390851332151607 f(x)=0.0 f'(x)=-0.1673612029183215
(0.7390851332151607, 0.0, 5)
That’s really fast!
10 digits of accuracy in 4 iterations.
How is this convergence test different from the one we used for bisection?
How can this break down?
newton(x -> cos(x) - x, x -> -sin(x) - 1, .2*pi/2; verbose=true)
[1] x=0.3141592653589793 f(x)=0.6368972509361742 f'(x)=-1.3090169943749475
[2] x=0.8007054703914749 f(x)=-0.10450500785604744 f'(x)=-1.7178474183038182
[3] x=0.7398706100060081 f(x)=-0.0013148113494647617 f'(x)=-1.6741923555416478
[4] x=0.7390852693340639 f(x)=-2.2781024078266654e-7 f'(x)=-1.6736121297866664
[5] x=0.7390851332151648 f(x)=-6.8833827526759706e-15 f'(x)=-1.673612029183218
(0.7390851332151648, -6.8833827526759706e-15, 5)
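The number of correct digits roughly doubles each iteration, the signature of quadratic convergence: \(\lvert e_{k+1} \rvert \le C \lvert e_k \rvert^2\) near a simple root. A quick check of our own, re-running the iteration above by hand and printing the error:
xstar = 0.7390851332151607 # converged value from the run above
xk = 0.2 * pi / 2
for k in 1:5
    xk = xk - (cos(xk) - xk) / (-sin(xk) - 1) # Newton update
    println("[$k] |e| = ", abs(xk - xstar)) # error exponents roughly double
end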
Convergence of fixed-point iteration
Consider the iteration
\[ x_{k+1} = g(x_k) \]
where \(g\) is continuously differentiable with fixed point \(x_* = g(x_*)\). By the mean value theorem,
\[ x_{k+1} - x_* = g(x_k) - g(x_*) = g'(c_k)(x_k - x_*) \]
for some \(c_k\) between \(x_k\) and \(x_*\).
Taking absolute values,
\[ \lvert e_{k+1} \rvert = \lvert g'(c_k) \rvert \, \lvert e_k \rvert , \]
which converges q-linearly to zero with factor \(\lvert g'(x_*) \rvert\) when \(\lvert g'(x_*) \rvert < 1\) and \(x_0\) is sufficiently close to \(x_*\).
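A minimal demonstration (our own sketch): iterate \(g(x) = \cos x\), whose fixed point is exactly the root of \(\cos x - x\) from above; the error ratio approaches \(\lvert g'(x_*) \rvert = \sin x_* \approx 0.674\), i.e., q-linear convergence.
g(x) = cos(x) # fixed point x_* = g(x_*) is the root of cos(x) - x
xk = 1.0
eprev = abs(xk - r) # r: accurate root computed earlier by bisection
for k in 1:8
    xk = g(xk)
    e = abs(xk - r)
    println("[$k] |e| = $e ratio = $(e / eprev)") # ratio → |g'(x_*)| ≈ 0.674
    eprev = e
end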
Exercise
Write Newton’s method for \(f(x) = 0\) from initial guess \(x_0\) as a fixed point method.
Suppose the Newton iterates \(x_k\) converge to a simple root \(x_*\), \(x_k \to x_*\). What is \(\lvert g'(x_*) \rvert\) for Newton’s method?
Newton for systems of equations
Let \(\mathbf u \in \mathbb R^n\) and consider the function \(\mathbf f(\mathbf u) \in \mathbb R^n\). Then Taylor expansion gives
\[ \mathbf f(\mathbf u + \delta \mathbf u) = \mathbf f(\mathbf u) + \mathbf f'(\mathbf u) \, \delta \mathbf u + O(\lVert \delta \mathbf u \rVert^2), \]
where \(\mathbf f'(\mathbf u) \in \mathbb R^{n \times n}\) is the matrix of partial derivatives \(\partial f_i / \partial u_j\).
We drop all but the first two terms and name the Jacobian matrix \(J(\mathbf u) = \mathbf f'(\mathbf u)\), giving the linear model
\[ \mathbf f(\mathbf u) + J(\mathbf u) \, \delta \mathbf u = 0 , \]
with Newton step \(\delta \mathbf u = -J(\mathbf u)^{-1} \mathbf f(\mathbf u)\) and update \(\mathbf u \leftarrow \mathbf u + \delta \mathbf u\).
Newton in code
function newton(residual, jacobian, u0)
    u = u0
    uhist = [copy(u)]
    normhist = Float64[]
    for k in 1:20 # fixed iteration count; no convergence test
        f = residual(u)
        push!(normhist, norm(f))
        J = jacobian(u)
        delta_u = - J \ f # Newton step: solve J * delta_u = -f
        u += delta_u
        push!(uhist, copy(u))
    end
    uhist, normhist
end
newton (generic function with 1 method)
function residual(u)
    x, y = u
    [x^2 + y^2 - 1, x^2 - y]
end
function jacobian(u)
    x, y = u
    [2x 2y; 2x -1]
end
uhist, normhist = newton(residual, jacobian, [0.1, 2])
plot(normhist, marker=:auto, yscale=:log10)
Plotting the trajectory
xy = hcat(uhist...)
plot(xy[1,:], xy[2,:], marker=:auto)
circle = exp.(1im*LinRange(0, 2*pi, 50))
plot!(real(circle), imag(circle))
plot!(x -> x^2, xlims=(-2, 3), ylims=(-2, 2), axes=:equal, legend=:bottomright)
Bratu problem
We discretize \( -u'' - \lambda e^u = 0 \) with \(u(-1) = 0\), \(u(1) = 1\), using the second-order centered stencil from fdstencil.
function bratu_f(u; lambda=.5)
    n = length(u)
    h = 2 / (n - 1)
    weights = -fdstencil([-h, 0, h], 0, 2) # stencil for -d^2/dx^2
    u = copy(u)
    f = copy(u)
    u[1] = 0 # enforce boundary values before differencing
    u[n] = 1
    f[n] -= 1 # residual of the boundary condition u(1) = 1
    for i in 2:n-1
        f[i] = weights * u[i-1:i+1] - lambda * exp(u[i])
    end
    f
end
bratu_f (generic function with 1 method)
function bratu_J(u; lambda=.5)
    n = length(u)
    h = 2 / (n - 1)
    weights = -fdstencil([-h, 0, h], 0, 2)
    rows = [1, n]
    cols = [1, n]
    vals = [1., 1.] # identity rows for the Dirichlet boundary conditions
    for i in 2:n-1
        append!(rows, [i, i, i])
        append!(cols, i-1:i+1)
        append!(vals, weights + [0 -lambda*exp(u[i]) 0])
    end
    sparse(rows, cols, vals)
end
bratu_J (generic function with 1 method)
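Before using bratu_J, it is prudent to test it against finite differences of bratu_f (a check of our own, not from the original notes). We compare only interior columns, since bratu_f overwrites the boundary entries of u before differencing, so the analytic entries in columns 1 and \(n\) do not correspond to derivatives of bratu_f.
n_test = 9
u_test = rand(n_test)
J = Matrix(bratu_J(u_test))
Jfd = zero(J)
eps_fd = 1e-6
for j in 2:n_test-1
    du = zeros(n_test); du[j] = eps_fd
    Jfd[:, j] = (bratu_f(u_test + du) - bratu_f(u_test - du)) / (2 * eps_fd)
end
norm(J[:, 2:end-1] - Jfd[:, 2:end-1]) # should be tiny (FD truncation/roundoff)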
Solving Bratu
n = 20
x = collect(LinRange(-1., 1, n))
u0 = (1 .+ x) ./ 2
uhist, normhist = newton(bratu_f, bratu_J, u0);
plot(normhist, marker=:auto, yscale=:log10)
plot(x, uhist[end], marker=:auto)
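As a final sanity check (our addition), the converged iterate should drive the residual norm close to machine precision:
norm(bratu_f(uhist[end])) # should be tiny after Newton converges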