How to avoid getting many adjoint equations when using a load stepping solver?

I have a highly nonlinear problem where the rhs depends on the solution, K(u)u = F(u). I am solving this by calling NonlinearVariationalSolver in a loop that recomputes and increments the rhs. As an MWE, I attach a slightly modified version of the Hyperelasticity demo that does essentially the same thing. My problem is that, with this approach, each load step iteration is differentiated in the gradient computations. However, I only need the final solution of the nonlinear equation, not the whole series of nonlinear equations actually solved (apart from the terrible performance, this is making it hard for me to debug other issues). In other words, I want solving a nonlinear equation in n load steps to result in only one adjoint equation. From my understanding, I will probably have to overload a dolfin-adjoint solve function and/or the SolveBlock so that all the load steps are treated as the outcome of one solve, but I am not sure where to start. Thanks for any suggestions in advance!

import matplotlib.pyplot as plt
from dolfin import *
from dolfin_adjoint import *
import numpy as np

parameters["form_compiler"]["cpp_optimize"] = True
parameters["form_compiler"]["representation"] = "uflacs"

mesh = UnitCubeMesh(24, 16, 16)
V = VectorFunctionSpace(mesh, "Lagrange", 1)

left =  CompiledSubDomain("near(x[0], side) && on_boundary", side = 0.0)
right = CompiledSubDomain("near(x[0], side) && on_boundary", side = 1.0)

c = Constant((0.0, 0.0, 0.0))
r = Expression(("scale*0.0",
                "scale*(y0 + (x[1] - y0)*cos(theta) - (x[2] - z0)*sin(theta) - x[1])",
                "scale*(z0 + (x[1] - y0)*sin(theta) + (x[2] - z0)*cos(theta) - x[2])"),
                scale = 0.5, y0 = 0.5, z0 = 0.5, theta = pi/3, degree=2)

bcl = DirichletBC(V, c, left)
bcr = DirichletBC(V, r, right)
bcs = [bcl, bcr]

# Define functions
du = TrialFunction(V)            # Incremental displacement
v  = TestFunction(V)             # Test function
u  = Function(V)                 # Displacement from previous iteration

# Kinematics
I = Identity(3)             # Identity tensor
F = I + grad(u)             # Deformation gradient
C = F.T*F                   # Right Cauchy-Green tensor

# Invariants of deformation tensors
Ic = tr(C)
J  = det(F)

# Elasticity parameters
E, nu = 10.0, 0.3
mu = Constant(E/(2*(1 + nu)), name="mu")      # shear modulus; used as the control below
lmbda = Constant(E*nu/((1 + nu)*(1 - 2*nu)))  # first Lamé parameter

# Stored strain energy density (compressible neo-Hookean model)
psi = (mu/2)*(Ic - 3) - mu*ln(J) + (lmbda/2)*(ln(J))**2

# Total potential energy
Pi = psi*dx

T  = Constant((0.5,  0.0, 0.0), name="traction")  # Traction force on the boundary
def update_traction(i):  # the load update during the solve can be arbitrarily complex
    return Constant(i)*T

tape = get_working_tape()
file = File("displacement.pvd"); file << (u, -1)
for i in np.linspace(0, 1.0, 5):  # 5 load steps
    with tape.name_scope("Timestep"):
        inc_load = update_traction(i)
        Pi_f = dot(inc_load, u)*ds
        F = derivative(Pi - Pi_f, u, v)  # residual; note this shadows the deformation gradient F above
        J = derivative(F, u, du)         # Jacobian; note this shadows J = det(F) above
        problem = NonlinearVariationalProblem(F, u, bcs=bcs, J=J, form_compiler_parameters={"optimize": True})
        solver = NonlinearVariationalSolver(problem)
        solver.parameters.update({"nonlinear_solver":"snes"})
        solver.solve()
        file << (u,i)

tape.visualise()

J = assemble(psi*dx)  # objective: total stored strain energy
Jhat = ReducedFunctional(J, Control(mu))
h = Constant(0.01)
assert taylor_test(Jhat, mu, h) > 1.9
assert taylor_test(Jhat, mu, h, dJdm=0) > 0.9

I think I figured it out with a much simpler solution, modifying the for loop as follows:

tape = get_working_tape()
pause_annotation()
discr = 5
for step, i in enumerate(np.linspace(0, 1.0, discr)):
    if step == discr - 1:
        # Annotate only the final load step, applying the full load
        continue_annotation()
        inc_load = T
    else:
        inc_load = update_traction(i)

    Pi_f = dot(inc_load, u)*ds
    F = derivative(Pi - Pi_f, u, v)
    J = derivative(F, u, du)
    problem = NonlinearVariationalProblem(F, u, bcs=bcs, J=J, form_compiler_parameters={"optimize": True})
    solver = NonlinearVariationalSolver(problem)
    solver.parameters.update({"nonlinear_solver":"snes"})
    solver.solve()

The test passes.

Upon closer examination, I don’t think my solution above actually works. Pausing the annotation in effect hides the intermediate load-step solves from the tape, and those solves are necessary for evaluating the objective function correctly when performing an optimization.
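
To make the failure mode concrete (a sketch reusing the names from the MWE above; the perturbed control value is made up): re-evaluating the reduced functional at a new control replays only the annotated blocks, so the single annotated solve starts from the previously stored state without the continuation provided by the earlier load steps:

Jhat = ReducedFunctional(assemble(psi*dx), Control(mu))
# Replaying the tape only repeats the final, annotated solve; the
# un-annotated load steps are never re-run, so a large control change
# (as happens during optimization) is attempted in one Newton solve,
# which may fail to converge or converge to the wrong branch.
Jhat(Constant(15.0))  # hypothetical large update of the control mu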

I believe you implicitly already get what you are after, because we do not differentiate with respect to the initial guess of nonlinear solvers. In other words, even though the backpropagation path looks like it goes through all the solves, it stops at the first solve, because the adjoint output associated with the initial guess will be None.
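
One way to see this on the tape (a sketch reusing Jhat from the MWE above, and assuming pyadjoint's introspection API: Tape.get_blocks, Block.get_dependencies and the adj_value attribute of block variables):

# After one adjoint sweep, check which dependencies actually received an
# adjoint value; the block variable that only serves as an initial guess
# keeps adj_value == None.
Jhat.derivative()
for block in get_working_tape().get_blocks():
    for dep in block.get_dependencies():
        print(type(block).__name__, type(dep.output).__name__, dep.adj_value is None)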

If you wanted this reflected on the tape, you could add a custom block. I’ve written a simple example for this problem, which can of course be generalized further if it is to be used for other types of problems:

from fenics_adjoint.variational_solver import NonlinearVariationalSolveBlock
from pyadjoint.tape import annotate_tape


class CustomNonlinearSolverBlock(NonlinearVariationalSolveBlock):
    def __init__(self, Pi, u, v, bcs, update_traction, **kwargs):
        # We define a custom block that builds on NonlinearVariationalSolveBlock,
        # since we want to reuse its adjoint routine but modify its forward solve.
        # Hence, we pass it the form it should differentiate: the final load step (i = 1.0).
        inc_load = update_traction(1.0)
        Pi_f = dot(inc_load, u) * ds
        F = derivative(Pi - Pi_f, u, v)
        self.update_traction = update_traction
        self.Pi = Pi
        self.v = v
        super().__init__(F == 0, u, bcs, solver_kwargs={}, problem_J=None, solver_params={"nonlinear_solver": "snes"})

    def _forward_solve(self, lhs, rhs, func, bcs, **kwargs):
        # We use replace_form to insert the actual tape values of the form coefficients.
        Pi = self._replace_form(self.Pi, func)
        # Call our original solver routine
        # (annotation is automatically turned off in this context).
        forward_solver_routine(Pi, func, self.v, bcs, self.update_traction)
        return func


def custom_nonlin_solver(Pi, u, v, bcs, update_traction, **kwargs):
    # Standard custom function implementation.
    annotate = annotate_tape(kwargs)
    if annotate:
        tape = get_working_tape()
        block = CustomNonlinearSolverBlock(Pi, u, v, bcs, update_traction)
        tape.add_block(block)

    with stop_annotating():
        out = forward_solver_routine(Pi, u, v, bcs, update_traction)

    if annotate:
        block.add_output(u.create_block_variable())

    return out


def forward_solver_routine(Pi, u, v, bcs, update_traction):
    # The original load-stepping loop (du and file come from the module scope).
    for i in np.linspace(0, 1.0, 5):  # 5 load steps
        inc_load = update_traction(i)
        Pi_f = dot(inc_load, u) * ds
        F = derivative(Pi - Pi_f, u, v)
        J = derivative(F, u, du)
        problem = NonlinearVariationalProblem(F, u, bcs=bcs, J=J, form_compiler_parameters={"optimize": True})
        solver = NonlinearVariationalSolver(problem)
        solver.parameters.update({"nonlinear_solver": "snes"})
        solver.solve()
        file << (u, i)


custom_nonlin_solver(Pi, u, v, bcs, update_traction)
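
With the custom block in place, the tape records the whole load-stepping loop as a single solve block, so the verification from the original MWE should carry over unchanged (same functional and Taylor test as above):

J = assemble(psi*dx)
Jhat = ReducedFunctional(J, Control(mu))
h = Constant(0.01)
assert taylor_test(Jhat, mu, h) > 1.9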

Hi, sorry to dig this up @sebastkm. I’m in the exact same situation as the example described in this topic; I just wanted to ask whether this is still the recommended approach, or whether something else has been introduced since. I’ve come across the Placeholder alternative; would that suit this problem?
If so, I don’t really understand how to use it here. Should I just use p = Placeholder(u) before the load steps, then only at the final step use p.set_value(u) and compute whatever gradient I want with respect to p?
Thanks!