Incorrect sensitivity analysis using adjoint equation for nonlinear FEA

I am trying to perform a sensitivity analysis on the compliance, C, of an elastic 2D FEA with contact. When solving the sensitivity analysis for the linear FEA without contact, the finite difference shows a good agreement with the derived sensitivity, but when I add the contact portion of the weak form the error is larger than expected and gets worse near the contact surface and better (approaching 1e-6) away from the contact surface. Therefore, it seems like there is a problem with the formulation/math and not the implementation, but I cant figure out from where, any help would be greatly appreciated!

The compliance is a function of displacement, u, which itself is a function of density, p, and I am trying to compute \frac{dC}{d\rho}. The equation for the compliance is included below

C(u(\rho),\rho) = \int \sigma(u(\rho),\rho):\epsilon(u(\rho)) \,d\Omega

To compute the displacement to use in the compliance calculation, the linear elastic PDE along with a penalty approach addition to the weak form to enforce the signorini contact conditions are computed using the following equation

R(u(\rho),\rho) = \int \sigma(u(\rho),\rho):\epsilon(v) \,d\Omega -\int F\cdot v \ d\Gamma_{t} \ +\int g(u(\rho)) \cdot v \ d\Gamma_{c} = 0

Where u is defined using a 1st order continuous Galerkin function space and is a 2D vector, v is the test function in the same function space as u, and p is defined on a 0th order discrete Galerkin function space. \sigma(u(\rho),\rho) and \epsilon(u(\rho)) are the stress and strain respectively, F is the load vector, and g(u(\rho)) is the gap function enforcing the rigid body contact conditions and what makes the problem nonlinear. \Omega is the domain, \Gamma_{t} is the surface the traction is applied on, \Gamma_{c} is the surface the contact is applied on, and there is a displacement u = 0 enforced on \Gamma_{u}.

To perform the sensitivity analysis, I solve the adjoint equation given below, where \lambda is in the same function space as p and \lambda=0 along the displacement boundary \Gamma_{u}

-\frac{\partial C}{\partial u}=\lambda \frac{\partial R}{\partial u}

then compute the following to get the final vector for \frac{dC}{d\rho},

\frac{dC}{d\rho}=\frac{\partial C}{\partial \rho}+\lambda \frac{\partial R}{\partial \rho}

with all values and derivatives being computed using the ufl and dolfinx libraries. While it works perfectly when I dont add the contact penalty term, I am guessing that there is an issue with the derivation specifically in a potential coupling between the surface integral for the contact penalty and the lagrange multiplier since the error is localized around the contact surface (image attached below, the left edge is pinned, the right edge has a downward traction applied, and the holes are where the contact boundary), but I don’t know why? if there is anything unclear please let me know, thank you!

First of all, without seeing the code it is challenging to pinpoint what is wrong here.
My gut feeling is that you need to decrease the tolerances for the nonlinear forward solve (and potentially the subsequent adjoint solve, depending on what solver and solver parameters you use).

(post deleted by author)

Thank you for your quick response! I made some minimum working code and it has some strange results. The error seems to be dependent on how much deformation occurs near the contact area. At large deformation the max relative error for the contact simulation is better than the linear FEA simulation, but in the situations I need to simulate where the contact deformation is not large (since thats when the small deformation assumption is valid), this causes high error spikes at certain elements.

I tried to investigate different parameter effects on the error, but nothing seemed to significantly help in the worst case scenario loading comparison (load = 0.1 in the simulation code below). Decreasing the ksp_rtol from 1e-8 to 1e-14 had minor changes to max error, decreasing quadrature decreases max error and increasing it increases max error, but not significantly, decreasing or increasing step size for finite difference increases max error, and increasing penalty value slightly decreases error at smaller increases, but then increases error. Also, looking at the residual error for both the FEA and adjoint equations, all elements have values below 1e-8.

Currently I am not sure if its caused by incorrect math or potentially the problem is ill posed at smaller contacts? if so, is there any way to fix this? Another potential issue is the memory usage after the simulation ends slowly increases at a rate of around 0.75 GB per 5 minutes, but I am not sure where this is coming from since the PETSc vectors are destroyed at the end of the code. It also happens even when I disable all plotting so its just the dolfinx calculations.

The linear simulation has the displacement at the bottom of the domain pinned at 0, and a load is applied on the top surface. When contact is enabled, a sphere indenter is located at the center of the top surface. I attached the code and pictures of the absolute error and residual at the worse case scenario loading (load = 0.1). I ran it using dolfinx version 0.10.0 installed on docker. any help to solve this or determine if the error is unavoidable would be greatly appreciated!

from dolfinx import mesh, fem, default_scalar_type
from dolfinx.fem.petsc import NonlinearProblem, LinearProblem
from mpi4py import MPI
from petsc4py import PETSc
import ufl
import numpy as np
import dolfinx
from dolfinx.fem import Constant
from dolfinx.fem import form, assemble_scalar, extract_function_spaces
from dolfinx.fem.petsc import create_vector, create_matrix, assemble_vector, assemble_matrix
from dolfinx.fem.petsc import set_bc

# physical variables
N = 25
L = 1.0
D = 2.0*L

enableContactSimulation = False
plotValues = True

# modifying the applied load changes the max error (tested at finite difference step of 1e-6)
# load = default_scalar_type(0.01) # max relative error for contact (4e-4) higher than linear FEA (2.4e-5)
load = default_scalar_type(.1) # max relative error for contact (2.5e-3) much higher than linear FEA (2.5e-5)
# load = default_scalar_type(1) # max relative error for contact (7.5e-6) lower than linear FEA (2.4e-5)

# material properties
E_val = default_scalar_type(200.0)
nu_val = default_scalar_type(0.3)

# values for loading and contact
kpen_val = default_scalar_type(E_val*100)
R = default_scalar_type(.5)
d = default_scalar_type(0.01)

# finite difference step
stepSize = default_scalar_type(1e-5)

# create mesh
comm = MPI.COMM_WORLD
domain = mesh.create_rectangle(
    comm,
    [np.array([-L, -D]), np.array([L, 0])],
    [N, int(round(N*D/L))],
    cell_type=mesh.CellType.quadrilateral,
)

# find the top contact surface
def top_surface(x):
     return np.isclose(x[1],0.0)

# define bottom displacement boundary
def bottom_surface(x):
    return np.isclose(x[1], -D)

# locate all exterior faces (for later boolean operations)
tdim = domain.topology.dim
fdim = tdim - 1
domain.topology.create_connectivity(fdim, tdim)

# locate different regions of facets where the boundary conditions will be applied
top_facets = mesh.locate_entities_boundary(domain, fdim, top_surface)
bottom_facets = mesh.locate_entities_boundary(domain, fdim, bottom_surface)

# define the mesh marker for each facets on the mesh
num_facets = domain.topology.index_map(fdim).size_local
markers = np.zeros(num_facets, dtype=np.int32)
top_mark,bottom_mark = 1,2
markers[top_facets],markers[bottom_facets] = top_mark,bottom_mark
facet_marker = mesh.meshtags(domain, fdim, np.arange(num_facets, dtype=np.int32), markers)

# define the function spaces
V = fem.functionspace(domain, ("CG", 1, (domain.geometry.dim,)))
S0 = fem.functionspace(domain, ("DG", 0))
S = fem.functionspace(domain, ("CG", 1))

# define the test and solution space
v = ufl.TestFunction(V)
du = ufl.TrialFunction(V)
u_field = fem.Function(V)
rho_field = fem.Function(S0)
lmda = fem.Function(V)

# Material interpolation
E0, nu = Constant(domain, default_scalar_type(E_val)), Constant(domain, default_scalar_type(nu_val))
p, eps = Constant(domain, default_scalar_type(3.0)), Constant(domain, default_scalar_type(1e-6))
E = (eps + (1-eps)*rho_field**p) * E0
Dmat = E/(1.0-nu*nu)*ufl.as_matrix([[1.0,nu,0.0],[nu,1.0,0.0],[0.0,0.0,(1.0-nu)]])
rho_field.x.array[:] = 0.5

# define the displacment boundary condition numerically, then assign them to the boundary facets
u_fixedAll = np.array([0, 0], dtype=default_scalar_type)
bc_bottom = fem.dirichletbc(u_fixedAll, fem.locate_dofs_topological(V, fdim, facet_marker.find(bottom_mark)),V)
bcs = [bc_bottom]

# define the traction to be 0
T = fem.Constant(domain, default_scalar_type((0, load)))

def TensorToVec(m):
    return ufl.as_vector([m[0,0],m[1,1],.5*m[0,1]+.5*m[1,0]])

def VecToTensor(vec):
    return ufl.as_matrix([[vec[0],vec[2]],[vec[2],vec[1]]])

# return the strain from the displacement symbolically
def epsilon(z):
    return ufl.sym(ufl.grad(z))

# return the stress from the displacement symbolically
def sigma(z):
    return VecToTensor(ufl.dot(Dmat,TensorToVec(epsilon(z))))

# define the integration measure with traction integral over all facets with value 2 and set quadrature degree to 4
metadata = {"quadrature_degree": 4}
ds = ufl.Measure("ds", domain=domain, subdomain_data=facet_marker)
dx = ufl.Measure("dx", domain=domain, metadata=metadata)

# define the obstacle function
def gap(x,c,u_field,R):
    dist = ufl.sqrt(ufl.dot(c-x,c-x))
    undeformedGap = dist-R
    normal = (c-x)/dist
    deformedgGap = ufl.dot(u_field,normal)-undeformedGap
    cappedGap = ufl.conditional(ufl.le(deformedgGap,0),0,deformedgGap)
    return cappedGap*normal


Kpen = fem.Constant(domain, kpen_val)
centroid = ufl.as_vector([0.0,R-d])
x = ufl.SpatialCoordinate(domain)
obstacle_ufl_expr = gap(x,centroid,u_field,R)

# calculate the residual of the equations
resDeform = ufl.inner(sigma(u_field), epsilon(v)) * dx - ufl.dot(T, v) * ds(top_mark)
resGap = Kpen*ufl.dot(obstacle_ufl_expr,v)*ds(top_mark)
residual = resDeform
if (enableContactSimulation): residual += resGap

# the equation for the compliance
compliance = ufl.inner(sigma(u_field), epsilon(u_field)) * dx

# define tangent tensor
J = ufl.derivative(residual, u_field, du) # for taking the derivative with respect to a function

# setup solver
prefix = f"nonlinear_solver_{id(domain)}_"
jit_options ={"cffi_extra_compile_args":["-O3","-ffast-math"]}
petsc_options = {
        "pc_type": "lu",
        "pc_factor_mat_solver_type": "mumps",
        "ksp_error_if_not_converged": True,
}

problem = NonlinearProblem(residual,u_field,bcs=bcs,petsc_options_prefix=prefix, J=J,jit_options=jit_options,petsc_options=petsc_options)
problem.solve()

# solve the adjoint problem to get sensitivity
compliance_prefix = f"compliance_adjoint_solver_{id(domain)}"

dRdrho = ufl.adjoint(ufl.derivative(residual,rho_field))
dRdU = ufl.adjoint(ufl.derivative(residual,u_field))
dCdrho = ufl.derivative(compliance, rho_field)
negdCdU = default_scalar_type(-1.0)*ufl.derivative(compliance, u_field)

dRdrho_form = form(dRdrho)
dRdrho_mat = create_matrix(dRdrho_form)
dRdU_form = form(dRdU)
dRdU_mat = create_matrix(dRdU_form)
dCdrho_form = form(dCdrho)
dCdrho_vec = create_vector(extract_function_spaces(dCdrho_form),"nest")

residual_adj = ufl.derivative(residual,u_field)*lmda+ufl.derivative(compliance, u_field)
adjointProblem = LinearProblem(a=dRdU, L=negdCdU, bcs=bcs, u=lmda, petsc_options_prefix=compliance_prefix,petsc_options=petsc_options)

adjointProblem.solve()
lmda.x.scatter_forward()

dCdrho_vec.zeroEntries()
dRdrho_mat.zeroEntries()

assemble_vector(dCdrho_vec, dCdrho_form)
assemble_matrix(dRdrho_mat, dRdrho_form)
dRdrho_mat.assemble()

dCdrho_vec += dRdrho_mat*lmda.x.petsc_vec
dCdrho_vec.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)

realSens = dCdrho_vec.array


# Calculate sensitivity error using finite difference
C_form = form(compliance)
C_value = comm.allreduce(assemble_scalar(C_form), op=MPI.SUM)

num_elems = rho_field.x.array.shape[0]
finiteDiffCompliance = np.zeros([num_elems,])
realCompliance = np.zeros([num_elems,])
realCompliance[:] = C_value
for e in range(0,num_elems):
    replacedRho = rho_field.x.array[e]
    rho_field.x.array[e] = replacedRho + stepSize

    problem.solve()

    finiteDiffCompliance[e] = comm.allreduce(assemble_scalar(C_form), op=MPI.SUM)
    rho_field.x.array[e] = replacedRho
    rho_field.x.scatter_forward()

finiteDiffSens = (finiteDiffCompliance-realCompliance)/stepSize
absoluteError = finiteDiffSens-realSens
relativeError = np.abs(np.divide(absoluteError,finiteDiffSens))


# CALUCLATE PERFORMANCE METRICS
simType = "linear"
if (enableContactSimulation): simType = "contact"

# calculate the FEA residual for plotting
R_form = form(residual)
R_vec = create_vector(extract_function_spaces(R_form),"nest")
R_field = fem.Function(V)
Rmag_field = fem.Function(S0)
R_vec.zeroEntries()
assemble_vector(R_vec, R_form)
set_bc(R_vec, bcs)
R_field.x.array[:] = R_vec.array[:]
R_value = ufl.sqrt(ufl.dot(R_field,R_field))
R_expr = fem.Expression(R_value, S0.element.interpolation_points)
Rmag_field.interpolate(R_expr)

# calculate the adjoint residual for plotting
R_adj_form = form(residual_adj)
R_adj_vec = create_vector(extract_function_spaces(R_adj_form),"nest")
R_adj_field = fem.Function(V)
Rmag_adj_field = fem.Function(S0)
R_adj_vec.zeroEntries()
assemble_vector(R_adj_vec, R_adj_form)
set_bc(R_adj_vec, bcs)
R_adj_field.x.array[:] = R_adj_vec.array[:]
R_adj_value = ufl.sqrt(ufl.dot(R_adj_field,R_adj_field))
R_adj_expr = fem.Expression(R_adj_value, S0.element.interpolation_points)
Rmag_adj_field.interpolate(R_adj_expr)

print("max error for " + simType + ": " + str(np.max(relativeError)))

# PLOTTING
def plotValue(plotter, grid, value, name="value"):
    plotter.background_color = "white"

    grid.cell_data[name] = np.hstack(value)
    num_elems = int(round(u_field.x.array.shape[0]/2,0))
    displ = np.reshape(u_field.x.array,(num_elems,2))
    displ3D = np.zeros((num_elems,3))
    displ3D[:,[0,1]] = displ
    grid.point_data["U"] = displ3D

    plotter.add_mesh(grid.warp_by_vector('U'), cmap="viridis",show_scalar_bar=True, scalars=name)
    
    plotter.view_xy()
    plotter.screenshot(name+".jpg", window_size=(1000, 1000))
    plotter.clear()

# plot all values
if (plotValues):
    import pyvista
    pyvista.OFF_SCREEN = True
    elements, cell_types, nodes = dolfinx.plot.vtk_mesh(domain, domain.topology.dim)
    grid = pyvista.UnstructuredGrid(elements, cell_types, nodes)
    plotter = pyvista.Plotter()

    plotValue(plotter, grid, relativeError,name=simType+" relative error")
    plotValue(plotter, grid, relativeError,name=simType+" relative error")
    plotValue(plotter, grid, Rmag_field.x.array, name=simType+" FEA residual")
    plotValue(plotter, grid, Rmag_adj_field.x.array, name=simType+" adjoint residual")

    plotter.close()
    plotter.deep_clean()

# cleanup sensitivity vectors
dCdrho_vec.destroy()
dRdrho_mat.destroy()
dRdU_mat.destroy()
# cleanup residual vectors
R_vec.destroy()
R_adj_vec.destroy()

PETSc.garbage_cleanup()