AMG for elasticity problems leaks with PETSc AMG smoothers

marchirschvogel · March 16, 2024, 2:03pm

Dear community,

I want to solve a nonlinear problem using GMRES preconditioned with BoomerAMG. My setup experiences a strong memory leak when using a smoother, e.g. ParaSails. I am not sure whether this is related to the way I’ve set things up or if it’s an issue inside PETSc.
Consider the following hyperelasticity MWE (run with latest dolfinx release) of a clamped block elongated by a Neumann traction:

#!/usr/bin/env python3
import sys
import numpy as np
from mpi4py import MPI
from petsc4py import PETSc
from dolfinx import fem, mesh
import dolfinx.fem.petsc
import ufl

comm = MPI.COMM_WORLD

msh = mesh.create_box(comm, [np.array([0.0, 0.0, 0.0]),np.array([2.0, 1.0, 1.0])], [5, 5, 5],
               mesh.CellType.tetrahedron)

dx_ = ufl.Measure("dx", domain=msh)

# function space for displacement
V_u = fem.functionspace(msh, ("Lagrange", 1, (msh.geometry.dim,)))

# collect dofs and meshtags
dofs_left = fem.locate_dofs_geometrical(V_u, lambda x: np.isclose(x[0], 0.))

# select boundaries (right, bottom, top)
boundaries = [(1, lambda x: np.isclose(x[0], 2.)), (2, lambda x: np.isclose(x[1], 0.)), (3, lambda x: np.isclose(x[1], 1.))]
facet_indices, facet_markers = [], []
fdim = msh.topology.dim - 1
for (marker, locator) in boundaries:
    facets =  mesh.locate_entities(msh, fdim, locator)
    facet_indices.append(facets)
    facet_markers.append(np.full_like(facets, marker))
facet_indices = np.hstack(facet_indices).astype(np.int32)
facet_markers = np.hstack(facet_markers).astype(np.int32)
sorted_facets = np.argsort(facet_indices)
facet_tag = mesh.meshtags(msh, fdim, facet_indices[sorted_facets], facet_markers[sorted_facets])

ds_ = ufl.Measure("ds", domain=msh, subdomain_data=facet_tag)

# reference normal on mesh facets
n0 = ufl.FacetNormal(msh)

# define functions
du    = ufl.TrialFunction(V_u)            # Incremental displacement
var_u = ufl.TestFunction(V_u)             # Test function
u     = fem.Function(V_u, name='Displacement')
# disp increment for Newton
del_u = fem.Function(V_u)

# load
p = fem.Constant(msh, 0.0)

# Boundary conditions
zero = fem.Function(V_u)
zero.vector.set(0.0)
dbcs = [ fem.dirichletbc(zero, dofs_left) ]

F = ufl.variable(ufl.Identity(len(u)) + ufl.grad(u)) # deformation gradient
C = F.T*F                                            # right Cauchy-Green tensor

mu, nu = 10., 0.3 # shear modulus and Poisson's ratio
beta = nu/(1.-2.*nu)
Psi = (mu/2.) * (ufl.tr(C) - msh.geometry.dim) + mu/(2.*beta)*(ufl.det(F)**(-2.*beta) - 1.) # strain energy density (compressible Neo-Hooke)

P = ufl.diff(Psi,F) # 1st Piola-Kirchhoff stress

# internal virtual work, plus Robin terms
k=1e3
n_o_n = ufl.outer(n0,n0)
dW_int = ufl.inner(P,ufl.grad(var_u))*dx_ + k*ufl.dot(n_o_n*u, var_u)*ds_(2) + k*ufl.dot(n_o_n*u, var_u)*ds_(3)

# external virtual work
dW_ext = p*ufl.det(F)*ufl.dot(ufl.dot(ufl.inv(F).T,n0), var_u)*ds_(1)

# nonlinear variational form: internal minus external virtual work
varform = dW_int - dW_ext

# Jacobian of nonlinear variational form
jac = ufl.derivative(varform, u, du)

r_form, jac_form = fem.form(varform), fem.form(jac)
r = fem.petsc.create_vector(r_form)
K = fem.petsc.create_matrix(jac_form)
# create solver
ksp = PETSc.KSP().create(comm)

ksp.setType("gmres")
ksp.getPC().setType("hypre")
ksp.getPC().setHYPREType("boomeramg")
opt_dict = {'pc_hypre_boomeramg_strong_threshold':0.7,
            'pc_hypre_boomeramg_smooth_type':'parasails'} # Schwarz-smoothers, parasails, pilut, euclid 
opts = PETSc.Options()
for o in opt_dict:
    opts.setValue(o, opt_dict[o])
ksp.getPC().setFromOptions()

T       = 1.0
nstep   = 10000
dt      = T/nstep

# load/time stepping
interval = np.linspace(0, T, nstep+1)

dp = 0.01
for (N, dt) in enumerate(np.diff(interval)):
    
    t = interval[N+1]

    p.value += dp

    # Newton loop
    tolres = 1.0e-8
    tolinc = 1.0e-0
    it = 0
    maxiter = 25

    # print header
    if comm.rank == 0:
        print("iter  res 2-norm      incr 2-norm")
        sys.stdout.flush()

    # assemble initial rhs vector
    with r.localForm() as r_local: r_local.set(0.0)
    fem.petsc.assemble_vector(r, r_form)
    fem.petsc.apply_lifting(r, [jac_form], [dbcs], x0=[u.vector], scale=-1.0)
    r.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
    fem.petsc.set_bc(r, dbcs, x0=u.vector, scale=-1.0)
    
    # get initial residual norm
    struct_res_norm = r.norm()

    if comm.rank == 0:
        print("%i     %.4e         " % (it, struct_res_norm))
        sys.stdout.flush()
        
    it += 1

    while it < maxiter:

        # assemble system matrix
        K.zeroEntries()
        fem.petsc.assemble_matrix(K, jac_form, dbcs)
        K.assemble()

        # solve linear system
        ksp.setOperators(K)
        ksp.solve(-r, del_u.vector)
        
        # get increment norm
        struct_inc_norm = del_u.vector.norm()
        
        # update solution
        u.vector.axpy(1.0, del_u.vector)
        u.vector.ghostUpdate(addv=PETSc.InsertMode.INSERT, mode=PETSc.ScatterMode.FORWARD)

        # assemble updated rhs vector
        with r.localForm() as r_local: r_local.set(0.0)
        fem.petsc.assemble_vector(r, r_form)
        fem.petsc.apply_lifting(r, [jac_form], [dbcs], x0=[u.vector], scale=-1.0)
        r.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
        fem.petsc.set_bc(r, dbcs, x0=u.vector, scale=-1.0)
        
        # get residual norm
        struct_res_norm = r.norm()
        
        if comm.rank == 0:
            print("%i     %.4e      %.4e" % (it, struct_res_norm, struct_inc_norm))
            sys.stdout.flush()
        
        # increase iteration index
        it += 1

        # check if converged
        if struct_res_norm <= tolres and struct_inc_norm <= tolinc:
            break

Apologies for the length of the MWE, but I needed to run through the nonlinear solver loop to have the leak be obvious.

Removing l. 93 (ksp.getPC().setFromOptions()) seems to cure the problem: It runs and does not allocate unneeded memory. Setting the options, however, let’s it either fail due to a leak in step 2, or accumulates memory in case pc_hypre_boomeramg_strong_threshold is set to the default (0.25).

Has anyone worked with these PETSc smoothers, e.g. are there additional settings to take care of? Or would it be something rather to be reported in a PETSc forum, e.g. non-FEniCS related?

P.S. Setting a nullspace to the matrix like reported here, dolfinx/python/demo/demo_elasticity.py at main · FEniCS/dolfinx · GitHub, does not change the issue.

I’d appreciate any help.

Best,
Marc

hherlyng · March 19, 2024, 3:07pm

Hi Marc,

I’m sorry I can’t help on the memory leakage issue, I don’t have any experience with that. But it sounds like it could be worth giving the PETSc forum a shot.

Additionally, without knowing whether setting the near-nullspace would resolve your issue, I just wanted to point out (if you weren’t already aware) that hypre ignores the near-nullspace you provide it with, unless you also set some other options. See PCHYPRE — PETSc 3.20.5 documentation for more information on this.

Cheers,
Halvor

marchirschvogel · March 26, 2024, 9:25am

Dear Halvor,

Thanks for your reply. After some research, it indeed seems that Hypre does not actively support Parasails or other more complex smoothers anymore, cf. ParaSails — hypre 2.31.0 documentation.
Recommended options would be to use for instance ILU smoothers, cf. ILU — hypre 2.31.0 documentation, however I am not sure how these can be set through petsc4py.
The only options for pc_hypre_boomeramg_smooth_type to set seem to be
schwarz-smoothers, pilut, parasails, euclid.
So if anyone has experience using the Hypre in-built ILU smoothers for AMG through petsc4py, any help is highly appreciated.

Thanks,
Marc

marchirschvogel · April 12, 2024, 1:32pm

I wanna give an update on this: When setting a value for -pc_hypre_boomeramg_smooth_num_levels, hence the first n levels where smoothing should be applied, to a value significantly lower than the max AMG levels, the leak seems to be gone. Since both -pc_hypre_boomeramg_max_levels and -pc_hypre_boomeramg_smooth_num_levels default to 25, the issue appears by default.

Topic		Replies	Views
Configure petsc solver options General	5	500	September 17, 2024
Testing weak scalability of parallel solver: cg + hypre_amg	8	2017	August 10, 2020
Boomeramg in dolfinx dolfinx	4	939	March 31, 2022
Poisson problem with Neumann boundary condition variational formulation	12	961	September 2, 2024
Precondition appears not to happen dolfinx linear-algebra	4	184	November 13, 2024

AMG for elasticity problems leaks with PETSc AMG smoothers

Related topics