Adaptively refining a LinearVariationalProblem inside a loop results in MPI running out of communicators

FEniCS version: 2019.1.0.r3

System: Docker on Windows

I am trying to adaptively refine my mesh based on custom-defined markers inside a loop. Below is a simplified version of my code. The code works and adaptively refines the mesh and the underlying LinearVariationalProblem.

from dolfin import *
parameters["refinement_algorithm"] = "plaza_with_parent_facets"

cpp_code = """
    #include <pybind11/pybind11.h>
    #include <dolfin/adaptivity/adapt.h>
    #include <dolfin/mesh/Mesh.h>
    #include <dolfin/mesh/MeshFunction.h>
    #include <dolfin/fem/DirichletBC.h>
    #include <dolfin/function/FunctionSpace.h>
    #include <dolfin/fem/Form.h>
    #include <dolfin/fem/LinearVariationalProblem.h>
    #include <dolfin/fem/NonlinearVariationalProblem.h>
    #include <dolfin/refinement/LocalMeshCoarsening.h>


    namespace py = pybind11;
    using namespace dolfin;


    PYBIND11_MODULE(SIGNATURE, m) { 
        m.def("adapt", [](const dolfin::LinearVariationalProblem& problem,
                std::shared_ptr<const dolfin::Mesh> adapted_mesh)
        {return dolfin::adapt(problem, adapted_mesh);});
        
        m.def("adapt", [](const dolfin::NonlinearVariationalProblem& problem,
                std::shared_ptr<const dolfin::Mesh> adapted_mesh)
        {return dolfin::adapt(problem, adapted_mesh);});
        
    }
    """

mm = compile_cpp_code(cpp_code)

def adapt_problem(problem, mesh):
    # Thin wrapper around the compiled C++ adapt() overloads
    return mm.adapt(problem, mesh)
  
# Create mesh and define function space
mesh = UnitSquareMesh(2, 2)
V = FunctionSpace(mesh, "Lagrange", 1)
# Define boundary condition
u0 = Function(V)
bc = DirichletBC(V, u0, "x[0] < DOLFIN_EPS || x[0] > 1.0 - DOLFIN_EPS")

# Define variational problem
u = TrialFunction(V)
v = TestFunction(V)
f = Expression("10*exp(-(pow(x[0] - 0.5, 2) + pow(x[1] - 0.5, 2)) / 0.02)",
               degree=1)
g = Expression("sin(5*x[0])", degree=1)
a = inner(grad(u), grad(v))*dx()
L = f*v*dx() + g*v*ds()

# Define function for the solution
u = Function(V)

problem = LinearVariationalProblem(a, L, u, bc)
solver = LinearVariationalSolver(problem)

for i in range(4000):
    #mesh = refine(mesh)
    # Adapt the problem (here to the unchanged mesh, just to trigger
    # repeated calls to adapt)
    problem = adapt_problem(problem, mesh)
    solver = LinearVariationalSolver(problem)
    solver.solve()
    print("Step number: ", i)

The above code results in the following error after 2038 steps.

RuntimeError: *** Error: Duplication of MPI communicator failed (MPI_Comm_dup

The above is just a test case to reproduce the error. From the following two posts I have understood two points:

  • MPICH during transient simulation: MPICH limits each process to 2048 communicators on many systems (a small standalone check is sketched right after this list).
  • Crash when using project function: Some methods create a new function space (which in turn uses one MPI communicator). Normally DOLFIN frees these communicators when the corresponding objects are released from memory, but this is not happening in my case.
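
For reference, the limit from the first point can be reproduced on its own with a few lines of mpi4py (which is installed alongside the FEniCS Python interface); this is just a standalone check, independent of DOLFIN:

# Standalone check of the per-process MPI communicator limit.
# With MPICH this fails after roughly 2040 duplications; other MPI
# implementations may allow far more.
from mpi4py import MPI

comms = []
try:
    while True:
        # Each Dup() consumes one of the finitely many context ids
        comms.append(MPI.COMM_WORLD.Dup())
except MPI.Exception as exc:
    # mpi4py sets MPI.ERRORS_RETURN on COMM_WORLD, so the failure is
    # raised as an exception instead of aborting the process
    print("MPI_Comm_dup failed after", len(comms), "duplications:", exc)
finally:
    for comm in comms:
        comm.Free()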

I cannot use the built-in AdaptiveLinearVariationalSolver as I need more control over the adaptivity in my simulations. How can I use the adapt method available in C++ to adapt the LinearVariationalProblem without running into this error?

Someone feel free to correct me, but I’m pretty sure the goal-oriented adapt in legacy/old DOLFIN isn’t supported in parallel. It was written circa 2011 and never really updated.

Given the wealth of methods for automated mesh adaptivity, I recommend you write and implement your own (as many others have). One may employ DOLFIN’s refine function for parallel mesh refinement, or use an external mesh generator; a minimal sketch is given below.
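
As a minimal sketch of that approach (the marking criterion here is arbitrary, just a placeholder for your own): rebuild the function space, forms and boundary conditions on each new mesh instead of adapting the old problem:

# Minimal sketch: marker-based refinement with refine(), rebuilding the
# variational problem on every new mesh
from dolfin import *

mesh = UnitSquareMesh(8, 8)

for level in range(3):
    # Rebuild the function space, forms and BCs on the current mesh
    V = FunctionSpace(mesh, "Lagrange", 1)
    u, v = TrialFunction(V), TestFunction(V)
    f = Expression("10*exp(-(pow(x[0] - 0.5, 2) + pow(x[1] - 0.5, 2)) / 0.02)",
                   degree=2)
    bc = DirichletBC(V, Constant(0.0), "on_boundary")
    uh = Function(V)
    solve(inner(grad(u), grad(v))*dx == f*v*dx, uh, bc)

    # Mark cells with any custom criterion (here: distance to the forcing peak)
    markers = MeshFunction("bool", mesh, mesh.topology().dim(), False)
    for cell in cells(mesh):
        markers[cell] = cell.midpoint().distance(Point(0.5, 0.5)) < 0.25

    # refine() works in serial and in parallel
    mesh = refine(mesh, markers)
    print("Level", level, "-> cells:", mesh.num_cells())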

For example, you can write your own goal-oriented dual-weighted residual error estimator in fewer than 150 lines of Python, and it should work fine in parallel.
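
A rough sketch of the main ingredients, for a Poisson problem with goal functional J(u) = integral of u over the domain (this is only an illustration of the idea, not the code linked above; the fixed-threshold marking uses a process-local maximum, so a parallel version would take a global maximum instead):

# Rough sketch of cell-wise goal-oriented (dual-weighted residual) indicators
from dolfin import *

mesh = UnitSquareMesh(16, 16)
f = Expression("10*exp(-(pow(x[0] - 0.5, 2) + pow(x[1] - 0.5, 2)) / 0.02)",
               degree=2)

# Primal problem on P1
V = FunctionSpace(mesh, "Lagrange", 1)
u, v = TrialFunction(V), TestFunction(V)
uh = Function(V)
solve(inner(grad(u), grad(v))*dx == f*v*dx, uh,
      DirichletBC(V, Constant(0.0), "on_boundary"))

# Dual (adjoint) problem on a richer space, driven by the goal functional
W = FunctionSpace(mesh, "Lagrange", 2)
z, w = TrialFunction(W), TestFunction(W)
zh = Function(W)
solve(inner(grad(w), grad(z))*dx == w*dx, zh,
      DirichletBC(W, Constant(0.0), "on_boundary"))

# Dual weight z - I_h z and cell-wise indicators
# eta_T = (f + div grad u_h, z - I_h z)_T - 1/2 ([du_h/dn], z - I_h z)_dT
zw = zh - interpolate(interpolate(zh, V), W)
DG0 = FunctionSpace(mesh, "DG", 0)
phi = TestFunction(DG0)
n = FacetNormal(mesh)
eta_form = (phi*(f + div(grad(uh)))*zw*dx
            - avg(phi)*jump(grad(uh), n)*avg(zw)*dS)
eta = assemble(eta_form).get_local()

# Mark, e.g., cells above a fixed fraction of the largest indicator and pass
# the markers to refine() as in the loop above
markers = MeshFunction("bool", mesh, mesh.topology().dim(), False)
dofmap = DG0.dofmap()
threshold = 0.5*abs(eta).max()
for cell in cells(mesh):
    markers[cell] = bool(abs(eta[dofmap.cell_dofs(cell.index())[0]]) > threshold)
mesh = refine(mesh, markers)
print("Estimated error in goal:", sum(eta), "new cell count:", mesh.num_cells())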

Here’s another example with fenics-shells using goal-oriented adaptivity.


Thank you so much for the suggestions @nate. I have already written my own adaptivity code; with the above, I was just trying to use the adapt method available in C++ to reduce the length of my code. One thing I forgot to mention in my question is that I am not using mpirun with the above code: I am getting this error in a serial run (python3 main.py).


Ah, then indeed this is an odd error.

If we use the refine function to refine the mesh based on some markers, we need to redefine the problem on that new mesh. All of this is fine if the boundary conditions are defined by conditional expressions. However, I am importing my boundary markers from Gmsh as a MeshFunction, which requires me to adapt the DirichletBC, or at least the MeshFunction, to the new mesh. Also, I am working with a transient problem, which additionally requires me to transfer the Function holding the solution to the refined mesh for use in the next time step. (A rough sketch of one step of this workflow follows.)
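
Roughly, one step of what I need looks like the following sketch (serial, with illustrative names; it assumes adapt is exposed for MeshFunction in the Python interface, otherwise it can be wrapped with compile_cpp_code just like the LinearVariationalProblem overload above):

# One refinement step: refine from cell markers, carry the Gmsh facet
# markers over, rebuild the problem and transfer the previous solution
from dolfin import *
parameters["refinement_algorithm"] = "plaza_with_parent_facets"

mesh = UnitSquareMesh(4, 4)

# Facet markers as they would come from Gmsh (illustrative stand-in)
boundary_markers = MeshFunction("size_t", mesh, mesh.topology().dim() - 1, 0)
CompiledSubDomain("near(x[0], 0.0)").mark(boundary_markers, 1)

V = FunctionSpace(mesh, "Lagrange", 1)
u_old = interpolate(Expression("x[0]*x[1]", degree=2), V)  # previous time step

# Mark cells with a custom criterion and refine the mesh
cell_markers = MeshFunction("bool", mesh, mesh.topology().dim(), False)
for cell in cells(mesh):
    cell_markers[cell] = cell.midpoint().x() < 0.5
new_mesh = refine(mesh, cell_markers)

# Carry the facet markers over to the refined mesh (needs the
# "plaza_with_parent_facets" refinement algorithm set above)
new_boundary_markers = adapt(boundary_markers, new_mesh)

# Rebuild the function space and DirichletBC on the new mesh
V_new = FunctionSpace(new_mesh, "Lagrange", 1)
bc_new = DirichletBC(V_new, Constant(0.0), new_boundary_markers, 1)

# Transfer the previous solution for the next time step (interpolation
# between the nested meshes; works in serial)
u_old.set_allow_extrapolation(True)
u_new = interpolate(u_old, V_new)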

From what I understand, the adapt methods create a new function space (which in turn uses one MPI communicator). The successive calls to adapt in the loop keep creating new communicators without freeing the old ones, which results in the error.

Is there some way to correct this error in serial? Can anyone please point me in the right direction?