Error with MPI in transient simulation

nikoleeaa · May 28, 2025, 7:02pm

I am also working on a transient simulation where I implement adaptive mesh refinement (AMR) every few time steps. The workflow roughly is:

1. Define the function space, functions, and variational problems on an initial mesh.
2. Solve to obtain the solution, use it for AMR marking, and generate a refined mesh.
3. On the new mesh, redefine the function space, functions, and variational problems.
4. Repeat the AMR refinement until a stopping criterion is met.
5. Pass the final refined mesh to the transient solver for subsequent time steps.
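
For readers who want the loop structure spelled out, here is a minimal sketch of the solve, mark, refine cycle described in the workflow above. It is a toy 1D stand-in in plain Python, not DOLFINx code: the mesh is a list of node coordinates, the "solution" is just the nodal values of a known function, and `indicator`, `refine`, and `amr` are hypothetical names introduced only for illustration.

```python
import math


def indicator(f, a, b):
    """Toy error indicator for cell [a, b]: gap between f at the midpoint
    and the linear interpolant of the two nodal values."""
    mid = 0.5 * (a + b)
    return abs(f(mid) - 0.5 * (f(a) + f(b)))


def refine(nodes, f, tol):
    """Return a new node list in which every marked cell is bisected."""
    new_nodes = [nodes[0]]
    for a, b in zip(nodes[:-1], nodes[1:]):
        if indicator(f, a, b) > tol:  # marking step
            new_nodes.append(0.5 * (a + b))
        new_nodes.append(b)
    return new_nodes


def amr(nodes, f, tol, max_iter=30):
    """Repeat solve -> mark -> refine until no cell is marked."""
    for _ in range(max_iter):
        # "Solve": the nodal values of f play the role of the solution here.
        refined = refine(nodes, f, tol)
        if len(refined) == len(nodes):  # stopping criterion: nothing marked
            break
        nodes = refined  # the previous mesh is no longer referenced
    return nodes


def f(x):
    return math.tanh(20.0 * (x - 0.5))  # steep front near x = 0.5


coarse = [i / 4 for i in range(5)]  # initial mesh with 4 cells on [0, 1]
fine = amr(coarse, f, tol=1e-2)     # final mesh, handed on to the time stepper
```

Only `fine` survives the call to `amr`; everything built on the intermediate meshes is local to the loop, which mirrors the situation where only the final refined mesh is passed to the transient solver.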

This AMR loop is called multiple times throughout the transient simulation. However, once the AMR routine has been called more than a certain number of times, I get a similar MPI error:

Other MPI error, error stack:
internal_Dist_graph_create_adjacent(125): MPI_Dist_graph_create_adjacent(comm=0xc4005134, indegree=6, ...)
MPIR_Dist_graph_create_adjacent_impl(319): 
MPII_Comm_copy(913)......................: 
MPIR_Get_contextid_sparse_group(587).....:  Cannot allocate context ID because of fragmentation

After searching related discussions, I found similar issues reported here:

https://fenicsproject.discourse.group/t/saving-meshes-in-a-list-runtimeerror-error-duplication-of-mpi-communicator-failed/7749/8
https://github.com/FEniCS/dolfinx/issues/2308

From these, my tentative understanding is that the issue is caused by calling the AMR routine repeatedly, which creates many new meshes, function spaces, and functions.

In my case, after each AMR step, I only need the refined mesh to continue the transient simulation; the function spaces and functions from previous AMR iterations are no longer used. However, it seems these resources are not correctly freed.
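
One general reason why Python objects can outlive their last use is a reference cycle, which plain reference counting does not free until the garbage collector runs. The sketch below illustrates that effect with hypothetical stand-in classes (`FakeComm`, `FakeMesh`, `FakeSpace`); it does not use DOLFINx or MPI, and whether this mechanism is what actually happens in my case is an assumption, not something confirmed by the linked threads.

```python
import gc


class FakeComm:
    """Hypothetical stand-in for a limited resource (e.g. a communicator)."""
    live = 0  # number of instances currently alive

    def __init__(self):
        FakeComm.live += 1

    def __del__(self):
        FakeComm.live -= 1


class FakeMesh:
    def __init__(self):
        self.comm = FakeComm()


class FakeSpace:
    def __init__(self, mesh):
        self.mesh = mesh
        mesh.space = self  # back-reference: creates a reference cycle
        self.comm = FakeComm()


def amr_step(mesh):
    """Build a temporary object on `mesh` and return only the new mesh."""
    space = FakeSpace(mesh)  # temporary, only needed for marking
    return FakeMesh()        # stands in for the refined mesh


gc.disable()  # switch off automatic collection so the demo is deterministic
mesh = FakeMesh()
for _ in range(100):
    mesh = amr_step(mesh)
leaked = FakeComm.live  # old meshes and spaces are kept alive by their cycles

gc.collect()  # collect the cycles, which runs the finalizers
after_collect = FakeComm.live
gc.enable()
```

In this toy setting the count of live resources grows with every AMR step and drops back only after `gc.collect()`. In a real script the analogous experiment would be to `del` the old function spaces and functions after each AMR step and call `gc.collect()` explicitly, then check whether the error still appears.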

I have considered reducing the frequency of AMR calls to avoid hitting this problem, but this only postpones the error rather than solving it.

I am not sure whether my superficial understanding of the issue is correct. If there are better solutions or suggestions for handling this problem, I would be very grateful.
