Bug in XDMF mesh read-in / create_connectivity function?

Dear community,

I have a 3D mesh of a heart domain and added closure tetrahedral elements at the openings; see the XDMF file on OneDrive (only 628 nodes, but too large to paste here due to the line limit; unfortunately I could not reproduce the error with a smaller mesh):

https://1drv.ms/u/s!Av9sdVNHKsLOgXkS6g8ygk9r7AZz?e=CEZYJu

I’ve checked the mesh back and forth and it looks good. The problem occurs when reading the mesh and then creating a connectivity prior to reading subsequent boundary information.

The following MWE fails when executing on 2 cores:

#!/usr/bin/env python3
from dolfinx.io import XDMFFile
from mpi4py import MPI

comm = MPI.COMM_WORLD

with XDMFFile(comm, 'lid.xdmf', 'r', encoding=XDMFFile.Encoding.ASCII) as infile:
    mesh = infile.read_mesh(name="Grid")
    mt_d = infile.read_meshtags(mesh, name="Grid")

# create facet-to-cell connectivity, needed prior to reading a boundary topology:
mesh.topology.create_connectivity(2, mesh.topology.dim)

yielding a "double free or corruption (out)" error with the dolfinx Docker image from 25 Feb 2021.

Executing on one core works fine, and there is also no problem when the lids (i.e. all elements with Attribute 2) are removed. The error comes from the create_connectivity call.
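(For anyone wanting to reproduce the lid-removal test: something like the following meshio snippet can strip the tagged cells. This is only a sketch; the cell-data key "name_to_read" is an assumption, so inspect m.cell_data_dict for the actual key in the file.)

import meshio  # sketch only, assuming meshio is installed

m = meshio.read("lid.xdmf")
tets = m.cells_dict["tetra"]
tags = m.cell_data_dict["name_to_read"]["tetra"]  # "name_to_read" is a guess
keep = tags != 2  # drop the lid elements (Attribute 2)
meshio.write("lid_nolids.xdmf",
             meshio.Mesh(points=m.points,
                         cells=[("tetra", tets[keep])],
                         cell_data={"name_to_read": [tags[keep]]}))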

Can anyone execute this MWE on more than one core without the error? Does anybody have an idea what happens? I know that these lid elements are connected to the rest of the mesh only via edges, not faces, but that should in principle not be problematic (?).

I’d greatly appreciate any help with this.

Best,
Marc

I can reproduce your segfault with the dolfinx build on Debian unstable, but only with 2 cores (mpirun -n 2). All other core counts, including -n 1 and -n 3 through -n 8, appear to succeed with no segfault.

Thanks for trying this! Indeed, core counts other than 2 work on my end, too. I have a similar setting with a bigger mesh containing several of these “edge-based” connections between different regions, which fails for any core count > 1.
I’ve also tried removing one of the two lid domains in the example presented; having just one or the other works regardless of the number of cores.

I can also reproduce this on 2 cores. It triggers an assertion in a debug build, so probably there is an assumption about the mesh that is not being met. Thanks for highlighting this…


Thanks for checking on this! What does the assertion say? Is there something that could be changed about the mesh, e.g. the node ordering? And is it a bug or an intended assertion?
Thanks, Marc

It might be worth trying ghost_mode=cpp.mesh.GhostMode.none when reading the mesh. Unless you are using DG, ghost cells are not really needed - I think the error is happening due to a bug in the ghost cell construction for your mesh.
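For the MWE above, that would look something like this (a sketch, assuming the read_mesh signature of the Feb 2021 dolfinx, where ghost_mode is a keyword argument):

#!/usr/bin/env python3
from dolfinx import cpp
from dolfinx.io import XDMFFile
from mpi4py import MPI

comm = MPI.COMM_WORLD

with XDMFFile(comm, 'lid.xdmf', 'r', encoding=XDMFFile.Encoding.ASCII) as infile:
    # read the mesh without ghost cells
    mesh = infile.read_mesh(ghost_mode=cpp.mesh.GhostMode.none, name="Grid")
    mt_d = infile.read_meshtags(mesh, name="Grid")

mesh.topology.create_connectivity(2, mesh.topology.dim)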

Indeed, that did the job! No read-in error anymore, regardless of the number of cores! :) Thanks!

@marchirschvogel - can I copy your test mesh to the FEniCS GitHub and include it in an “Issue”? We will then try to fix it.

@chris Sure, feel free to use the mesh!