Write and read back mesh in parallel with same meshtags

Hello,

I recently moved from dolfinx 0.6.0 to dolfinx 0.7.3, installing adios4dolfinx (ADIOS4DOLFINx, a framework for checkpointing in DOLFINx) via conda.

In the following example, I create a mesh via GMSH and define cell and surface tags.
I then import everything into dolfinx using io.gmshio.model_to_mesh.
Finally, I save the mesh, cell tags and facet tags to an XDMF file.
I run the script in parallel, for instance by using mpirun -n 5.

My problem:
I would like to read back the same mesh and meshtags, distributed in the same fashion among the 5 ranks. It worked out of the box with dolfinx 0.6.0, but it does not seem to be the case with dolfinx 0.7.3.
In the following example, the boundary dofs distributed among the 5 ranks are different when:

  • I save the mesh and then compute boundary_dofs (save = True)

  • I load mesh and meshtags from the XDMF file and then compute the boundary_dofs (save = False)

I tried to do a similar thing using adios4dolfinx, but I ran into the same issue.
Could you please help me?

Many thanks

import gmsh
from dolfinx import io, fem
import os, shutil, sys
import petsc4py
from mpi4py import MPI

petsc4py.init(sys.argv)
comm = MPI.COMM_WORLD

dim = 3
volume_tag = 1
surface_tag = 10
save = True
test_dir = os.path.join(os.getcwd(),'test')

if save:
    # Save mesh and meshtags
    if comm.rank == 0:
        print('Saving ...', flush=True)
        # Only rank 0 prepares the output directory, to avoid filesystem races in parallel
        if os.path.isdir(test_dir):
            shutil.rmtree(test_dir)
        os.makedirs(test_dir)
    comm.barrier()

    # Create mesh
    gmsh.initialize()
    gmsh.option.setNumber("General.Terminal", 0)
    model = gmsh.model()
    model.add("Domain")
    model.occ.addBox(0, 0, 0, 8, 8, 8)
    model.occ.synchronize()
    model.addPhysicalGroup(dim, [1], volume_tag)
    model.setPhysicalName(dim, volume_tag, "Domain volume")
    boundary_dimtags = model.getBoundary([(dim, 1)], oriented=False)
    model.addPhysicalGroup(dim-1, [dt[1] for dt in boundary_dimtags], tag=surface_tag)
    model.setPhysicalName(dim-1, surface_tag, "Domain surface")
    gmsh.option.setNumber("Mesh.CharacteristicLengthMax", 8/13)
    model.occ.synchronize()
    model.mesh.generate(dim)
    domain, cell_tags, facet_tags = io.gmshio.model_to_mesh(model, comm, rank=0, gdim=dim)

    with io.XDMFFile(comm, test_dir+'/domain.xdmf', "w") as xdmf:
        xdmf.write_mesh(domain)
        cell_tags.name = "cell_tags"
        facet_tags.name = "facet_tags"
        xdmf.write_meshtags(cell_tags, domain.geometry)
        xdmf.write_meshtags(facet_tags, domain.geometry)
else:
    # Load mesh and meshtags
    if comm.rank == 0:
        print('Loading ...', flush=True)
    with io.XDMFFile(comm, test_dir+'/domain.xdmf', "r") as xdmf:
        domain = xdmf.read_mesh()
        tdim = domain.topology.dim
        fdim = tdim - 1
        domain.topology.create_connectivity(fdim, tdim)
        cell_tags = xdmf.read_meshtags(domain, name="cell_tags")
        facet_tags = xdmf.read_meshtags(domain, name="facet_tags")
        assert cell_tags.name == "cell_tags"
        assert facet_tags.name == "facet_tags"

# Compute boundary_dofs
Vp = fem.FunctionSpace(domain, ("CG", 1))
boundary_dofs = fem.locate_dofs_topological(Vp, domain.topology.dim-1, facet_tags.find(surface_tag))
# boundary_dofs is different when the mesh is computed versus loaded
print('Rank = {}, boundary_dofs = {}\n\n'.format(comm.rank, boundary_dofs))

For the main branch of adios4dolfinx there is support for storing data in the original order; see
http://jsdokken.com/adios4dolfinx/docs/partitioned_mesh.html
and/or
http://jsdokken.com/adios4dolfinx/docs/original_checkpoint.html
depending on the application.

Thanks Dokken,

however, the function adios4dolfinx.write_mesh does not seem to have the keyword store_partition_info:

import adios4dolfinx as a4dx
a4dx.write_mesh(domain, test_dir+'/domain.bp', store_partition_info=True)
TypeError: write_mesh() got an unexpected keyword argument 'store_partition_info'

The function write_mesh in checkpointing.py is defined as

write_mesh(mesh: dolfinx.mesh.Mesh, filename: Path, engine: str = "BP4")

This is with adios4dolfinx installed via conda (v0.7.3).
When I try to install the main branch I get:

INFO: pip is looking at multiple versions of adios4dolfinx to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement fenics-dolfinx>=0.8.0.dev0 (from adios4dolfinx) (from versions: none)
ERROR: No matching distribution found for fenics-dolfinx>=0.8.0.dev0

It is on the main branch of adios4dolfinx.
You would need to use the main branch of DOLFINx as well, as I do not plan to backport these functionalities to 0.7.x.
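
For reference, the intended main-branch usage looks roughly like this (a sketch based on the pages linked above; the exact argument names and order may differ between versions, so treat them as assumptions rather than the definitive API):

import adios4dolfinx as a4dx
# Assumed main-branch keyword (see the linked docs): store the partition
# information alongside the mesh so that a later run with the same number of
# ranks can reuse it instead of repartitioning.
a4dx.write_mesh(domain, test_dir + '/domain.bp', store_partition_info=True)
# Reading the mesh back onto the stored partition is described in the
# "partitioned mesh" page linked above.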

Thank you Dokken,

I am having trouble installing dolfinx 0.8.0 from source due to conflicting dependencies.
When do you think adios4dolfinx + dolfinx 0.8.0 will be available in conda?

Meanwhile, is there any way to achieve the same thing by playing with the XDMF files, as in the MWE above?

thanks

I would just use the docker image from the dolfinx/dolfinx package on GitHub.

I would never assume that the boundary dofs are the same across runs. This is very fragile, as the distribution of ghosts might differ depending on the MPI communication. Could you explain why you would need this? As the function above computes the dofs, there shouldn't be a need to assume the exact same order for another run.

thank you for your answer.

During a first run, I create a mesh and compute a boundary element matrix with a given number of processors. Calculations are long, so I would like to restart the simulation using checkpoints.
In the second run, I would like to simply load the BEM matrix, which requires each processor to have the same DoFs as in the first run.
This was working just fine in dolfinx 0.6.0; did something change in dolfinx 0.7.3?
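
For concreteness, the restart step I have in mind is roughly the following (just a sketch: it assumes the assembled BEM matrix is a PETSc Mat called A, uses a PETSc binary viewer, and only makes sense if the DoF distribution is identical in both runs):

from petsc4py import PETSc

# First run: dump the assembled BEM matrix A (a PETSc Mat) to disk.
viewer = PETSc.Viewer().createBinary(test_dir + '/bem_matrix.dat', 'w', comm=comm)
A.view(viewer)
viewer.destroy()

# Restart run: load it back; the parallel row distribution must then match
# the DoF layout of this run, which is exactly what I am trying to guarantee.
viewer = PETSc.Viewer().createBinary(test_dir + '/bem_matrix.dat', 'r', comm=comm)
A = PETSc.Mat().load(viewer)
viewer.destroy()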

I installed docker by using pip.
When I try to pull the main branch of dolfinx, I get the error message:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

I think that was just working by chance; it is not something we plan on supporting, as the distribution of dofs is up to the partitioner (ParMETIS, SCOTCH or KaHIP), and they are not always consistent across runs. Secondly, the ghost ordering might change from run to run, as ghosts are assigned based on the current communicator, and I do not think the ghost ordering in the index map is the same from run to run.
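
You can see the layout on each rank by inspecting the index map of the dofmap; a small sketch reusing Vp and comm from your MWE:

# Owned dofs come first in the local numbering; `ghosts` holds the global
# indices of the ghost dofs, and their order is exactly what may change
# between runs.
imap = Vp.dofmap.index_map
print(f"rank {comm.rank}: owned dofs = {imap.size_local}, ghosts = {imap.ghosts}", flush=True)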

oh, I see. Strange that it was working no matter the number of procs I used though. Good to know!

Would adios4dolfinx.write_mesh() with store_partition_info=True solve my problem, or can you already foresee problems with ghost nodes and so on? Would it be better to recompute the BEM matrix at every run?

I think you would get problems with the ordering of ghosts (even if the cells are on the same process).

I would recompute the BEM matrix at each run.

thank you for your replies! Have a nice evening

Hi Dokken,

could you please explain what the advantage of using adios4dolfinx.write_mesh() with store_partition_info=True is, if it does not ensure that the mesh and all the DoFs (including ghosts) are saved consistently between runs? What exactly does it do, and what is its intended purpose?

Also, is there any documentation on how dolfinx distributes DoFs (including ghosts) among processors at every run?

There are a variety of applications where one would like to load a mesh and apply some initial conditions to the exact same nodes, without interpolating the entire field each time (with BEM it's not even possible), so it seems strange to me that this is not possible.

thanks again

It is useful if you want to avoid repartitioning the mesh (as this can be somewhat expensive).

If I had to enforce the DoF ordering etc., it would require storing a lot more data.

You would have to read the source code, and probably also the topology creation code.

It is unclear to me how you want to store this data. As far as I can tell from your code, you have the underlying assumption of a P1-space, where degrees of freedom are associated with the vertices. In such a case it is possible to build a map from the global input vertices to the degrees of freedom on your process. However, note that this is not possible for most spaces.
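
For a P1 space, a minimal sketch of such a map (assuming dolfinx 0.7.x, a scalar P1 Lagrange space on a first-order mesh, and that the per-cell ordering of geometry nodes matches the per-cell ordering of the P1 dofs; the helper name is just for illustration):

import numpy as np

def p1_dof_to_input_vertex(V):
    # For each dof owned or ghosted on this process, return the global index of
    # the corresponding vertex in the original (input) mesh numbering.
    mesh = V.mesh
    tdim = mesh.topology.dim
    cmap = mesh.topology.index_map(tdim)
    num_cells = cmap.size_local + cmap.num_ghosts
    imap = V.dofmap.index_map
    dof_to_input = np.full(imap.size_local + imap.num_ghosts, -1, dtype=np.int64)
    x_dofmap = mesh.geometry.dofmap  # (num_cells, nodes_per_cell) array in 0.7.x
    input_idx = np.asarray(mesh.geometry.input_global_indices)
    for c in range(num_cells):
        # Assumption: for P1 on a first-order mesh, the i-th cell dof sits on
        # the i-th geometry node of the cell.
        dof_to_input[V.dofmap.cell_dofs(c)] = input_idx[x_dofmap[c]]
    return dof_to_input

Such a map could then be used to reorder saved vertex data to the current run's dof layout, but, as noted above, this only works for spaces whose dofs coincide with mesh vertices.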