How to create and use partitioning in an HDF5 file with HDF5File


I am using the HDF5File class to import a mesh for a parallelized code (HDF5File, FEniCS Project). The .read function (.read(x, dataset_name, use_partition_from_file)) has a “use_partition_from_file” argument. I am assuming that if you set this argument to “True”, FEniCS will use the partition stored in the file.

My question is how to use this functionality. Is there an example that can be provided? I am creating my mesh and writing it to the HDF5 file in serial with FEniCS. Is an additional input needed to make use of the partitioning?

Thank you!

The functionality is meant for problems where you have already saved your mesh with HDF5File in parallel and, the next time you run the code, want to reuse the same partitioning (to save time). So if you create your mesh in serial, no partitioning is stored in the file.

You can observe this with the following minimal example:

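from dolfin import *
from mpi4py import MPI
import h5py

filename = "mesh.h5"

# Write the mesh; when run in parallel, the partition is stored as well
mesh0 = UnitSquareMesh(20, 20)
mesh_file = HDF5File(mesh0.mpi_comm(), filename, "w")
mesh_file.write(mesh0, "/my_mesh")
mesh_file.close()

# Read from file, reusing the stored partition if there is one
mesh1 = Mesh()
mesh_file = HDF5File(mesh0.mpi_comm(), filename, "r")
mesh_file.read(mesh1, "/my_mesh", True)
mesh_file.close()

# Inspect the file contents on rank 0 only
if MPI.COMM_WORLD.rank == 0:
    with h5py.File(filename, "r") as infile:
        topology = infile["/my_mesh"]["topology"]
        # The attribute is only present if the mesh was written in parallel
        if "partition" in topology.attrs:
            print(topology.attrs["partition"])
        print(topology.shape)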

If you run this on one process, you obtain

(800, 3)

and on three processes:

[  0 265 536]
(800, 3)

This means that the first rank owns cells [0, 265), the second rank [265, 536), and the third rank [536, 800).
If you want to add such a partitioning scheme to a mesh created in serial, you can use h5py to specify which range of the cell topology belongs to each process.
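
As a rough sketch of that idea (not a tested recipe), the snippet below computes contiguous start offsets for a near-even split and attaches them to the topology dataset. The helper names even_partition_offsets and write_partition are hypothetical. The attribute name "partition" and its layout (one start index per rank) are taken from the output above, while the uint64 dtype and the requirement of one entry per MPI process are assumptions. An even split is only a stand-in for whatever ranges suit your mesh.

```python
def even_partition_offsets(num_cells, num_ranks):
    """Return the first cell index owned by each rank (contiguous, near-even split)."""
    base, extra = divmod(num_cells, num_ranks)
    offsets = []
    start = 0
    for rank in range(num_ranks):
        offsets.append(start)
        # The first `extra` ranks get one additional cell each
        start += base + (1 if rank < extra else 0)
    return offsets


def write_partition(filename, mesh_name, offsets):
    """Attach the offsets to the topology dataset as a "partition" attribute."""
    # Imported here so the offset helper can be used without h5py installed
    import h5py
    import numpy as np

    with h5py.File(filename, "r+") as h5:
        topology = h5[mesh_name]["topology"]
        # Assumption: DOLFIN expects unsigned 64-bit start indices, one per rank
        topology.attrs["partition"] = np.asarray(offsets, dtype=np.uint64)


print(even_partition_offsets(800, 3))
```

After writing the mesh in serial, something like write_partition("mesh.h5", "/my_mesh", even_partition_offsets(800, 3)) would add the attribute. The file would then be read on three processes with use_partition_from_file set to True.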

That makes sense; thank you very much for the explanation!

Follow-up question: is there a way to let the ranks own non-consecutively numbered cells? I have a problem where the mesh is composed of many circles in close proximity to each other. I am able to build the mesh in serial and mark the cells that are 1.) in the circles, 2.) on the rim of the circles, and 3.) outside the circles. After importing the mesh from an HDF5 file in parallel, I use MeshView to create submeshes 1.) in the circles, 2.) on the rim of the circles, and 3.) outside the circles.

I have pinpointed the problem to be the following: when there are many circles, HDF5 automatically partitions the mesh such that a rank will obtain only 1 node in the rim submesh (which has yet to be defined in the process: 1) import mesh in parallel, 2) define submeshes on each rank); thus, the rank cannot create a full “rim cell” and terminates the code.

To avoid this, I was thinking I could guide the partitioning such that it does not intersect the circles, but I will need to partition the mesh such that ranks own non-consecutively numbered cells. Do you think this is a good way to go about this, or is there an easier way?

Thank you very much in advance for any additional information!

I would suggest renumbering the cells in your domain so that the input data in the HDF5 file lists the cells of the inner circle first, then the cells of the second circle, and so on. This requires some work, using either MeshView or some other way of identifying how to order the cells, but it should not be impossible.
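
The reordering step itself can be sketched in plain Python, assuming you already have a per-cell group label (for example, the index of the circle a cell belongs to). The function order_cells_by_marker and the toy data are hypothetical. On NumPy arrays, the same permutation could be obtained with np.argsort(markers, kind="stable").

```python
def order_cells_by_marker(cells, markers):
    """Group cells by marker while keeping the original order within each group."""
    # sorted() is stable, so cells sharing a marker keep their relative order
    order = sorted(range(len(cells)), key=lambda i: markers[i])
    reordered = [cells[i] for i in order]
    return order, reordered


# Toy topology: four triangles given by vertex indices, with a group label per cell
cells = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
markers = [2, 1, 2, 1]

order, reordered = order_cells_by_marker(cells, markers)
print(order)
print(reordered)
```

Any other per-cell data stored alongside the topology (such as cell markers) would need the same permutation applied. Each group then occupies a contiguous range of cell indices, so partition boundaries can be placed between groups.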