I’m working on a simulation that needs certain mesh refinements to be done before any computations are made. Because of the number of cores required, doing these refinements in parallel is too slow, so I’m trying to do them in serial and then partition the mesh to benefit from multiprocessing.
Here is the code I’m currently working on. It runs fine on a single core, but fails with malloc() errors that carry no further description when I run it on more cores.
comm = MPI.COMM_WORLD
serial_mesh = self.geometry.mesh
rank = comm.rank

if rank == 0:
    dim = serial_mesh.topology.dim
    cell_name = serial_mesh.ufl_cell().cellname()
    coords = serial_mesh.geometry.x
    cells = serial_mesh.topology.create_connectivity(dim, 0)
    cells = serial_mesh.topology.connectivity(dim, 0).array.reshape(-1, dim + 1)
    coords_shape = coords.shape
    cells_shape = cells.shape
else:
    dim = None
    cell_name = None
    coords_shape = None
    cells_shape = None

# Broadcast metadata
dim = comm.bcast(dim, root=0)
cell_name = comm.bcast(cell_name, root=0)
coords_shape = comm.bcast(coords_shape, root=0)
cells_shape = comm.bcast(cells_shape, root=0)

# Allocate buffers
if rank != 0:
    coords = np.empty(coords_shape, dtype=np.float64)
    cells = np.empty(cells_shape, dtype=np.int32)

coords = comm.bcast(coords, root=0)
cells = comm.bcast(cells, root=0)
coords = coords[:, :dim]
gdim = coords.shape[1]

# Create vector Lagrange element for coordinates
element = basix.ufl.element("Lagrange", cell_name, 1, shape=(gdim,))
# Wrap into coordinate element
coord_element = ufl.Mesh(element)

# Create parallel mesh from broadcasted data
print(f"{rank}: Cell_Shape {cells.shape}, Coord_Shape {coords.shape}", flush=True)
parallel_mesh = mesh.create_mesh(comm, cells, coords, coord_element)
Here, the mesh assigned to serial_mesh is the fully refined mesh, created with MPI.COMM_SELF on rank 0.