Assembling Global Node Vector from Distributed Mesh in Parallel Computation

We are running a parallel computation and need to reconstruct the global connectivity matrix and the global node vector. Specifically, we compute a quantity h locally on each node of our mesh within each processor. Now, we need to assemble the global vector (h_global) of this quantity h, ordered according to the global node numbering.

We’ve attached a simplified example of our code.
Do you have any suggestions on how to correctly build the global vectors in the proper order?


import numpy as np
import meshio
import ufl
import basix
import dolfinx
from mpi4py import MPI

def compute_face_area(A, B, C):
    # Area of the triangle with vertices A, B, C via the cross product
    A = np.array(A)
    B = np.array(B)
    C = np.array(C)

    AB = B - A
    AC = C - A
    cross_product = np.cross(AB, AC)
    area = 0.5 * np.linalg.norm(cross_product)
    
    return area


comm = MPI.COMM_WORLD
rank = comm.rank
size = comm.size

msh = dolfinx.mesh.create_unit_cube(MPI.COMM_WORLD, 5, 5, 5)
elem = basix.ufl.element("Lagrange", msh.basix_cell(), 1)
V = dolfinx.fem.functionspace(msh, elem)


# Per-cell h: sum of the areas of the four triangular faces of each tetrahedron
num_cells = len(V.dofmap.list)
h_ele_vect = np.zeros(num_cells)
for (ii,k) in enumerate(V.dofmap.list):
    p0 = msh.geometry.x[k[0]]
    p1 = msh.geometry.x[k[1]]
    p2 = msh.geometry.x[k[2]]
    p3 = msh.geometry.x[k[3]]
    area1 = compute_face_area(p0,p1,p2)
    area2 = compute_face_area(p1,p2,p3)
    area3 = compute_face_area(p2,p3,p0)
    area4 = compute_face_area(p3,p0,p1)
    h = np.sum([area1, area2, area3, area4])
    h_ele_vect[ii] = h

num_nodes = len(msh.geometry.x)

# we don't know how to avoid the ghost nodes in the global vector reconstruction

ghosts = msh.geometry.index_map().ghosts
h_nodes_vect = np.zeros(num_nodes-len(ghosts))
for ii in range(len(h_nodes_vect)):
    k = np.where(np.array(V.dofmap.list)==ii)
    h_nodes_vect[ii] = np.mean(h_ele_vect[k[0]])


h_global = comm.gather(h_nodes_vect, root=0)
if rank == 0:
    h_global = np.concatenate(h_global)
    print("h_nodes_vect", h_global)

I’m a bit confused about what you are trying to do.
It seems like, for each cell, you want to find the accumulated area of all the faces of that element.

Is this what you want to do?

If so, I don’t really understand why you are constructing a P1 function space and assuming that its degrees of freedom are one-to-one with the mesh geometry nodes.

Sorry for the confusion. To clarify, we are working on a mesh adaptation problem. The specific way we compute the function h is not important for our question; in our actual code, h depends on the solution of a PDE problem.

We didn’t attach the full code because it would have been too long and complex. Instead, we provided a simplified toy example to illustrate the core issue.

Our main question is: How can we assemble a global vector in MPI for a generic function h defined on the mesh nodes, ensuring it is correctly ordered according to the global node numbering across all processors? The challenge is related to the presence of ghost nodes, which are shared among multiple processors, and we want to avoid duplication or inconsistencies when building the global vector.

You can always just work on the local part of your array and then call u.x.scatter_forward() to distribute results from the process that owns the data to those that “ghost” it. See for instance the MPI tutorial: Parallel Computations with Dolfinx using MPI.
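
A minimal sketch of that pattern, assuming u is a dolfinx.fem.Function on a scalar P1 space (the assignment is just a placeholder for your local computation):

import numpy as np
import dolfinx
from mpi4py import MPI

mesh = dolfinx.mesh.create_unit_cube(MPI.COMM_WORLD, 5, 5, 5)
V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))
u = dolfinx.fem.Function(V)

# Compute only on the dofs this process owns (they come first in u.x.array) ...
num_owned = V.dofmap.index_map.size_local
u.x.array[:num_owned] = np.arange(num_owned)

# ... then forward the owned values to the processes that ghost them
u.x.scatter_forward()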

Thank you. We need to write the node coordinates, connectivity matrix, and the h vector to a file in order to use them with TetGen, which does not support parallel execution. Our idea is to gather all the data on the first processor and write the file from there.

Please note that if h comes from a function, it needs to be mapped to the mesh nodes.
See for instance: Moving submesh diverges from its designated trajectory in a two-phase problem - #2 by dokken
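
One way to do that mapping, as a minimal sketch: it assumes a P1 Lagrange space on an affine (degree-1) mesh, where the cell-local ordering of the function’s dofs matches the cell-local ordering of the geometry nodes; for other spaces, see the post linked above. The interpolated expression is just a placeholder for your h.

import numpy as np
import dolfinx
from mpi4py import MPI

mesh = dolfinx.mesh.create_unit_cube(MPI.COMM_WORLD, 5, 5, 5)
V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))
h = dolfinx.fem.Function(V)
h.interpolate(lambda x: x[0])  # placeholder for your computed h

# Scatter dof values onto the geometry node ordering, cell by cell.
# Looping over all cells (owned and ghost) ensures every local node is hit.
h_nodes = np.zeros(mesh.geometry.x.shape[0])
for cell in range(mesh.geometry.dofmap.shape[0]):
    dofs = V.dofmap.cell_dofs(cell)
    nodes = mesh.geometry.dofmap[cell]
    h_nodes[nodes] = h.x.array[dofs]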

It is quite easy to gather the local mesh geometry and topology, as each of them has an index map that can be used to map from local to global indices.

An illustration of this is done with:

from mpi4py import MPI

import dolfinx
import numpy as np

mesh = dolfinx.mesh.create_unit_square(MPI.COMM_WORLD, 10, 10)

geom_imap = mesh.geometry.index_map()

# Owned nodes are stored before any ghosts, so the local range of the
# index map tells us how many rows of mesh.geometry.x this process owns
local_range = geom_imap.local_range
num_nodes_local = local_range[1] - local_range[0]
nodes = mesh.geometry.x[:num_nodes_local, :]

all_nodes = mesh.comm.gather(nodes, root=0)
if mesh.comm.rank == 0:
    all_nodes = np.vstack(all_nodes)
    print(all_nodes.shape)

# Map the cell connectivity from local geometry indices to global indices
num_cells_local = mesh.topology.index_map(mesh.topology.dim).size_local
connectivity = mesh.geometry.dofmap[:num_cells_local, :]
global_connectivity = geom_imap.local_to_global(connectivity.flatten()).reshape(-1, connectivity.shape[1])

all_connectivity = mesh.comm.gather(global_connectivity, root=0)
if mesh.comm.rank == 0:
    all_connectivity = np.vstack(all_connectivity)
    print(all_connectivity.shape)
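
The same gather pattern works for h itself, sketched below for the owned part of a function’s dof vector, placed according to the dof index map’s local range. Note that this orders values by the global dof numbering; for writing node data, you still need the dof-to-node mapping discussed above. The interpolation is again just a placeholder.

V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))
u = dolfinx.fem.Function(V)
u.interpolate(lambda x: x[0])  # placeholder for h

dof_imap = V.dofmap.index_map
num_dofs_local = dof_imap.size_local
owned_values = u.x.array[:num_dofs_local]
owned_global = np.arange(*dof_imap.local_range, dtype=np.int64)

all_values = mesh.comm.gather(owned_values, root=0)
all_global = mesh.comm.gather(owned_global, root=0)
if mesh.comm.rank == 0:
    # Place each rank's owned block at its global positions
    h_global = np.zeros(dof_imap.size_global)
    h_global[np.concatenate(all_global)] = np.concatenate(all_values)
    print(h_global.shape)

On rank 0 you can then combine this with all_nodes and all_connectivity to write a single file for TetGen.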