ghostUpdate for added assembled vectors

I’m building my load vector from various sources, assembled independently, and lifting/applying BCs afterwards. To get this to work in parallel, ghost values need to be updated appropriately, but I’m failing to see the logic behind when to do this.

I’ve taken the Poisson demo to illustrate the issue:

from mpi4py import MPI
from petsc4py.PETSc import ScalarType  # type: ignore
from petsc4py import PETSc
import numpy as np
import ufl
from dolfinx import fem, mesh
from dolfinx.fem.petsc import assemble_matrix, assemble_vector, apply_lifting, set_bc

# Function space and mesh
msh = mesh.create_unit_square(MPI.COMM_WORLD, 16, 16)
V = fem.functionspace(msh, ("Lagrange", 1))

# Boundary conditions
marker=lambda x: np.isclose(x[0], 0.0) | np.isclose(x[0], 2.0)
facets = mesh.locate_entities_boundary(msh,dim=(msh.topology.dim - 1),marker=marker)
dofs = fem.locate_dofs_topological(V=V, entity_dim=1, entities=facets)
bc = fem.dirichletbc(value=ScalarType(0), dofs=dofs, V=V)

# Variational problem
u = ufl.TrialFunction(V)
v = ufl.TestFunction(V)
x = ufl.SpatialCoordinate(msh)
f = 10 * ufl.exp(-((x[0] - 0.5) ** 2 + (x[1] - 0.5) ** 2) / 0.02)
g = ufl.sin(5 * x[0])
a = ufl.inner(ufl.grad(u), ufl.grad(v)) * ufl.dx
L0 = ufl.inner(f, v) * ufl.dx
L1 = ufl.inner(g, v) * ufl.ds

# Build ksp solver
A = assemble_matrix(fem.form(a), bcs=[bc])
A.assemble()
ksp = PETSc.KSP().create(MPI.COMM_WORLD)
ksp.setOperators(A)
opts = PETSc.Options()
opts["ksp_type"] = "preonly"
opts["pc_type"] = "lu"
opts["pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()

# Assemble RHS
b0 = assemble_vector(fem.form(L0))
# b0.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE) ## GHOST B0
b1 = assemble_vector(fem.form(L1))
# b1.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE) ## GHOST B1
b = b0
b.axpy(1.0, b1)
b.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE) ## GHOST B-prelift

# b = assemble_vector(fem.form(L0 + L1)) # THIS DOES WORK
apply_lifting(b, [fem.form(a)], bcs=[[bc]])
b.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE) ## GHOST B-postlift
set_bc(b, [bc])

# Solve problem
uh = fem.Function(V)
ksp.solve(b, x = uh.x.petsc_vec)
uh.x.scatter_forward()

checksum = msh.comm.allreduce(fem.assemble_scalar(fem.form(uh*uh* ufl.dx)), op=MPI.SUM)
if MPI.COMM_WORLD.rank == 0:
    print(checksum)  # Should be invariant to mpirun -n [nprocs]
    assert np.isclose(checksum, 0.16199852713788038), "Result does not match mpirun -n 1 reference value"

Cases for which this does work:

  • mpirun -n 1 (duh)
  • By defining b = assemble_vector(fem.form(L0 + L1)) and only updating ghosts after lifting. This is the ‘conventional’ way.
  • By updating the ghost values only on b1 (not b0) and updating the ghosts of b after applying the lifting (i.e., by uncommenting the line ## GHOST B1). This does not make sense to me: why update twice? Why only b1?

Cases for which this confusingly does not work:

  • Only updating the ghost values of b and only after lifting. This is what I would have expected to work. The process-local b contributions are all finished, and only then do we cross-communicate. Why does this not work?
  • Updating the ghost values on b0 and updating the ghosts of b after applying the lifting (i.e., by uncommenting the line ## GHOST B0) → Why does it work for updating only b1 but not for updating only b0?
  • Any other combination of ghost updating.

Things get quite weird with mixed problems. For Taylor-Hood-like nested matrices, the above appears to be roughly consistent. But for an RT/DG pair, I don’t need any ghost updating…?

See the script below, toggling between CASE=1 and CASE=2:

from mpi4py import MPI
from petsc4py import PETSc
import numpy as np
import ufl
from dolfinx import fem, mesh
from dolfinx.fem.petsc import assemble_matrix, assemble_vector
import dolfinx
from basix.ufl import element
from dolfinx.la.petsc import create_vector_wrap

dtype = dolfinx.default_scalar_type
xdtype = dolfinx.default_real_type

CASE = 2 # 1: Taylor-Hood, 2: RT / DG

# Function space and mesh
msh = mesh.create_unit_square(MPI.COMM_WORLD, 32, 32, mesh.CellType.quadrilateral)
k = 2
if CASE == 1:
    Ve = element("Lagrange", msh.basix_cell(), k, shape=(msh.geometry.dim,))
    Qe = element("Lagrange", msh.basix_cell(), k-1)
else:
    Ve = element("RT", msh.basix_cell(), k)
    Qe = element("Discontinuous Lagrange", msh.basix_cell(), k-1)
V = fem.functionspace(msh, Ve)
W = fem.functionspace(msh, Qe)
Q = ufl.MixedFunctionSpace(V, W)

# Variational problem
sigma, u = ufl.TrialFunctions(Q)
tau, v = ufl.TestFunctions(Q)
x = ufl.SpatialCoordinate(msh)
f = 10 * ufl.exp(-((x[0] - 0.5) ** 2 + (x[1] - 0.5) ** 2) / 0.02)
g = ufl.sin(5 * x[0])
a = [
    [
        ufl.inner(sigma, tau) * ufl.dx + ufl.inner(ufl.grad(sigma), ufl.grad(tau)) * ufl.dx,
        ufl.inner(u, ufl.div(tau)) * ufl.dx,
    ],
    [ufl.inner(ufl.div(sigma), v) * ufl.dx, None],
]
L = [ufl.ZeroBaseForm((tau,)), ufl.inner(f, v) * ufl.dx]

# Build ksp solver
A = assemble_matrix(fem.form(a), kind="nest", bcs=[])
A.assemble()
ksp = PETSc.KSP().create(MPI.COMM_WORLD)
ksp.setOperators(A)
opts = PETSc.Options()
opts["ksp_type"] = "preonly"
opts["pc_type"] = "lu"
opts["pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()

# Assemble RHS
b0 = assemble_vector(fem.form(L), kind="nest")
### THIS IS NOT NEEDED FOR RT/DG BUT NEEDED FOR TAYLOR-HOOD
# for b_sub in b0.getNestSubVecs():
#     b_sub.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
b1 = assemble_vector(fem.form(L), kind="nest")
### THIS IS NOT NEEDED FOR RT/DG BUT NEEDED FOR TAYLOR-HOOD
# for b_sub in b1.getNestSubVecs():
#     b_sub.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
b = b0
b.axpy(1, b1)

checksum_b = b.norm()
if MPI.COMM_WORLD.rank == 0:
    print(checksum_b)
    checksum_b_n1 = 0.10988664576189694 if CASE == 1 else 0.05523947839520888
    assert np.isclose(checksum_b, checksum_b_n1, atol=1E-14), "b vector does not match mpirun -n 1 reference value"

sigh = fem.Function(V)
uh = fem.Function(W)
sol_vec = PETSc.Vec().createNest([create_vector_wrap(sigh.x),create_vector_wrap(uh.x)])
ksp.solve(b, x = sol_vec)

### THIS IS NOT NEEDED FOR RT/DG BUT NEEDED FOR TAYLOR-HOOD
# nrmu = msh.comm.allreduce(fem.assemble_scalar(fem.form(uh*uh* ufl.dx)), op=MPI.SUM)
# uh.x.scatter_forward()
# nrmu2 = msh.comm.allreduce(fem.assemble_scalar(fem.form(uh*uh* ufl.dx)), op=MPI.SUM)
# if MPI.COMM_WORLD.rank == 0:
#     print("Vector norms before and after scatter_forward:", nrmu, nrmu2)

checksum_uh = msh.comm.allreduce(fem.assemble_scalar(fem.form(uh*uh* ufl.dx)), op=MPI.SUM)
if MPI.COMM_WORLD.rank == 0:
    print(checksum_uh)  # Should be invariant to mpirun -n [nprocs]
    checksum_uh_n1 = 11.790596222773235 if CASE == 1 else 3.660723376309132
    assert np.isclose(checksum_uh, checksum_uh_n1, atol=1E-14), "uh result does not match mpirun -n 1 reference value"

It’s tricky to follow what you’re doing with all the cases you’ve presented, but for the RT/DG case the right-hand side is just an inner product with your DG test function. You’re therefore computing cell integrals, testing only against v, which belongs to a function space whose basis is defined cell-wise, so there is no information to be shared between processes. If, for example, your right-hand side had some kind of facet integral, then you would have to update ghost information, as you’d have contributions from cells neighbouring a process boundary (in addition to using something like GhostMode.shared_facet to make sure that the mesh is ghosted appropriately); see the sketch below.
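For instance, a minimal sketch of such a facet-integral case (the ghost mode, the DG space, and the dS term here are illustrative, not taken from your scripts):

from mpi4py import MPI
from petsc4py import PETSc
import ufl
from dolfinx import fem, mesh
from dolfinx.fem.petsc import assemble_vector

# Ghost a layer of cells across process boundaries so that interior-facet
# (dS) integrals can be assembled in parallel.
msh = mesh.create_unit_square(MPI.COMM_WORLD, 16, 16,
                              ghost_mode=mesh.GhostMode.shared_facet)
W = fem.functionspace(msh, ("Discontinuous Lagrange", 1))
v = ufl.TestFunction(W)

# A facet integral: a facet on a process boundary contributes to the DoFs of
# both neighbouring cells, one of which is a ghost cell, so contributions
# land in ghost entries even for a cell-wise (DG) basis.
b = assemble_vector(fem.form(ufl.avg(v) * ufl.dS))
# Hence the reverse-add ghost update is needed here.
b.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)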

In your Taylor-Hood case, (some of) the DoFs of both the pressure and velocity function spaces are shared between neighbouring cells. So you need to update contributions to those DoFs from integrations performed on cells neighbouring a process boundary.
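Concretely, that accumulation is what the commented loops in your second script do; a minimal sketch (assuming b is the assembled "nest" right-hand-side vector):

# Each block of a "nest" vector has its own ghost layout, so accumulate
# ghost contributions block by block with a reverse-add update.
for b_sub in b.getNestSubVecs():
    b_sub.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)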


Ah! Right :person_facepalming: That makes a lot of sense. Thanks @nate .

Any thoughts on the question in the first post?

Let’s start with some PETSc/MPI stuff.
The array in a PETSc vector can be accessed in multiple ways, with and without ghosts:

from mpi4py import MPI
import dolfinx

mesh = dolfinx.mesh.create_unit_square(MPI.COMM_WORLD, 10, 10)

V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))
vec = dolfinx.fem.Function(V)
vec.interpolate(lambda x: x[0] + x[1])

print(mesh.comm.rank, "Local size", len(vec.x.petsc_vec.array), flush=True)
with vec.x.petsc_vec.localForm() as loc:
    print(mesh.comm.rank, "Ghosted array", len(loc.array), flush=True)

Running on two processes yields:

0 Local size 57
0 Ghosted array 75
1 Local size 64
1 Ghosted array 75
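
For reference, the two ghost-update directions that appear throughout this thread look like this (shown on the vector from the snippet above, purely to illustrate the two call patterns):

from petsc4py import PETSc

# Reverse add: send each process's ghost entries to the owning process and
# add them onto the owned values there (this is what you do after assembly,
# which also writes into ghost entries).
vec.x.petsc_vec.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)

# Forward insert: copy the owned values out to the ghost copies on the other
# processes (this is what uh.x.scatter_forward() does after a solve).
vec.x.petsc_vec.ghostUpdate(addv=PETSc.InsertMode.INSERT, mode=PETSc.ScatterMode.FORWARD)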

Thus, for a smaller problem (where we can easily inspect the arrays), let’s consider some operations:

from mpi4py import MPI
import dolfinx.fem.petsc
import ufl

N = 3
mesh = dolfinx.mesh.create_unit_square(MPI.COMM_WORLD, N, N)

V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))

x = ufl.SpatialCoordinate(mesh)
f = 1 + x[0] + x[1] ** 2
L0 = dolfinx.fem.form(f * ufl.TestFunction(V) * ufl.dx)
L1 = dolfinx.fem.form(ufl.inner(ufl.grad(f), ufl.grad(ufl.TestFunction(V))) * ufl.dx)


vec0 = dolfinx.fem.Function(V)
vec1 = dolfinx.fem.Function(V)

dolfinx.fem.petsc.assemble_vector(vec0.x.petsc_vec, L0)
dolfinx.fem.petsc.assemble_vector(vec1.x.petsc_vec, L1)

loc0_size = vec0.x.petsc_vec.getLocalSize()
loc1_size = vec1.x.petsc_vec.getLocalSize()

with vec0.x.petsc_vec.localForm() as loc0:
    vec0_old = loc0.array.copy()

with vec0.x.petsc_vec.localForm() as loc0:
    print("L0:", loc0.array[:loc0_size], loc0.array[loc0_size:])
with vec1.x.petsc_vec.localForm() as loc1:
    print("L1:", loc1.array[:loc1_size], loc1.array[loc1_size:])


vec0.x.petsc_vec.axpy(1.0, vec1.x.petsc_vec)
with vec0.x.petsc_vec.localForm() as loc0:
    print("L0 post axpy:", loc0.array[:loc0_size], loc0.array[loc0_size:])
    diff = vec0_old - loc0.array
    print(diff[:loc0_size], diff[loc0_size:])

As you can see from the two-process output below, axpy does not update the ghost values (the ghost part of the difference is identically zero), so those contributions should be accumulated with a reverse-add ghostUpdate prior to the axpy:

L0: [0.09670782 0.03569959 0.1100823  0.19958848 0.05483539 0.12654321
 0.05853909 0.09804527 0.04804527] [0.16265432 0.         0.        ]
L1: [-0.11111111  0.12962963  0.22222222 -0.22222222 -0.24074074  0.22222222
 -0.25925926  0.42592593  0.12962963] [-0.2962963  0.         0.       ]
L0 post axpy: [-0.01440329  0.16532922  0.33230453 -0.02263374 -0.18590535  0.34876543
 -0.20072016  0.52397119  0.1776749 ] [0.16265432 0.         0.        ]
[ 0.11111111 -0.12962963 -0.22222222  0.22222222  0.24074074 -0.22222222
  0.25925926 -0.42592593 -0.12962963] [0. 0. 0.]
L0: [0.04248971 0.07098765 0.19958848 0.09156379 0.07397119 0.11522634
 0.03569959] [0.02335391 0.10401235 0.08569959 0.         0.        ]
L1: [-0.24074074 -0.44444444 -0.22222222 -0.44444444  0.07407407  0.55555556
  0.12962963] [0.12962963 0.03703704 0.42592593 0.         0.        ]
L0 post axpy: [-0.19825103 -0.37345679 -0.02263374 -0.35288066  0.14804527  0.67078189
  0.16532922] [0.02335391 0.10401235 0.08569959 0.         0.        ]
[ 0.24074074  0.44444444  0.22222222  0.44444444 -0.07407407 -0.55555556
 -0.12962963] [0. 0. 0. 0. 0.]

I would probably use a third vector to do the accumulation and the lifting to get this right.
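Something along these lines (a sketch reusing the names from your first script; the duplicate-and-zero steps are my addition and untested):

# Assemble each contribution and accumulate its ghost entries into the
# owned values first.
b0 = assemble_vector(fem.form(L0))
b0.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
b1 = assemble_vector(fem.form(L1))
b1.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)

# Third vector with the same layout; zero the full local form so no stale
# ghost values can be accumulated later on.
b = b0.duplicate()
with b.localForm() as loc:
    loc.set(0.0)

# axpy only touches the owned entries, which are now complete in b0 and b1.
b.axpy(1.0, b0)
b.axpy(1.0, b1)

# Lifting writes into owned and ghost entries, so accumulate once more
# before setting the boundary values.
apply_lifting(b, [fem.form(a)], bcs=[[bc]])
b.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
set_bc(b, [bc])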


Aah that’s the crux. Great, thanks, I can sleep easy now.

That’s indeed what I ended up doing without understanding why. :+1:

axpy doesn’t update ghost values, but why not simply use axpy within the localForm context and accumulate only once?

b0 = assemble_vector(fem.form(L0))
b1 = assemble_vector(fem.form(L1))

with b0.localForm() as lf0, b1.localForm() as lf1:
    lf0.axpy(1.0, lf1)

apply_lifting(b0, [fem.form(a)], bcs=[[bc]])
b0.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
set_bc(b0, [bc])

Ah, that sounds like a very neat solution. Then ghostUpdate only has to be called once, irrespective of how many vectors are added together…?

Yes. I’m not aware of any caveats.

Except that if you’re adding the assembled vectors into a vector whose ghost values have been updated from the owned values (i.e., ghostUpdate(addv=PETSc.InsertMode.INSERT, mode=PETSc.ScatterMode.FORWARD) was called on it), those ghost values will erroneously be accumulated onto the owned values by the subsequent ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE).
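A common way to guard against that when reusing a vector (a sketch, again with the names from the first script; the explicit zeroing is the point):

# Zero the whole local form (owned and ghost entries) before assembling into
# a reused vector, so the later reverse-add only accumulates the fresh
# assembly and lifting contributions.
with b.localForm() as loc:
    loc.set(0.0)

assemble_vector(b, fem.form(L0))
apply_lifting(b, [fem.form(a)], bcs=[[bc]])
b.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
set_bc(b, [bc])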

Btw, here are the PETSc docs on the local form: Vectors and Parallel Data — PETSc v3.24.2-217-g2906e86ba216 documentation
