Assemble bilinear form from a product of linear forms

Philipp_Weder · April 14, 2025, 9:48am

I want to solve a PDE, with the following variational formulation

a(u, v) + b_1(u) b_2(v) = L(v) \quad \quad \forall v \in V,

where a(u,v) = \int_\Omega \nabla u \cdot \nabla v dx, and

b_1(u) = \int_{\gamma} u ds\\ b_2(v) = \int_\Omega \ell v dx,

where \ell is some known function and \gamma is a marked interior boundary.

How can one assemble the corresponding matrix in an efficient way?

I know how to assemble the vectors corresponding to b_1 and b_2, but not how to create the matrix for their outer product.

dokken · April 14, 2025, 6:35pm

I would suggest an approach like what is suggested in: https://link.springer.com/content/pdf/10.1007/s00366-025-02107-1.pdf
Below I have made an alternate approach, based of the comments in:
[petsc-users] Outer Product of two vectors

# Usage of PETSc Python-context to assemble to outer product of two vectors matrix free.
# Author: Jørgen S. Dokken
# SPDX-license License-Identifier: MIT
from mpi4py import MPI
from petsc4py import PETSc
import dolfinx.fem.petsc
import ufl
import numpy as np


class CustomOperator:
    def __init__(self, form1, form2):
        self.form1 = dolfinx.fem.form(form1)
        self.form2 = dolfinx.fem.form(form2)
        self.b1 = dolfinx.fem.petsc.create_vector(self.form1)
        self.b2 = dolfinx.fem.petsc.create_vector(self.form2)

        self._assemble_vectors()

    def _assemble_vectors(self):
        with self.b1.localForm() as loc1, self.b2.localForm() as loc2:
            loc1.set(0)
            loc2.set(0)
        dolfinx.fem.petsc.assemble_vector(self.b1, self.form1)
        self.b1.ghostUpdate(PETSc.InsertMode.ADD, PETSc.ScatterMode.REVERSE)
        self.b1.ghostUpdate(PETSc.InsertMode.INSERT, PETSc.ScatterMode.FORWARD)

        dolfinx.fem.petsc.assemble_vector(self.b2, self.form2)
        self.b2.ghostUpdate(PETSc.InsertMode.ADD, PETSc.ScatterMode.REVERSE)
        self.b2.ghostUpdate(PETSc.InsertMode.INSERT, PETSc.ScatterMode.FORWARD)

    def mult(self, mat, X, Y):
        # Reassemble vectors, in case of coefficients or other time dependecy
        self._assemble_vectors()

        # outer(b1, b2) X = dot(b2, X) b1
        val = self.b2.dot(X)
        self.b1.copy(Y)
        Y.scale(val)

    def getDiagonal(self, mat, vec):
        vec.pointwiseMult(self.b1, self.b2)


mesh = dolfinx.mesh.create_unit_square(MPI.COMM_WORLD, 15, 7)

V = dolfinx.fem.functionspace(mesh, ("Lagrange", 2))
u = ufl.TrialFunction(V)
v = ufl.TestFunction(V)

x = ufl.SpatialCoordinate(mesh)
form1 = dolfinx.fem.form(u * ufl.ds)
form2 = dolfinx.fem.form(v * ufl.dx)

assembler = CustomOperator(form1, form2)


sp = dolfinx.cpp.la.SparsityPattern(
    mesh.comm,
    [V.dofmap.index_map, V.dofmap.index_map],
    [V.dofmap.index_map_bs, V.dofmap.index_map_bs],
)
sp.finalize()
A = dolfinx.cpp.la.petsc.create_matrix(mesh.comm, sp, "python")
A.setPythonContext(assembler)

# x = ufl.SpatialCoordinate(mesh)
# f = x[0] * v * ufl.dx

# # Set up matrix free solver
# ksp = PETSc.KSP().create(mesh.comm)
# ksp.setOperators(A)
# ksp.setType("cg")
# ksp.getPC().setType("jacobi")
# ksp.setErrorIfNotConverged(True)

# uh = dolfinx.fem.Function(V)
# b = dolfinx.fem.petsc.assemble_vector(dolfinx.fem.form(f))
# b.ghostUpdate(PETSc.InsertMode.ADD, PETSc.ScatterMode.REVERSE)
# ksp.solve(b, uh.x.petsc_vec)
# print(uh.x.array)


#  Verification
c = dolfinx.fem.petsc.create_vector(dolfinx.fem.form(form1))
with c.localForm() as loc:
    loc.set(1)
d = dolfinx.fem.petsc.create_vector(dolfinx.fem.form(form1))
A.mult(c, d)


A_numpy = np.outer(assembler.b1.array, assembler.b2.array)
d_numpy = np.dot(A_numpy, c.array)
np.testing.assert_allclose(d.array, d_numpy)
print(d.array, d_numpy)

Note that with such an approach, one cannot use a direct solver, as the full dense matrix is newer actually formed.

Philipp_Weder · April 15, 2025, 7:39am

dokken:

class CustomOperator:
    def __init__(self, form1, form2):
        self.form1 = dolfinx.fem.form(form1)
        self.form2 = dolfinx.fem.form(form2)
        self.b1 = dolfinx.fem.petsc.create_vector(self.form1)
        self.b2 = dolfinx.fem.petsc.create_vector(self.form2)

        self._assemble_vectors()

    def _assemble_vectors(self):
        with self.b1.localForm() as loc1, self.b2.localForm() as loc2:
            loc1.set(0)
            loc2.set(0)
        dolfinx.fem.petsc.assemble_vector(self.b1, self.form1)
        self.b1.ghostUpdate(PETSc.InsertMode.ADD, PETSc.ScatterMode.REVERSE)
        self.b1.ghostUpdate(PETSc.InsertMode.INSERT, PETSc.ScatterMode.FORWARD)

        dolfinx.fem.petsc.assemble_vector(self.b2, self.form2)
        self.b2.ghostUpdate(PETSc.InsertMode.ADD, PETSc.ScatterMode.REVERSE)
        self.b2.ghostUpdate(PETSc.InsertMode.INSERT, PETSc.ScatterMode.FORWARD)

    def mult(self, mat, X, Y):
        # Reassemble vectors, in case of coefficients or other time dependecy
        self._assemble_vectors()

        # outer(b1, b2) X = dot(b2, X) b1
        val = self.b2.dot(X)
        self.b1.copy(Y)
        Y.scale(val)

    def getDiagonal(self, mat, vec):
        vec.pointwiseMult(self.b1, self.b2)

OK, thanks! The assembly works fine. However, I now have some follow-up questions:

I still have the problem, that the final matrix is of the form A + B_1 B_2^\top, where now we can assemble B := B_1 B_2^\top. However, how can we do the sum now? I tried A.axpy(1.0, B), but this throws the following error:

    A.axpy(1.0, B)
  File "petsc4py/PETSc/Mat.pyx", line 4174, in petsc4py.PETSc.Mat.axpy
petsc4py.PETSc.Error: error code 56
[0] MatAXPY() at /Users/runner/miniforge3/conda-bld/bld/rattler-build_petsc_1741089228/work/src/mat/utils/axpy.c:133
[0] MatAXPY_BasicWithTypeCompare() at /Users/runner/miniforge3/conda-bld/bld/rattler-build_petsc_1741089228/work/src/mat/utils/axpy.c:77
[0] MatAXPY_Basic() at /Users/runner/miniforge3/conda-bld/bld/rattler-build_petsc_1741089228/work/src/mat/utils/axpy.c:238
[0] MatAXPY_Basic_Preallocate() at /Users/runner/miniforge3/conda-bld/bld/rattler-build_petsc_1741089228/work/src/mat/utils/axpy.c:172
[0] MatGetRow() at /Users/runner/miniforge3/conda-bld/bld/rattler-build_petsc_1741089228/work/src/mat/interface/matrix.c:578
[0] No support for this operation for this object type
[0] No method getrow for Mat of type python

How can I apply Dirichlet BCs to the final system? Does anything change due to the new assembler?
Both vectors B_1 and B_2 are very sparse due to the underlying mathematics. Is there any way to take this into account?

Shenyuan_Ma · April 21, 2025, 3:23pm

Dear Philipp,
My guess for your question 1 about implementing A+BB^T is to use the matrix shell technique again as Dokken did to implement the CustomOperator . At initialization of CustomOperator, you pass your A as an argument and save it into the custom operator and as you implement mult method and getDiagonal method, you modify them slightly to take into account the effect of A. I think the reason why A.axpy(1.0,B) failed for you is because the second part B is not assembled as a csr matrix but is implemented implicitly, it is dense, in fact, if you really assemble every element.
for 2 I think you should do something like apply lifting, https://docs.fenicsproject.org/dolfinx/main/python/generated/dolfinx.fem.petsc.html#dolfinx.fem.petsc.apply_lifting so that you enforce the boundary conditions. But I’m not exactly sure …
for 3 I think you don’t really need to worry because if you solve the system with Krylov space method any way the bottle neck of each iteration will be the multiplication A*x and improving the logic of BB^T with sparsity doesnot seem to help a lot.

Otherwise you can also solve (A+BB^T)x=l without introducing matrix shell as Dokken did, which was very elegant anyway. If you use the [Sherman–Morrison formula - Wikipedia](Sherman-Morrison Formula), the inverse of the matrix A+uv^T is
\begin{equation} A^{-1}-\frac{A^{-1}uv^TA^{-1}}{1+v^TA^{-1}u} \end{equation}
so what you should do is to do two linear solves Ay=u and Ax=l and the solution to your system will be \begin{equation} (A+uv^T)z = l\iff z = x-\frac{v^Tx}{1+v^Ty}y \end{equation}. Doing so you only need to use A which you have already. The price to pay is that now you need to do two linear solves.

I hope this can help.

dokken · April 25, 2025, 12:26pm

I am attaching a work in progress on matrix free assemblers:

gist.github.com

https://gist.github.com/jorgensd/7346be612f5e59e6a6e0920f8c7d2d97

script.py


from mpi4py import MPI
from petsc4py import PETSc
import dolfinx.fem.petsc
import ufl
import numpy as np



class NotQuiteMatrixFreeOperator:

This file has been truncated. show original

which shows how to handle DirichletBCs.

Note that the system is matrix-free, in the sense that A is never assembled.
If you combine this new code with the previous one I supplied, you can add the mat-vec product of each these assembly operations into a single vector.

As I stated in the first post, if you know that it is super-sparse, you could go down the route of:

(look for flame matrix).

Note that the matrix-free approach only stores 2xnum_dofs arrays in memory to represent the system, rather than the dense operator A that is the result of the outer product of b1 and b2.

Topic		Replies	Views
Speed up (block) matrix assembly by merging SparsityPattern of the blocks General	8	764	February 22, 2023
Is it Possible to insert a vector into an assembled matrix? dolfinx dolfinx	13	360	May 4, 2024
How to assemble bilinear and linear form in MixedFunctionSpace? variational formulation	12	2216	February 15, 2022
An assembly about optimizing the bilinear matrix A General	6	68	July 14, 2024
What's the best way to assemble a form? dolfinx	14	2403	June 1, 2022

Assemble bilinear form from a product of linear forms

Related topics