FFCX compiler files size limit with petsc.assemble_matrix

mariom121 · April 15, 2026, 9:16am

Dear Fenics community: 'm building a fenicsx / dolfinx code for a physical equation built with sympy, which is very intrincate. The physics of the equation are fine but I need to emulate spectral methods and thus I need to use a high polynomial degree (I then solve an eigenvalue problem via SLEPc). My problem is that for a degree above a high enough number (let’s call it p), my ffcx compiler crashes when doing “dolfinx.fem.petsc.assemble_matrix”, because some cache files get over size limit (2 GB). Please forgive my lack of expertise - I was never very good with computers

I need to get around that somehow, because for low polynomial degrees it works wonderfully, but I need to get past p to increase my accuracy. Something like 1.5*p would work.

Here’s a list of the things I tried and did not work at all (not even reducing my cache files by 1 kB):

Siplifying the symbolic equation with sp.cancel term by term
Assembling “smaller” matrices calling dolfinx.petsc.assemble_matrix on each equation term and then adding with axpy.
Toggling jit compiler options
Writing the weak forms with ufl expressions insteand of fem.Function objects

I would be extremely grateful if anyone could suggest workarounds. Right now, I assemble the symbolic equation separately and then import it. Here’s my code from that point (I paste the ufl-expressions version which is much cleaner)

with open(f'eq_A_m_{m_mode_num}_simp.pk', 'rb') as f:  # Python 3: open(..., 'wb')
    EQN_kerr_A = pickle.load(f)
with open(f'eq_B_m_{m_mode_num}_simp.pk', 'rb') as f:  # Python 3: open(..., 'wb')
    EQN_kerr_B = pickle.load(f)   

#Sort the equation terms to make SLEPc understand my problem
M_eq, C_eq, K_eq = extract_qep_coeffs(EQN_kerr_A)
M_eq_b, C_eq_b, K_eq_b = extract_qep_coeffs(EQN_kerr_B)

ufl_mapping = {
    'sigma': x[0],
    'x_coord': x[1],
    'M': M_mass_num,
    'lambda_': lambda_param,                    
    'psi': psi_gr,
    'Q': Q_ch,
    'm': m_mode_num,
    'psi_b': psi_em,
    'a': a_rot_num,
}

# Translate the SymPy expressions directly into UFL operator expressions

M_op = sympy_to_ufl(M_eq, ufl_mapping)
C_op = sympy_to_ufl(C_eq, ufl_mapping)
K_op = sympy_to_ufl(K_eq, ufl_mapping)

M_op_b = sympy_to_ufl(M_eq_b, ufl_mapping)
C_op_b = sympy_to_ufl(C_eq_b, ufl_mapping)
K_op_b = sympy_to_ufl(K_eq_b, ufl_mapping)

# Build the weak forms by taking the inner product with the test functions
M_form_gr = ufl.inner(M_op, v_test_gr)*dx_meas
M_form_em = ufl.inner(M_op_b, v_test_em)*dx_meas

C_form_gr = ufl.inner(C_op, v_test_gr)*dx_meas
C_form_em = ufl.inner(C_op_b, v_test_em)*dx_meas

K_form_gr = ufl.inner(K_op, v_test_gr)*dx_meas
K_form_em = ufl.inner(K_op_b, v_test_em)*dx_meas

M_mat = assemble_matrix(fem.form(M_form_gr+M_form_em))
M_mat.assemble()

C_mat = assemble_matrix(fem.form(C_form_gr+C_form_em))
C_mat.assemble()

K_mat = assemble_matrix(fem.form(K_form_gr+K_form_em))
K_mat.assemble()

A = [ ]

A.append(K_mat)
A.append(C_mat)
A.append(M_mat)

return A

dokken · April 15, 2026, 10:44am

We would need a reproducible example to be able to do some proper guidance here.
One thing you can try is to fix the quadrature degree to something specific (say 15), as shown in:

mariom121 · April 15, 2026, 11:06am

Thank you very much for your reply!

I here paste a reproducible example:

You will need the files containing the physical equations, I uploaded them to Google Drive and can be downloaded here (corrected version):
https://drive.google.com/drive/folders/1IhCmBS0manIEjtggFl-zs-NgmShIMoPZ?usp=sharing

I also attempted to reduce the quadrature degree, but I lost too much precission in exchange. Thanks a lot anyways for suggesting it.

import numpy as np

import os

from mpi4py import MPI

from petsc4py import PETSc

###################################################

#AUXILIARY FUNCTIONS

###################################################

######## SLEPC EIGENSOLVER #############

def eig_solver(A,ME,target,pep_params):

from slepc4py import SLEPc

from petsc4py import PETSc

from mpi4py import MPI

from dolfinx import fem

nev, tol, max_it = pep_params

comm = MPI.COMM_WORLD

V_gr, dof_map_gr = ME.sub(0).collapse()

V_em, dof_map_em = ME.sub(1).collapse()

pep = SLEPc.PEP().create(comm)

pep.setOperators(A)

pep.setProblemType(SLEPc.PEP.ProblemType.GENERAL)

pep.setDimensions(nev=nev, ncv=max(15, 2\*nev + 10), mpd=max(15, 2\*nev + 10))

pep.setType(SLEPc.PEP.Type.TOAR)

st = pep.getST()

st.setType(SLEPc.ST.Type.SINVERT)

ksp = st.getKSP()

ksp.setType(PETSc.KSP.Type.PREONLY)

pep.setRefine(ref = SLEPc.PEP.Refine.SIMPLE, npart = 1, tol = 0.1\*tol, its= 3, scheme= SLEPc.PEP.RefineScheme.SCHUR)

pep.setTarget(target)

pep.setWhichEigenpairs(SLEPc.PEP.Which.TARGET_MAGNITUDE)

pep.setTolerances(tol, max_it)



*# Solver*

pep.solve()




nconv = pep.getConverged()



all_eigvals = \[\]

efuns = fem.Function(ME) 

efuns_list_gr = \[\]

efuns_list_em = \[\]



error_list = \[\]



for i in range(nconv):   

    eigval = pep.getEigenpair(i, efuns.x.petsc_vec)

    error = pep.computeError(i)

    error_list.append(error) 

    efun_vals_gr = efuns.x.array\[dof_map_gr\]

    efun_vals_em = efuns.x.array\[dof_map_em\]

    efuns_list_gr.append(efun_vals_gr)

    efuns_list_em.append(efun_vals_em)

    all_eigvals.append(eigval)

    

return all_eigvals, efuns_list_gr, efuns_list_em, error_list

################ SYMPY SORTING ################

def extract_qep_coeffs(eqn):

import sympy as sp

omega = sp.symbols('omega', complex=True)



expanded_eqn = eqn



K_matrix_term = expanded_eqn.subs(omega, 0)

C_matrix_term = sp.diff(expanded_eqn, omega).subs(omega, 0)

M_matrix_term = sp.diff(expanded_eqn, omega, 2).subs(omega, 0) / 2



return (M_matrix_term), (C_matrix_term), (K_matrix_term)

######### SYMPY CONVERSION TO UFL #########

def sympy_to_ufl(expr, mapping):

import sympy as sp

import ufl

"""

Recursively walks a SymPy expression tree and builds a UFL expression.



Args:

    expr: The SymPy expression node.

    mapping: A dictionary linking SymPy variable/function names to UFL objects or numbers.

"""

*# 1. Base Cases: Numbers and Constants*



if expr == sp.I:

    return 1**j**

    

if isinstance(expr, (int, float)):

    return expr

    

if isinstance(expr, sp.Number):

    return float(expr)



*# 2. Base Cases: Symbols and Trial Functions*

if isinstance(expr, sp.Symbol):

    if expr.name in mapping:

        return mapping\[expr.name\]

    raise ValueError(**f**"Symbol '{expr.name}' not found in mapping dictionary.")

    

if isinstance(expr, sp.core.function.AppliedUndef):

    func_name = expr.func.\__name_\_

    if func_name in mapping:

        return mapping\[func_name\]

    raise ValueError(**f**"Function '{func_name}' not found in mapping dictionary.")



*# 3. Recursive Cases: Arithmetic Operations*

if isinstance(expr, sp.Add):

    return sum(sympy_to_ufl(arg, mapping) for arg in expr.args)



if isinstance(expr, sp.Mul):

    result = 1

    for arg in expr.args:

        result = result \* sympy_to_ufl(arg, mapping)

    return result



if isinstance(expr, sp.Pow):

    base = sympy_to_ufl(expr.args\[0\], mapping)

    exponent = sympy_to_ufl(expr.args\[1\], mapping)

    return base \*\* exponent



if isinstance(expr, sp.exp):

    base = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.exp(argum)



if isinstance(expr, sp.sin):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.sin(argum)



if isinstance(expr, sp.log):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.ln(argum)



if isinstance(expr, sp.cot):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.cos(argum)/ufl.sin(argum)



if isinstance(expr, sp.cos):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.cos(argum)



if isinstance(expr, sp.tan):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.tan(argum)



if isinstance(expr, sp.cot):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.cot(argum)



if isinstance(expr, sp.Abs):

    argum = sympy_to_ufl(expr.args\[0\], mapping)

    return ufl.algebra.Abs(argum)



*# 4. Calculus: Derivatives*

if isinstance(expr, sp.Derivative):

    ufl_func = sympy_to_ufl(expr.args\[0\], mapping)

    

    for var, count in expr.variable_count:

        if var.name == 'sigma':

            dim = 0

        elif var.name == 'x_coord':

            dim = 1

        else:

            raise ValueError(**f**"Unknown derivative variable: {var.name}")

        

        *# Apply the UFL derivative .dx(dim) the correct number of times*

        for \_ in range(count):

            ufl_func = ufl_func.dx(dim)

    return ufl_func



*# 5. Math Functions*



if isinstance(expr, sp.conjugate):

    return ufl.conj(sympy_to_ufl(expr.args\[0\], mapping))



*# Catch-all for unsupported operations*

raise NotImplementedError(**f**"SymPy node type {type(expr)} is not implemented in the AST walker.")

############ PROBLEM ASSEMBLER ########################

def assembler(phys_params, poly_degree, eps, debug_mode):

import dill as pickle

import numpy as np

import ufl

from mpi4py import MPI

from dolfinx.fem.petsc import assemble_matrix

from dolfinx import mesh

from dolfinx import fem

import basix

from basix import CellType, ElementFamily, LagrangeVariant

from dolfinx.fem.petsc import assemble_matrix



*#Physical parameters*

M_mass_num, alpha, q_ch = phys_params\[:\] 

Q_ch = q_ch\*M_mass_num

a_rot_num = alpha\*M_mass_num

r_p = M_mass_num + np.sqrt(M_mass_num\*\*2 - a_rot_num\*\*2 - Q_ch\*\*2)

lambda_param = r_p

*#Mesh creation*

N_x = 1

N_y = 1

*# general cell type*

cell_type = CellType.quadrilateral

*# spectral element*

ufl_nodal_element = basix.ufl.element(ElementFamily.P, 

                                    cell_type, 

                                    poly_degree, 

                                    LagrangeVariant.gll_warped)



*# mesh: we "chop-off" by a small eps to avoid coordinate singularities*

domain = mesh.create_rectangle(MPI.COMM_WORLD, 

                            \[\[eps, eps\], \[1-eps, 1-eps\]\], 

                            \[N_x, N_y\],

                            mesh.CellType.quadrilateral)



tdim = domain.topology.dim

fdim = tdim - 1

domain.topology.create_connectivity(fdim, tdim)



*#Function space: needs to be mixed since we're solving a coupled PDE system*

ME_element = basix.ufl.mixed_element(\[ufl_nodal_element, ufl_nodal_element\])

ME = fem.functionspace(domain, ME_element)

*#Trial and test functions*

v_test_gr, v_test_em = ufl.TestFunctions(ME)

psi_gr, psi_em = ufl.TrialFunctions(ME)



*# Spatial coordinate*

x = ufl.SpatialCoordinate(domain)



*# create the quadrature rule and measure*

metadata = {

    "quadrature_rule": "GLL", 

    "quadrature_degree": 2 \* poly_degree - 1

}

dx_meas = ufl.dx(domain=domain, metadata=metadata)



*# Getting back the objects:*

with open(**f**'eq_A.pk', 'rb') as f:  *# Python 3: open(..., 'wb')*

    EQN_kerr_A = pickle.load(f)

with open(**f**'eq_B.pk', 'rb') as f:  *# Python 3: open(..., 'wb')*

    EQN_kerr_B = pickle.load(f)   



*# Extract into mass, stiffness...*

M_eq, C_eq, K_eq = extract_qep_coeffs(EQN_kerr_A)

M_eq_b, C_eq_b, K_eq_b = extract_qep_coeffs(EQN_kerr_B)



*#Map for the UFL converter*

ufl_mapping = {

    'sigma': x\[0\],

    'x_coord': x\[1\],

    'M': M_mass_num,

    'lambda\_': lambda_param,         

    'psi': psi_gr,

    'Q': Q_ch,

    'm': 0,

    'psi_b': psi_em,

    'a': a_rot_num,

}



*# Translate the SymPy expressions directly into UFL operator expressions*

M_op = sympy_to_ufl(M_eq, ufl_mapping)

C_op = sympy_to_ufl(C_eq, ufl_mapping)

K_op = sympy_to_ufl(K_eq, ufl_mapping)



M_op_b = sympy_to_ufl(M_eq_b, ufl_mapping)

C_op_b = sympy_to_ufl(C_eq_b, ufl_mapping)

K_op_b = sympy_to_ufl(K_eq_b, ufl_mapping)



*# Build the weak forms by taking the inner product with the test functions*

M_form_gr = ufl.inner(M_op, v_test_gr)\*dx_meas

M_form_em = ufl.inner(M_op_b, v_test_em)\*dx_meas



C_form_gr = ufl.inner(C_op, v_test_gr)\*dx_meas

C_form_em = ufl.inner(C_op_b, v_test_em)\*dx_meas



K_form_gr = ufl.inner(K_op, v_test_gr)\*dx_meas

K_form_em = ufl.inner(K_op_b, v_test_em)\*dx_meas



jit_options = {

"cffi_extra_compile_args": \["-O1", "-fno-inline"\]}



M_1 = assemble_matrix(fem.form(M_form_gr+M_form_em, jit_options = jit_options))

M_1.assemble()

M_mat = M_1



C_1 = assemble_matrix(fem.form(C_form_gr+C_form_em, jit_options = jit_options))

C_1.assemble()

C_mat = C_1



K_1 = assemble_matrix(fem.form(K_form_gr+K_form_em, jit_options = jit_options))

K_1.assemble()

K_mat = K_1



A = \[ \]



A.append(K_mat)

A.append(C_mat)

A.append(M_mat)



return A, ME

###########################################

############ INITIALIZE

###########################################

phys_params = [0.5, 0, 0]

poly_degree = 18

eps = 2.5e-7

debug_mode = 0

A, ME = assembler(phys_params, poly_degree, eps, debug_mode = 0)

for i, mat in enumerate(A): # PEP matrices

viewer = PETSc.Viewer().createASCII("/dev/null")

arr = mat.convert("dense").getDenseArray()

print(**f**"A\_{i}: max={np.max(np.abs(arr))**:.2e**}, "

    **f**"has_nan={np.any(np.isnan(arr))}, "

    **f**"has_inf={np.any(np.isinf(arr))}")

#PARAMETERS FOR SLEPC

target = 0.85

nev = 12

tol = 1e-7

max_it = 2000

pep_params = [nev, tol, max_it]

all_eigvals, efuns_list_gr, efuns_list_em, error_list = eig_solver(A,ME,target,pep_params)

print(“\n— Sorted eigenvalues —”)

print(f"Found {len(all_eigvals)} physical eigenpairs after sorting.")

for i, eigval in enumerate(all_eigvals):

print(**f**"QNM {i}: {eigval.real**:.6e**} + {eigval.imag**:.6e**}i, error {error_list\[i\]}")

dokken · April 15, 2026, 7:38pm

Please note that your code formatting is not correct.
Please use

```python

#add code here
```

dokken · April 15, 2026, 7:42pm

Are you saying that using

and

gives you a loss of precision?

Edit
Right, you are trying to do a spectral method. At what polynomial degree does your problem currently fail?
I guess you could try using the tensor product element from basix (ffcx/demo/MassAction.py at 54893260dbf19a75fb15f3560e46a28c8cb73c58 · FEniCS/ffcx · GitHub)
i.e. if one considers:

"""Use tensor product elements to test high order assembly"""


import basix.ufl
import ufl

P = 13
cell_type = basix.CellType.hexahedron
element = basix.ufl.wrap_element(
    basix.create_tp_element(basix.ElementFamily.P, cell_type, P, basix.LagrangeVariant.gll_warped)
)
c_el = basix.ufl.blocked_element(
    basix.ufl.wrap_element(
        basix.create_tp_element(basix.ElementFamily.P, cell_type, 1, basix.LagrangeVariant.gll_warped)
    ), (3,)
)

mesh = ufl.Mesh(c_el)
V = ufl.FunctionSpace(mesh, element)
x = ufl.SpatialCoordinate(mesh)

v = ufl.TestFunction(V)
u = ufl.TrialFunction(V)
a = ufl.inner(u, v) * ufl.dx

w = ufl.Coefficient(V)
L = ufl.action(a, w)

which can be compiled with:

python3 -m ffcx mwe.py --sum_factorization

you get the following (seems like the quadrature weights does not use tensor products).

void tabulate_tensor_integral_54a463af79997cceebe0a7740ccc42adfa5a9ff6_hexahedron(double* restrict A,
                                    const double* restrict w,
                                    const double* restrict c,
                                    const double* restrict coordinate_dofs,
                                    const int* restrict entity_local_index,
                                    const uint8_t* restrict quadrature_permutation,
                                    void* custom_data)
{
// Quadrature rules
static const double weights_843[4913] = {...,
  {0.002514143263164119, 0.5310179651539475, -0.006306936264939108, 0.008673660364852594, -0.01084094025003951, 0.01309572661657724, -0.01564139383462328, 0.01869763014686473, -0.02256350657595175, 0.0277055433285752, -0.03492972494947359, 0.04577976819603195, -0.06362108595724893, 0.09732164084031523, -0.1793847948263935, 0.5884823047483395}}}};
static const double FE_TF1[1][1][17][2] = {{{{-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0}}}};
static const double FE_TF2[1][1][17][2] = {{{{0.9952877376572087, 0.004712262342791262},
  {0.9753377608843838, 0.0246622391156161},
  {0.940119576863493, 0.05988042313650699},
  {0.8907570019484007, 0.1092429980515993},
  {0.8288355796083454, 0.1711644203916546},
  {0.7563452685432385, 0.2436547314567615},
  {0.6756158817269382, 0.3243841182730618},
  {0.5892420907479239, 0.4107579092520761},
  {0.5, 0.5},
  {0.4107579092520761, 0.5892420907479239},
  {0.3243841182730618, 0.6756158817269382},
  {0.2436547314567615, 0.7563452685432385},
  {0.1711644203916547, 0.8288355796083453},
  {0.1092429980515995, 0.8907570019484006},
  {0.05988042313650704, 0.9401195768634929},
  {0.0246622391156161, 0.9753377608843838},
  {0.004712262342791262, 0.9952877376572087}}}};
for (int iq0 = 0; iq0 < 17; ++iq0)
{
  for (int iq1 = 0; iq1 < 17; ++iq1)
  {
    for (int iq2 = 0; iq2 < 17; ++iq2)
    {
      // ------------------------ 
      // Section: Jacobian
      // Inputs: FE_TF2, FE_TF1, coordinate_dofs
      // Outputs: J0_c5, J0_c3, J0_c1, J0_c4, J0_c8, J0_c2, J0_c0, J0_c7, J0_c6
      double J0_c0 = 0.0;
      double J0_c4 = 0.0;
      double J0_c8 = 0.0;
      double J0_c5 = 0.0;
      double J0_c7 = 0.0;
      double J0_c3 = 0.0;
      double J0_c6 = 0.0;
      double J0_c1 = 0.0;
      double J0_c2 = 0.0;
      {
        for (int ic0 = 0; ic0 < 2; ++ic0)
        {
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c0 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3] * (FE_TF1[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1] * FE_TF2[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c4 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3 + 1] * (FE_TF2[0][0][iq0][ic0] * FE_TF1[0][0][iq1][ic1] * FE_TF2[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c8 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3 + 2] * (FE_TF2[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1] * FE_TF1[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c5 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3 + 1] * (FE_TF2[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1] * FE_TF1[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c7 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3 + 2] * (FE_TF2[0][0][iq0][ic0] * FE_TF1[0][0][iq1][ic1] * FE_TF2[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c3 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3 + 1] * (FE_TF1[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1] * FE_TF2[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c6 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3 + 2] * (FE_TF1[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1] * FE_TF2[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c1 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3] * (FE_TF2[0][0][iq0][ic0] * FE_TF1[0][0][iq1][ic1] * FE_TF2[0][0][iq2][ic2]);
            }
          }
          for (int ic1 = 0; ic1 < 2; ++ic1)
          {
            for (int ic2 = 0; ic2 < 2; ++ic2)
            {
              J0_c2 += coordinate_dofs[(4 * ic0 + 2 * ic1 + ic2) * 3] * (FE_TF2[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1] * FE_TF1[0][0][iq2][ic2]);
            }
          }
        }
      }
      // ------------------------ 
      // ------------------------ 
      // Section: Intermediates
      // Inputs: J0_c5, J0_c3, J0_c1, J0_c4, J0_c8, J0_c2, J0_c0, J0_c7, J0_c6
      // Outputs: fw0
      double fw0 = 0;
      {
        double sv_843_0 = J0_c4 * J0_c8;
        double sv_843_1 = J0_c5 * J0_c7;
        double sv_843_2 = -sv_843_1;
        double sv_843_3 = sv_843_0 + sv_843_2;
        double sv_843_4 = J0_c0 * sv_843_3;
        double sv_843_5 = J0_c3 * J0_c8;
        double sv_843_6 = J0_c5 * J0_c6;
        double sv_843_7 = -sv_843_6;
        double sv_843_8 = sv_843_5 + sv_843_7;
        double sv_843_9 = -J0_c1;
        double sv_843_10 = sv_843_8 * sv_843_9;
        double sv_843_11 = sv_843_4 + sv_843_10;
        double sv_843_12 = J0_c3 * J0_c7;
        double sv_843_13 = J0_c4 * J0_c6;
        double sv_843_14 = -sv_843_13;
        double sv_843_15 = sv_843_12 + sv_843_14;
        double sv_843_16 = J0_c2 * sv_843_15;
        double sv_843_17 = sv_843_11 + sv_843_16;
        double sv_843_18 = fabs(sv_843_17);
        fw0 = sv_843_18 * weights_843[289 * iq0 + 17 * iq1 + iq2];
      }
      // ------------------------ 
      // ------------------------ 
      // Section: Tensor Computation
      // Inputs: FE_TF0, fw0
      // Outputs: A
      {
        for (int j0 = 0; j0 < 16; ++j0)
        {
          for (int j1 = 0; j1 < 16; ++j1)
          {
            for (int j2 = 0; j2 < 16; ++j2)
            {
              for (int i0 = 0; i0 < 16; ++i0)
              {
                for (int i1 = 0; i1 < 16; ++i1)
                {
                  for (int i2 = 0; i2 < 16; ++i2)
                  {
                    A[4096 * (256 * i0 + 16 * i1 + i2) + (256 * j0 + 16 * j1 + j2)] += fw0 * (FE_TF0[0][0][iq0][i0] * FE_TF0[0][0][iq1][i1] * FE_TF0[0][0][iq2][i2]) * (FE_TF0[0][0][iq0][j0] * FE_TF0[0][0][iq1][j1] * FE_TF0[0][0][iq2][j2]);
                  }
                }
              }
            }
          }
        }
      }
      // ------------------------ 
    }
  }
}

}

dokken · April 15, 2026, 8:17pm

Here is a minimal example setting up a grid with tensor factors

"""Use tensor product elements to test high order assembly"""

from mpi4py import MPI
import basix.ufl
import ufl
import dolfinx
import numpy as np
P = 15
cell_type = basix.CellType.quadrilateral
element = basix.ufl.wrap_element(
    basix.create_tp_element(basix.ElementFamily.P, cell_type, P, basix.LagrangeVariant.gll_warped)
)
c_el = basix.ufl.blocked_element(
    basix.ufl.wrap_element(
        basix.create_tp_element(basix.ElementFamily.P, cell_type, 1, basix.LagrangeVariant.gll_warped)
    ), (2,)
)

eps = 2.5e-7
nodes = np.array([[eps, eps],[1-eps, eps],[eps, 1-eps],[1-eps, 1-eps]], dtype=np.float64)
connectivity = np.array([[0, 1, 2, 3]], dtype=np.int64)
mesh = dolfinx.mesh.create_mesh(MPI.COMM_WORLD, cells=connectivity, x=nodes, e=c_el)

V = dolfinx.fem.functionspace(mesh, element)

v = ufl.TestFunction(V)
u = ufl.TrialFunction(V)
a = ufl.inner(u, v) * ufl.dx

J = dolfinx.fem.assemble_matrix(dolfinx.fem.form(a, form_compiler_options={"sum_factorization": True}))

which then in turn gives the kernel

void tabulate_tensor_integral_b6ab92259bf3ae452d1f0a4bb20de89952479a6b_quadrilateral(double _Complex* restrict A,
                                    const double _Complex* restrict w,
                                    const double _Complex* restrict c,
                                    const double* restrict coordinate_dofs,
                                    const int* restrict entity_local_index,
                                    const uint8_t* restrict quadrature_permutation,
                                    void* custom_data)
{
// Quadrature rules
static const double weights_06e[289] = {0.0001457851328577798, 0.0003348133780675438, 0.0005133696660845011, 0.0006754512570311628, 0.0008158284885834105, 0.0009299859235246961, 0.001014253485508105, 0.001065922421123082, 0.001083331928713395, 0.001065922421123082, 0.001014253485508105, 0.0009299859235246961, 0.0008158284885834105, 0.0006754512570311648, 0.0005133696660845011, 0.0003348133780675425, 0.0001457851328577798, 0.0003348133780675438, 0.0007689398495960406, 0.001179016191361836, 0.001551256377474323, 0.001873649849143541, 0.002135826352844399, 0.002329357109624001, 0.00244802113616285, 0.002488004198444595, 0.00244802113616285, 0.002329357109624001, 0.002135826352844399, 0.001873649849143541, 0.001551256377474328, 0.001179016191361836, 0.0007689398495960377, 0.0003348133780675438, 0.0005133696660845011, 0.001179016191361836, 0.001807786630155326, 0.002378542856058727, 0.002872869068033629, 0.003274864546640099, 0.003571605437217603, 0.0037535531002175, 0.003814859167051171, 0.0037535531002175, 0.003571605437217603, 0.003274864546640099, 0.002872869068033629, 0.002378542856058734, 0.001807786630155326, 0.001179016191361832, 0.0005133696660845011, 0.0006754512570311628, 0.001551256377474323, 0.002378542856058727, 0.003129498815699238, 0.003779894200001008, 0.004308808098277363, 0.004699236323384317, 0.004938628686833649, 0.005019290367182374, 0.004938628686833649, 0.004699236323384317, 0.004308808098277363, 0.003779894200001008, 0.003129498815699247, 0.002378542856058727, 0.001551256377474317, 0.0006754512570311628, 0.0008158284885834105, 0.001873649849143541, 0.002872869068033629, 0.003779894200001008, 0.004565459520715274, 0.005204296182472571, 0.005675866063309457, 0.005965010702568406, 0.006062436084608166, 0.005965010702568406, 0.005675866063309457, 0.005204296182472571, 0.004565459520715274, 0.003779894200001019, 0.002872869068033629, 0.001873649849143534, 0.0008158284885834105, 0.0009299859235246961, 0.002135826352844399, 0.003274864546640099, 0.004308808098277363, 0.005204296182472571, 0.005932524126433432, 0.006470079945179132, 0.006799684081509751, 0.006910742024642279, 0.006799684081509751, 0.006470079945179132, 0.005932524126433432, 0.005204296182472571, 0.004308808098277375, 0.003274864546640099, 0.00213582635284439, 0.0009299859235246961, 0.001014253485508105, 0.002329357109624001, 0.003571605437217603, 0.004699236323384317, 0.005675866063309457, 0.006470079945179132, 0.007056344585348723, 0.007415814697373854, 0.007536935784334625, 0.007415814697373854, 0.007056344585348723, 0.006470079945179132, 0.005675866063309457, 0.004699236323384331, 0.003571605437217603, 0.002329357109623993, 0.001014253485508105, 0.001065922421123082, 0.00244802113616285, 0.0037535531002175, 0.004938628686833649, 0.005965010702568406, 0.006799684081509751, 0.007415814697373854, 0.007793597231627863, 0.007920888568662418, 0.007793597231627863, 0.007415814697373854, 0.006799684081509751, 0.005965010702568406, 0.004938628686833664, 0.0037535531002175, 0.00244802113616284, 0.001065922421123082, 0.001083331928713395, 0.002488004198444595, 0.003814859167051171, 0.005019290367182374, 0.006062436084608166, 0.006910742024642279, 0.007536935784334625, 0.007920888568662418, 0.008050258930825227, 0.007920888568662418, 0.007536935784334625, 0.006910742024642279, 0.006062436084608166, 0.005019290367182388, 0.003814859167051171, 0.002488004198444586, 0.001083331928713395, 0.001065922421123082, 0.00244802113616285, 0.0037535531002175, 0.004938628686833649, 0.005965010702568406, 0.006799684081509751, 0.007415814697373854, 0.007793597231627863, 0.007920888568662418, 0.007793597231627863, 0.007415814697373854, 0.006799684081509751, 0.005965010702568406, 0.004938628686833664, 0.0037535531002175, 0.00244802113616284, 0.001065922421123082, 0.001014253485508105, 0.002329357109624001, 0.003571605437217603, 0.004699236323384317, 0.005675866063309457, 0.006470079945179132, 0.007056344585348723, 0.007415814697373854, 0.007536935784334625, 0.007415814697373854, 0.007056344585348723, 0.006470079945179132, 0.005675866063309457, 0.004699236323384331, 0.003571605437217603, 0.002329357109623993, 0.001014253485508105, 0.0009299859235246961, 0.002135826352844399, 0.003274864546640099, 0.004308808098277363, 0.005204296182472571, 0.005932524126433432, 0.006470079945179132, 0.006799684081509751, 0.006910742024642279, 0.006799684081509751, 0.006470079945179132, 0.005932524126433432, 0.005204296182472571, 0.004308808098277375, 0.003274864546640099, 0.00213582635284439, 0.0009299859235246961, 0.0008158284885834105, 0.001873649849143541, 0.002872869068033629, 0.003779894200001008, 0.004565459520715274, 0.005204296182472571, 0.005675866063309457, 0.005965010702568406, 0.006062436084608166, 0.005965010702568406, 0.005675866063309457, 0.005204296182472571, 0.004565459520715274, 0.003779894200001019, 0.002872869068033629, 0.001873649849143534, 0.0008158284885834105, 0.0006754512570311648, 0.001551256377474328, 0.002378542856058734, 0.003129498815699247, 0.003779894200001019, 0.004308808098277375, 0.004699236323384331, 0.004938628686833664, 0.005019290367182388, 0.004938628686833664, 0.004699236323384331, 0.004308808098277375, 0.003779894200001019, 0.003129498815699256, 0.002378542856058734, 0.001551256377474322, 0.0006754512570311648, 0.0005133696660845011, 0.001179016191361836, 0.001807786630155326, 0.002378542856058727, 0.002872869068033629, 0.003274864546640099, 0.003571605437217603, 0.0037535531002175, 0.003814859167051171, 0.0037535531002175, 0.003571605437217603, 0.003274864546640099, 0.002872869068033629, 0.002378542856058734, 0.001807786630155326, 0.001179016191361832, 0.0005133696660845011, 0.0003348133780675425, 0.0007689398495960377, 0.001179016191361832, 0.001551256377474317, 0.001873649849143534, 0.00213582635284439, 0.002329357109623993, 0.00244802113616284, 0.002488004198444586, 0.00244802113616284, 0.002329357109623993, 0.00213582635284439, 0.001873649849143534, 0.001551256377474322, 0.001179016191361832, 0.0007689398495960349, 0.0003348133780675425, 0.0001457851328577798, 0.0003348133780675438, 0.0005133696660845011, 0.0006754512570311628, 0.0008158284885834105, 0.0009299859235246961, 0.001014253485508105, 0.001065922421123082, 0.001083331928713395, 0.001065922421123082, 0.001014253485508105, 0.0009299859235246961, 0.0008158284885834105, 0.0006754512570311648, 0.0005133696660845011, 0.0003348133780675425, 0.0001457851328577798};
// Precomputed values of basis functions and precomputations
// FE* dimensions: [permutation][entities][points][dofs]
static const double FE_TF0[1][1][17][16] = {{{{0.5310179651539434, 0.002514143263163935, 0.5884823047483387, -0.1793847948263871, 0.09732164084031325, -0.06362108595724753, 0.04577976819603085, -0.03492972494947335, 0.02770554332857452, -0.02256350657595092, 0.01869763014686472, -0.01564139383462264, 0.01309572661657683, -0.01084094025003909, 0.008673660364852842, -0.006306936264939094},
  {-0.1279643740930969, -0.003235687285707545, 0.825276019527148, 0.4016052578050424, -0.1536075508159097, 0.09097174773003154, -0.06269562416475484, 0.04675394028868764, -0.03658362653399562, 0.02953633333506332, -0.02433380644796956, 0.02027471838678221, -0.0169273774697448, 0.01398560334933942, -0.01117514119019908, 0.008119567579283157},
  {0.04506847775101008, 0.002870613041430765, -0.1492565638184692, 0.9323133249205195, 0.236219340013864, -0.1018419901991895, 0.06319363895274897, -0.04483303014671571, 0.0341130338495718, -0.02707010731090384, 0.02205080070300053, -0.01823148475814101, 0.01514049433573509, -0.01246336839301106, 0.009934557924711289, -0.007207736866162262},
  {-0.01229554036123029, -0.001507932790634431, 0.03528793100807117, -0.07476331401308049, 0.9884767900729207, 0.08944352429381004, -0.04196240826588792, 0.02688770122719832, -0.01944302869602156, 0.01497995718888914, -0.01197709268493107, 0.009780881755744455, -0.008054536715507075, 0.006592445566051996, -0.00523505612868456, 0.003789678543292256},
  {-0.003582234564908075, -0.000739774109720467, 0.00971234526043426, -0.01662921035517776, 0.03524923230686862, 0.9980461370946736, -0.03297846445881575, 0.01652159743212438, -0.01085112819950925, 0.007953873646228531, -0.006174506294946994, 0.004948273372891968, -0.004024346742651605, 0.003266432998359481, -0.002579816169174316, 0.001861588783323459},
  {0.01139891269024134, 0.003672131136338329, -0.03003351311013579, 0.04707080384154232, -0.07679234987354021, 0.1709490109412144, 0.9626287483056376, -0.1258109160332868, 0.06647574244713014, -0.04453582436867183, 0.03295929578191496, -0.02566796309725861, 0.02049754424751977, -0.01644016709043109, 0.01288579309629175, -0.009257248914506041},
  {-0.01455857691406362, -0.006990023863127238, 0.03773300702719259, -0.05645426657445396, 0.08274433532555152, -0.1348356252830777, 0.3233027662258365, 0.8858877501707699, -0.1860605036384325, 0.1027438196202606, -0.06994074440908214, 0.05206092659087528, -0.04046019944613791, 0.03190210201316634, -0.02473958412712687, 0.01766481728184915},
  {0.01476942609043248, 0.01029569794319792, -0.03788738811311202, 0.055139045998481, -0.07636531428674226, 0.1100722191402691, -0.1820572953998159, 0.4820557925418167, 0.7741419307270463, -0.2134701215720267, 0.1222953984338127, -0.08426281753751952, 0.06278183258882505, -0.04827648106171816, 0.03687498906184535, -0.02610691455479212},
  {-0.01309204101562505, -0.01309204101562488, 0.03335544248995533, -0.04768643702639053, 0.06383014874598097, -0.08627598963282575, 0.1243076624552219, -0.2105115081265031, 0.6360727221101858, 0.6360727221101865, -0.2105115081265024, 0.1243076624552216, -0.08627598963282571, 0.06383014874598111, -0.04768643702639108, 0.03335544248995551},
  {0.01029569794319817, 0.01476942609043239, -0.02610691455479202, 0.03687498906184507, -0.04827648106171822, 0.06278183258882525, -0.08426281753751999, 0.1222953984338136, -0.2134701215720279, 0.7741419307270471, 0.4820557925418165, -0.182057295399816, 0.1100722191402696, -0.07636531428674288, 0.05513904599848175, -0.03788738811311237},
  {-0.006990023863127413, -0.01455857691406353, 0.01766481728184911, -0.02473958412712672, 0.03190210201316653, -0.0404601994461381, 0.05206092659087558, -0.06994074440908271, 0.1027438196202611, -0.1860605036384328, 0.8858877501707697, 0.3233027662258372, -0.1348356252830784, 0.08274433532555206, -0.05645426657445486, 0.03773300702719305},
  {0.003672131136338363, 0.01139891269024114, -0.00925724891450588, 0.01288579309629153, -0.01644016709043094, 0.02049754424751953, -0.0256679630972587, 0.03295929578191473, -0.04453582436867162, 0.06647574244712956, -0.1258109160332856, 0.962628748305638, 0.1709490109412131, -0.07679234987353983, 0.04707080384154243, -0.0300335131101359},
  {-0.0007397741097204323, -0.003582234564907856, 0.001861588783323354, -0.002579816169174141, 0.003266432998359122, -0.004024346742651431, 0.004948273372891867, -0.006174506294946454, 0.007953873646227993, -0.01085112819950874, 0.01652159743212317, -0.03297846445881378, 0.998046137094674, 0.03524923230686623, -0.01662921035517689, 0.009712345260433847},
  {-0.001507932790634611, -0.01229554036123084, 0.003789678543292389, -0.005235056128685074, 0.006592445566052121, -0.008054536715507425, 0.009780881755745248, -0.01197709268493141, 0.01497995718889036, -0.01944302869602261, 0.0268877012271997, -0.04196240826589059, 0.08944352429381575, 0.988476790072919, -0.07476331401308516, 0.03528793100807356},
  {0.002870613041430889, 0.04506847775100992, -0.007207736866162262, 0.009934557924711391, -0.01246336839301107, 0.01514049433573512, -0.01823148475814129, 0.02205080070300094, -0.02707010731090406, 0.03411303384957206, -0.044833030146716, 0.06319363895274903, -0.1018419901991901, 0.2362193400138683, 0.9323133249205199, -0.1492565638184722},
  {-0.003235687285707548, -0.1279643740930907, 0.008119567579282649, -0.01117514119019856, 0.0139856033493392, -0.01692737746974434, 0.0202747183867812, -0.024333806447969, 0.02953633333506164, -0.03658362653399398, 0.0467539402886867, -0.06269562416475231, 0.09097174773002825, -0.1536075508159051, 0.4016052578050343, 0.8252760195271474},
  {0.002514143263164119, 0.5310179651539475, -0.006306936264939108, 0.008673660364852594, -0.01084094025003951, 0.01309572661657724, -0.01564139383462328, 0.01869763014686473, -0.02256350657595175, 0.0277055433285752, -0.03492972494947359, 0.04577976819603195, -0.06362108595724893, 0.09732164084031523, -0.1793847948263935, 0.5884823047483395}}}};
static const double FE_TF1[1][1][17][2] = {{{{-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0},
  {-1.0, 1.0}}}};
static const double FE_TF2[1][1][17][2] = {{{{0.9952877376572087, 0.004712262342791262},
  {0.9753377608843838, 0.0246622391156161},
  {0.940119576863493, 0.05988042313650699},
  {0.8907570019484007, 0.1092429980515993},
  {0.8288355796083454, 0.1711644203916546},
  {0.7563452685432385, 0.2436547314567615},
  {0.6756158817269382, 0.3243841182730618},
  {0.5892420907479239, 0.4107579092520761},
  {0.5, 0.5},
  {0.4107579092520761, 0.5892420907479239},
  {0.3243841182730618, 0.6756158817269382},
  {0.2436547314567615, 0.7563452685432385},
  {0.1711644203916547, 0.8288355796083453},
  {0.1092429980515995, 0.8907570019484006},
  {0.05988042313650704, 0.9401195768634929},
  {0.0246622391156161, 0.9753377608843838},
  {0.004712262342791262, 0.9952877376572087}}}};
for (int iq0 = 0; iq0 < 17; ++iq0)
{
  for (int iq1 = 0; iq1 < 17; ++iq1)
  {
    // ------------------------ 
    // Section: Jacobian
    // Inputs: FE_TF1, coordinate_dofs, FE_TF2
    // Outputs: J0_c3, J0_c1, J0_c0, J0_c2
    double J0_c0 = 0.0;
    double J0_c3 = 0.0;
    double J0_c1 = 0.0;
    double J0_c2 = 0.0;
    {
      for (int ic0 = 0; ic0 < 2; ++ic0)
      {
        for (int ic1 = 0; ic1 < 2; ++ic1)
        {
          J0_c0 += coordinate_dofs[(2 * ic0 + ic1) * 3] * (FE_TF1[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1]);
        }
        for (int ic1 = 0; ic1 < 2; ++ic1)
        {
          J0_c3 += coordinate_dofs[(2 * ic0 + ic1) * 3 + 1] * (FE_TF2[0][0][iq0][ic0] * FE_TF1[0][0][iq1][ic1]);
        }
        for (int ic1 = 0; ic1 < 2; ++ic1)
        {
          J0_c1 += coordinate_dofs[(2 * ic0 + ic1) * 3] * (FE_TF2[0][0][iq0][ic0] * FE_TF1[0][0][iq1][ic1]);
        }
        for (int ic1 = 0; ic1 < 2; ++ic1)
        {
          J0_c2 += coordinate_dofs[(2 * ic0 + ic1) * 3 + 1] * (FE_TF1[0][0][iq0][ic0] * FE_TF2[0][0][iq1][ic1]);
        }
      }
    }
    // ------------------------ 
    // ------------------------ 
    // Section: Intermediates
    // Inputs: J0_c3, J0_c1, J0_c0, J0_c2
    // Outputs: fw0
    double _Complex fw0 = 0;
    {
      double sv_06e_0 = J0_c0 * J0_c3;
      double sv_06e_1 = J0_c1 * J0_c2;
      double sv_06e_2 = -sv_06e_1;
      double sv_06e_3 = sv_06e_0 + sv_06e_2;
      double sv_06e_4 = fabs(sv_06e_3);
      fw0 = sv_06e_4 * weights_06e[17 * iq0 + iq1];
    }
    // ------------------------ 
    // ------------------------ 
    // Section: Tensor Computation
    // Inputs: FE_TF0, fw0
    // Outputs: A
    {
      for (int j0 = 0; j0 < 16; ++j0)
      {
        for (int j1 = 0; j1 < 16; ++j1)
        {
          for (int i0 = 0; i0 < 16; ++i0)
          {
            for (int i1 = 0; i1 < 16; ++i1)
            {
              A[256 * (16 * i0 + i1) + (16 * j0 + j1)] += fw0 * (FE_TF0[0][0][iq0][i0] * FE_TF0[0][0][iq1][i1]) * (FE_TF0[0][0][iq0][j0] * FE_TF0[0][0][iq1][j1]);
            }
          }
        }
      }
    }
    // ------------------------ 
  }
}

}

Turning off tensor product factorization yields massive tables such as:

 FE2_C0_Q06e[1][1][289][256]

mariom121 · April 15, 2026, 8:35pm

Dear Mr J. S. Dokken,

First of all, I am immensely grateful for your help. Please excuse my lackluster code-pasting, I’m still quite new here (I below paste the code correctly)

Exactly! I’m trying to do a spectral method (to solve an eigenvalue problem with SLEPc), based on this post here: Does Fenicsx support spectral element methods? - FEniCS Project

The ffcx c++ file (or at least that’s what I think it is, in fact they are three files called something like libffcx_forms_218d0c58cb3ff7… and with .o, .c, and .so extensions) gets over 2 GB at p => 80. Normally, I use quadrature degree = 2p - 1 (as in the example above), I tried things like int(1.5*p) but there was a significant loss in accuracy.

I will try assembling with tensor product elements - that might be my way out of this (I spent the whole evening battling with sympy and testing different workarounds )

I’ll try and come back here to poste whether it works. Once again, thanks a lot! I leave my code (now hopefully correctly pasted) here

import numpy as np
import os
from mpi4py import MPI
from petsc4py import PETSc

###################################################
#AUXILIARY FUNCTIONS
###################################################

######## SLEPC EIGENSOLVER #############
def eig_solver(A,ME,target,pep_params):
    from slepc4py import SLEPc
    from petsc4py import PETSc
    from mpi4py import MPI
    from dolfinx import fem

    nev, tol, max_it = pep_params
    comm = MPI.COMM_WORLD
    V_gr, dof_map_gr = ME.sub(0).collapse()
    V_em, dof_map_em = ME.sub(1).collapse()
    pep = SLEPc.PEP().create(comm)
    pep.setOperators(A)
    pep.setProblemType(SLEPc.PEP.ProblemType.GENERAL)
    pep.setDimensions(nev=nev, ncv=max(15, 2*nev + 10), mpd=max(15, 2*nev + 10))
    pep.setType(SLEPc.PEP.Type.TOAR)
    st = pep.getST()
    st.setType(SLEPc.ST.Type.SINVERT)
    ksp = st.getKSP()
    ksp.setType(PETSc.KSP.Type.PREONLY)
    pep.setRefine(ref = SLEPc.PEP.Refine.SIMPLE, npart = 1, tol = 0.1*tol, its= 3, scheme= SLEPc.PEP.RefineScheme.SCHUR)
    pep.setTarget(target)
    pep.setWhichEigenpairs(SLEPc.PEP.Which.TARGET_MAGNITUDE)
    pep.setTolerances(tol, max_it)

    # Solver
    pep.solve()


    nconv = pep.getConverged()

    all_eigvals = []
    efuns = fem.Function(ME) 
    efuns_list_gr = []
    efuns_list_em = []

    error_list = []

    for i in range(nconv):   
        eigval = pep.getEigenpair(i, efuns.x.petsc_vec)
        error = pep.computeError(i)
        error_list.append(error) 
        efun_vals_gr = efuns.x.array[dof_map_gr]
        efun_vals_em = efuns.x.array[dof_map_em]
        efuns_list_gr.append(efun_vals_gr)
        efuns_list_em.append(efun_vals_em)
        all_eigvals.append(eigval)
        
    return all_eigvals, efuns_list_gr, efuns_list_em, error_list


################ SYMPY SORTING ################

def extract_qep_coeffs(eqn):
    import sympy as sp
    omega = sp.symbols('omega', complex=True)

    expanded_eqn = eqn

    K_matrix_term = expanded_eqn.subs(omega, 0)
    C_matrix_term = sp.diff(expanded_eqn, omega).subs(omega, 0)
    M_matrix_term = sp.diff(expanded_eqn, omega, 2).subs(omega, 0) / 2
    
    return (M_matrix_term), (C_matrix_term), (K_matrix_term)

######### SYMPY CONVERSION TO UFL #########

def sympy_to_ufl(expr, mapping):
    import sympy as sp
    import ufl
    """
    Recursively walks a SymPy expression tree and builds a UFL expression.
    
    Args:
        expr: The SymPy expression node.
        mapping: A dictionary linking SymPy variable/function names to UFL objects or numbers.
    """
    # 1. Base Cases: Numbers and Constants

    if expr == sp.I:
        return 1j
        
    if isinstance(expr, (int, float)):
        return expr
        
    if isinstance(expr, sp.Number):
        return float(expr)
    
    # 2. Base Cases: Symbols and Trial Functions
    if isinstance(expr, sp.Symbol):
        if expr.name in mapping:
            return mapping[expr.name]
        raise ValueError(f"Symbol '{expr.name}' not found in mapping dictionary.")
        
    if isinstance(expr, sp.core.function.AppliedUndef):
        func_name = expr.func.__name__
        if func_name in mapping:
            return mapping[func_name]
        raise ValueError(f"Function '{func_name}' not found in mapping dictionary.")

    # 3. Recursive Cases: Arithmetic Operations
    if isinstance(expr, sp.Add):
        return sum(sympy_to_ufl(arg, mapping) for arg in expr.args)
    
    if isinstance(expr, sp.Mul):
        result = 1
        for arg in expr.args:
            result = result * sympy_to_ufl(arg, mapping)
        return result
    
    if isinstance(expr, sp.Pow):
        base = sympy_to_ufl(expr.args[0], mapping)
        exponent = sympy_to_ufl(expr.args[1], mapping)
        return base ** exponent
    
    if isinstance(expr, sp.exp):
        base = sympy_to_ufl(expr.args[0], mapping)
        return ufl.exp(argum)

    if isinstance(expr, sp.sin):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.sin(argum)
    
    if isinstance(expr, sp.log):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.ln(argum)
    
    if isinstance(expr, sp.cot):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.cos(argum)/ufl.sin(argum)

    if isinstance(expr, sp.cos):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.cos(argum)

    if isinstance(expr, sp.tan):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.tan(argum)

    if isinstance(expr, sp.cot):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.cot(argum)

    if isinstance(expr, sp.Abs):
        argum = sympy_to_ufl(expr.args[0], mapping)
        return ufl.algebra.Abs(argum)
    
    # 4. Calculus: Derivatives
    if isinstance(expr, sp.Derivative):
        ufl_func = sympy_to_ufl(expr.args[0], mapping)
        
        for var, count in expr.variable_count:
            if var.name == 'sigma':
                dim = 0
            elif var.name == 'x_coord':
                dim = 1
            else:
                raise ValueError(f"Unknown derivative variable: {var.name}")
            
            # Apply the UFL derivative .dx(dim) the correct number of times
            for _ in range(count):
                ufl_func = ufl_func.dx(dim)
        return ufl_func

    # 5. Math Functions

    if isinstance(expr, sp.conjugate):
        return ufl.conj(sympy_to_ufl(expr.args[0], mapping))

    # Catch-all for unsupported operations
    raise NotImplementedError(f"SymPy node type {type(expr)} is not implemented in the AST walker.")


############ PROBLEM ASSEMBLER ########################
def assembler(phys_params, poly_degree, eps, debug_mode):
    import dill as pickle
    import numpy as np
    import ufl
    from mpi4py import MPI
    from dolfinx.fem.petsc import assemble_matrix
    from dolfinx import mesh
    from dolfinx import fem
    import basix
    from basix import CellType, ElementFamily, LagrangeVariant
    from dolfinx.fem.petsc import assemble_matrix

    #Physical parameters
    M_mass_num, alpha, q_ch = phys_params[:] 
    Q_ch = q_ch*M_mass_num
    a_rot_num = alpha*M_mass_num
    r_p = M_mass_num + np.sqrt(M_mass_num**2 - a_rot_num**2 - Q_ch**2)
    lambda_param = r_p
    #Mesh creation
    N_x = 1
    N_y = 1
    # general cell type
    cell_type = CellType.quadrilateral
    # spectral element
    ufl_nodal_element = basix.ufl.element(ElementFamily.P, 
                                        cell_type, 
                                        poly_degree, 
                                        LagrangeVariant.gll_warped)

    # mesh: we "chop-off" by a small eps to avoid coordinate singularities
    domain = mesh.create_rectangle(MPI.COMM_WORLD, 
                                [[eps, eps], [1-eps, 1-eps]], 
                                [N_x, N_y],
                                mesh.CellType.quadrilateral)
	
    tdim = domain.topology.dim
    fdim = tdim - 1
    domain.topology.create_connectivity(fdim, tdim)

    #Function space: needs to be mixed since we're solving a coupled PDE system
    ME_element = basix.ufl.mixed_element([ufl_nodal_element, ufl_nodal_element])
    ME = fem.functionspace(domain, ME_element)
    #Trial and test functions
    v_test_gr, v_test_em = ufl.TestFunctions(ME)
    psi_gr, psi_em = ufl.TrialFunctions(ME)

    # Spatial coordinate
    x = ufl.SpatialCoordinate(domain)

    # create the quadrature rule and measure
    metadata = {
        "quadrature_rule": "GLL", 
        "quadrature_degree": 2 * poly_degree - 1
    }
    dx_meas = ufl.dx(domain=domain, metadata=metadata)

    # Getting back the objects:
    with open(f'eq_A.pk', 'rb') as f:  # Python 3: open(..., 'wb')
        EQN_kerr_A = pickle.load(f)
    with open(f'eq_B.pk', 'rb') as f:  # Python 3: open(..., 'wb')
        EQN_kerr_B = pickle.load(f)   

    # Extract into mass, stiffness...
    M_eq, C_eq, K_eq = extract_qep_coeffs(EQN_kerr_A)
    M_eq_b, C_eq_b, K_eq_b = extract_qep_coeffs(EQN_kerr_B)

    #Map for the UFL converter
    ufl_mapping = {
        'sigma': x[0],
        'x_coord': x[1],
        'M': M_mass_num,
        'lambda_': lambda_param,         
        'psi': psi_gr,
        'Q': Q_ch,
        'm': 0,
        'psi_b': psi_em,
        'a': a_rot_num,
    }

    # Translate the SymPy expressions directly into UFL operator expressions
    M_op = sympy_to_ufl(M_eq, ufl_mapping)
    C_op = sympy_to_ufl(C_eq, ufl_mapping)
    K_op = sympy_to_ufl(K_eq, ufl_mapping)

    M_op_b = sympy_to_ufl(M_eq_b, ufl_mapping)
    C_op_b = sympy_to_ufl(C_eq_b, ufl_mapping)
    K_op_b = sympy_to_ufl(K_eq_b, ufl_mapping)

    # Build the weak forms by taking the inner product with the test functions
    M_form_gr = ufl.inner(M_op, v_test_gr)*dx_meas
    M_form_em = ufl.inner(M_op_b, v_test_em)*dx_meas
    
    C_form_gr = ufl.inner(C_op, v_test_gr)*dx_meas
    C_form_em = ufl.inner(C_op_b, v_test_em)*dx_meas

    K_form_gr = ufl.inner(K_op, v_test_gr)*dx_meas
    K_form_em = ufl.inner(K_op_b, v_test_em)*dx_meas
 
    jit_options = {
    "cffi_extra_compile_args": ["-O1", "-fno-inline"]}
    
    M_1 = assemble_matrix(fem.form(M_form_gr+M_form_em, jit_options = jit_options))
    M_1.assemble()
    M_mat = M_1

    C_1 = assemble_matrix(fem.form(C_form_gr+C_form_em, jit_options = jit_options))
    C_1.assemble()
    C_mat = C_1

    K_1 = assemble_matrix(fem.form(K_form_gr+K_form_em, jit_options = jit_options))
    K_1.assemble()
    K_mat = K_1

    A = [ ]

    A.append(K_mat)
    A.append(C_mat)
    A.append(M_mat)

    return A, ME


###########################################
############   INITIALIZE
###########################################

phys_params = [0.5, 0, 0]

poly_degree = 18
eps = 2.5e-7
debug_mode = 0

A, ME = assembler(phys_params, poly_degree, eps, debug_mode = 0)

for i, mat in enumerate(A):  # PEP matrices
    viewer = PETSc.Viewer().createASCII("/dev/null")
    arr = mat.convert("dense").getDenseArray()
    print(f"A_{i}: max={np.max(np.abs(arr)):.2e}, "
        f"has_nan={np.any(np.isnan(arr))}, "
        f"has_inf={np.any(np.isinf(arr))}")
    
#PARAMETERS FOR SLEPC
target =  0.85
nev = 12
tol = 1e-7
max_it = 2000
pep_params = [nev, tol, max_it]

all_eigvals, efuns_list_gr, efuns_list_em, error_list = eig_solver(A,ME,target,pep_params)        

print("\n--- Sorted eigenvalues ---")
print(f"Found {len(all_eigvals)} physical eigenpairs after sorting.")
for i, eigval in enumerate(all_eigvals):
    print(f"QNM {i}: {eigval.real:.6e} + {eigval.imag:.6e}i, error {error_list[i]}")

dokken · April 15, 2026, 9:29pm

P=80 is way higher than what I would have ever expected to use in DOLFINx
However, the following form is possible to compile on my laptop

"""Use tensor product elements to test high order assembly"""

from mpi4py import MPI
import basix.ufl
import ufl
import dolfinx
import numpy as np
P = 80
cell_type = basix.CellType.quadrilateral
element = basix.ufl.wrap_element(
    basix.create_tp_element(basix.ElementFamily.P, cell_type, P, basix.LagrangeVariant.gll_warped)
)
c_el = basix.ufl.blocked_element(
    basix.ufl.wrap_element(
        basix.create_tp_element(basix.ElementFamily.P, cell_type, 1, basix.LagrangeVariant.gll_warped)
    ), (2,)
)

eps = 2.5e-7
nodes = np.array([[eps, eps],[1-eps, eps],[eps, 1-eps],[1-eps, 1-eps]], dtype=np.float64)
connectivity = np.array([[0, 1, 2, 3]], dtype=np.int64)
mesh = dolfinx.mesh.create_mesh(MPI.COMM_WORLD, cells=connectivity, x=nodes, e=c_el)
print("MESH CREATED")
V = dolfinx.fem.functionspace(mesh, element)

v = ufl.TestFunction(V)
u = ufl.TrialFunction(V)
a = ufl.inner(u, v) * ufl.dx
print("PREFORM")
form = dolfinx.fem.form(a, form_compiler_options={"sum_factorization": True})
print("POSTFORM")
J = dolfinx.fem.assemble_matrix(form)
print("ASSEMBLE")

I’m currently stuck at assembling this, as it has a loop of 82^2\cdot 81^4=289~446~152~004 sum operations in the local tensor,
Running with order 40 is fairly fast though, with form compilation taking 1.29 seconds and assembly 18.17 seconds
while the same operations took 15.88 and 18.4 seconds.

Order 50 compilation takes 3.17 seconds and assembly 97.6 seconds

mariom121 · April 16, 2026, 2:35pm

Thank you very much for your working example! Unfortunately, I’ve been trying to implement it over the day, and got some (sad ) conclusions:

The problem is the generation of a file named libffcx_forms_5b2b2baa72db4ea6cb4e41304c5e40c4338e795f.c / .o /. so, whose size skyrockets if the polynomial degree is large. If it ever gets above 2 GB (for me, that happens when p > 80), then the program crashes, returning this:

  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/dolfinx/jit.py", line 215, in ffcx_jit
    r = ffcx.codegeneration.jit.compile_forms([ufl_object], options=p_ffcx, **p_jit)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/jit.py", line 244, in compile_forms
    raise e
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/jit.py", line 224, in compile_forms
    impl = _compile_objects(
           ^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/jit.py", line 399, in _compile_objects
    ffibuilder.compile(tmpdir=cache_dir, verbose=True, debug=cffi_debug)
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/cffi/api.py", line 727, in compile
    return recompile(self, module_name, source, tmpdir=tmpdir,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/cffi/recompiler.py", line 1581, in recompile
    outputfilename = ffiplatform.compile('.', ext,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/cffi/ffiplatform.py", line 20, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/cffi/ffiplatform.py", line 54, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: LinkError: command '/home/mario/.conda/envs/fenicsxv11-env/bin/gcc' failed with exit code 1

The key that makes tensor product elements work seems to be adding ,form_compiler_options={“sum_factorization”: True} . It makes the size of that file drops from 22 MB to approximately 50 KB (for a degree ~ 20), which made me hopeful that with a degree ~ 80 the size would plummet. However…
If any of my forms (to which I wish to apply form_compiler_options={“sum_factorization”: True} ) contain either:
- second derivatives of trial functions
- Two different test functions (as in a Mixed-Element setup)
- The ufl operator for boundary evaluation ufl.ds

I get the error:

    K_formed = fem.form(K_form, form_compiler_options={"sum_factorization": True})
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/dolfinx/fem/forms.py", line 449, in form
    return _create_form(form)
           ^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/dolfinx/fem/forms.py", line 441, in _create_form
    return _form(form)
           ^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/dolfinx/fem/forms.py", line 361, in _form
    ufcx_form, module, code = jit.ffcx_jit(
                              ^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/dolfinx/jit.py", line 60, in mpi_jit
    return local_jit(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/dolfinx/jit.py", line 215, in ffcx_jit
    r = ffcx.codegeneration.jit.compile_forms([ufl_object], options=p_ffcx, **p_jit)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/jit.py", line 244, in compile_forms
    raise e
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/jit.py", line 224, in compile_forms
    impl = _compile_objects(
           ^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/jit.py", line 349, in _compile_objects
    _, code_body = ffcx.compiler.compile_ufl_objects(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/compiler.py", line 118, in compile_ufl_objects
    code = generate_code(ir, options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/codegeneration.py", line 49, in generate_code
    integral_generator(integral_ir, domain, options)
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/C/integrals.py", line 42, in generator
    parts = ig.generate(domain)
            ^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/integral_generator.py", line 167, in generate
    all_preparts += self.generate_piecewise_partition(rule, cell)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/integral_generator.py", line 309, in generate_piecewise_partition
    return self.generate_partition(arraysymbol, F, "piecewise", None, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/integral_generator.py", line 337, in generate_partition
    vdef = self.backend.definitions.get(mt, tabledata, quadrature_rule, vaccess)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/definitions.py", line 111, in get
    return handler(mt, tabledata, quadrature_rule, access)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/definitions.py", line 200, in _define_coordinate_dofs_lincomb
    FE, tables = self.access.table_access(tabledata, self.entity_type, mt.restriction, iq, ic)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/access.py", line 468, in table_access
    iq_i = quadrature_index.local_index(i)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mario/.conda/envs/fenicsxv11-env/lib/python3.12/site-packages/ffcx/codegeneration/lnodes.py", line 387, in local_index
    assert idx < len(self.symbols)
           ^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

In light of these: do you happen to know any workaround so that, either I avoid completely saving the “long-named” file, or anything similar to ,form_compiler_options={“sum_factorization”: True} that may work in a general setting?

I can probably get around the second derivative issues, by integrating by parts and modifying the math to get rid of boundary terms, but I’m tackling a coupled PDE system, so Mixed-Elements seems mandatory.

EDIT: I just learned you can build a coupled system without mixed elements, which I just tried and it works. Now I only need to get around the derivatives issue!

Once again, thank you very much for your help. I really wish there was a way to avoid so much information being saved into the cache. If you have any further ideas, I would be utterly glad to try them out

dokken · April 16, 2026, 6:04pm

There are several ways one can work around the issues of the mixed system.
For example:

Instead of mixed spaces, use a blocked element. You can see this for the coordinate element in my example:

Could you make some minimal example (similar to what i showed in the post above) for each of this forms. Then I can investigate each of them.

Topic		Replies	Views
Speeding up dolfinx code dolfinx performance	17	480	February 7, 2024
Prblem with the size of a matrix General	7	80	June 5, 2025
PETSc finite element assembly dies for large matrices Linear Algebra	0	954	November 28, 2019
Spack installation problem for Dolfinx 0.9.0 - compile options error? installation dolfinx , mixed-dimensional	15	193	November 26, 2025
Parallelisation, petsc4py, slepc4py, and dolfinx dolfinx	10	1988	June 3, 2022

FFCX compiler files size limit with petsc.assemble_matrix

Related topics