Collapsing mixed space takes unreasonably long


For various projects I have noticed that collapsing a mixed function space that includes a vector component takes unreasonably long: it is by orders of magnitude the slowest line in the entire code. For a simple 3D geometry with higher-order spaces, the execution time becomes intractable.

Here is an MWE illustrating the issue:

from dolfinx import mesh, fem
from basix.ufl import element, mixed_element
from mpi4py import MPI
import time

nx: int = 5  # Number of elements in x
ny: int = 5  # Number of elements in y
nz: int = 5  # Number of elements in z
Pu: int = 4  # Polynomial order velocity
Pp: int = 4  # Polynomial order pressure

# Create unit cube hexahedral mesh
start = time.time()
domain = mesh.create_unit_cube(MPI.COMM_WORLD, nx, ny, nz, mesh.CellType.hexahedron)
Ve = element("Lagrange", domain.basix_cell(), Pu, shape=(domain.geometry.dim,))
Qe = element("Lagrange", domain.basix_cell(), Pp)
W_el = mixed_element([Ve, Qe])
W = fem.functionspace(domain, W_el)
print(f"Meshing and creating mixed space: {time.time()-start}"); start = time.time()

V, WV_map = W.sub(0).collapse()
print(f"Collapsing V: {time.time()-start}"); start = time.time()

Q, WQ_map = W.sub(1).collapse()
print(f"Collapsing Q: {time.time()-start}")

with output:

  • Meshing and creating mixed space: 0.05801892280578613
  • Collapsing V: 4.216069936752319
  • Collapsing Q: 0.002038717269897461

It is striking that collapsing Q is roughly 2000 times quicker than collapsing V.
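As a quick sanity check on that ratio, using the timings printed above:

```python
# Ratio of the two collapse timings reported above
v_time = 4.216069936752319    # Collapsing V
q_time = 0.002038717269897461  # Collapsing Q
ratio = v_time / q_time
print(round(ratio))  # 2068, i.e. roughly 2000x
```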

It also scales strangely; for nx=ny=nz=10 (double the elements in each direction, so a factor of 8 more DOFs) I get:

  • Meshing and creating mixed space: 0.06880497932434082
  • Collapsing V: 161.0486958026886
  • Collapsing Q: 0.0062062740325927734

(A factor of 3 for Q, but a factor of 40 for V! They now differ by a factor of roughly 25,000.)
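That growth (8x the DOFs, ~40x the time) is consistent with a super-linear algorithm somewhere in the collapse path. As a generic illustration only (this is not DOLFINx's actual implementation), a list-based DOF lookup scales quadratically with problem size while a dict-based lookup stays linear:

```python
import time

def build_map_list(sub_dofs, parent_dofs):
    # Quadratic: list.index() rescans the parent list for every sub DOF
    return [parent_dofs.index(d) for d in sub_dofs]

def build_map_dict(sub_dofs, parent_dofs):
    # Linear: build a position dict once, then O(1) lookups
    pos = {d: i for i, d in enumerate(parent_dofs)}
    return [pos[d] for d in sub_dofs]

n = 10_000
parent = list(range(n))
sub = parent[::2]  # every other parent DOF belongs to the subspace

t0 = time.time(); m1 = build_map_list(sub, parent); t1 = time.time()
m2 = build_map_dict(sub, parent); t2 = time.time()
assert m1 == m2  # both produce the same sub-to-parent DOF map
print(f"list: {t1 - t0:.4f}s, dict: {t2 - t1:.4f}s")
```

Doubling `n` roughly quadruples the list-based time but only doubles the dict-based one, which is the kind of blow-up the timings above suggest.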

Any thoughts? I hope I am making a novice mistake…



Thanks for reporting this. This seems like a regression. I’ve made an issue to keep track of it: Regression in collapsing blocked subspace · Issue #3282 · FEniCS/dolfinx · GitHub

Happy I could help. Great to see this has been picked up so quickly!

Does that mean my best course of action is to be a little patient for a potential bugfix in a future minor release?

Yes, or run your code in parallel, as indicated by my follow-up comment in the issue. Using 2 processes instead of 1 gave a 10x improvement for one of the examples.