FFCx compilation speed decreased?

Hello,

I noticed a significant slowdown in FFCx compilation speed with the latest nightly dolfinx compared to the docker image from Oct. 2nd. Consider this nonlinear elasticity MWE:

#!/usr/bin/env python3
import time, sys
from mpi4py import MPI
from petsc4py import PETSc
import numpy as np
from dolfinx import fem, mesh
import ufl

start = time.time()

comm = MPI.COMM_WORLD

msh = mesh.create_box(comm, [np.array([0.0, 0.0, 0.0]), np.array([2.0, 1.0, 1.0])], [5, 5, 5],
                      mesh.CellType.tetrahedron, mesh.GhostMode.none)

dim = msh.geometry.dim

P = ufl.VectorElement("CG", msh.ufl_cell(), 1)
V = fem.FunctionSpace(msh, P)

u = fem.Function(V)
du = ufl.TrialFunction(V)
var_u = ufl.TestFunction(V)

F = ufl.Identity(dim) + ufl.grad(u)
C = ufl.variable(F.T*F)

Ic   = ufl.tr(C)
IIc  = 0.5*(ufl.tr(C)**2. - ufl.tr(C*C))
IIIc = ufl.det(C)
Ic_bar   = IIIc**(-1./3.) * Ic
IIc_bar  = IIIc**(-2./3.) * IIc

# dev Mooney-Rivlin + vol Ogden
Psi = 1. * (Ic_bar - 3.) + 1. * (IIc_bar - 3.) + (1./4.) * (-2.*ufl.ln(ufl.sqrt(IIIc)) + ufl.sqrt(IIIc)**(2.) - 1.)
S = 2.*ufl.diff(Psi,C)

# inner virtual work
var_C = ufl.grad(var_u).T * F + F.T * ufl.grad(var_u)
dW_int = ufl.inner(S, 0.5*var_C)*ufl.dx

# jacobian - either via ufl's derivative or directly written as form
#jac = ufl.derivative(dW_int, u, du)

Cmat = 2.*ufl.diff(S,C)
i, j, k, l, m, n = ufl.indices(6)
Ctang = ufl.as_tensor(Cmat[i,j,k,l]*ufl.derivative(C, u, du)[k,l], (i,j))
jac = (ufl.inner(0.5*Ctang,0.5*var_C) + ufl.inner(S,ufl.derivative(0.5*var_C, u, du)))*ufl.dx

# assemble
r = fem.petsc.assemble_vector(fem.form(dW_int))
r.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)

K = fem.petsc.assemble_matrix(fem.form(jac), [])
K.assemble()

if comm.rank == 0: # only proc 0 should print this
    print('Time needed: %.4f s (= %.2f min)' % ( time.time()-start, (time.time()-start)/60. ))
    sys.stdout.flush()

This takes 2.2 seconds with the Oct. 2nd docker image and 21.0 seconds with the latest one (on 1 core).
Is there anything specific to set to speed up FFCx?
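
In case it helps, the form compilation time can be separated from the assembly time like this (a minimal sketch reusing the names from the MWE above; the variable names of the compiled forms are my own):

t0 = time.perf_counter()
# JIT step: UFL processing + FFCx code generation + C compilation
dW_int_compiled = fem.form(dW_int)
jac_compiled = fem.form(jac)
t1 = time.perf_counter()

# actual PETSc assembly, reusing the compiled forms
r = fem.petsc.assemble_vector(dW_int_compiled)
r.ghostUpdate(addv=PETSc.InsertMode.ADD, mode=PETSc.ScatterMode.REVERSE)
K = fem.petsc.assemble_matrix(jac_compiled, [])
K.assemble()
t2 = time.perf_counter()

if comm.rank == 0:
    print('compile: %.4f s, assemble: %.4f s' % (t1 - t0, t2 - t1))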

(Sorry that it’s a bit lengthy, but I wanted to create sufficiently complex forms so that reasonable timings can be measured…)

Thanks!

Best,
Marc


I see the same regression using the conda-forge packages. Definitely worth filing an issue.

dolfinx.__version__ = 0.5.1
Time needed: 1.1894 s (= 0.02 min)
dolfinx.__version__ = 0.6.0
Time needed: 17.6842 s (= 0.29 min)

DOLFINx can be stripped out of this, i.e. the regression can be reproduced with UFL/FFCx alone:


import ffcx
import numpy as np
import ufl


cell = ufl.tetrahedron
msh = ufl.Mesh(ufl.VectorElement("Lagrange", cell, 1))

P = ufl.VectorElement("CG", cell, 1)
V = ufl.FunctionSpace(msh, P)

u = ufl.Coefficient(V)
du = ufl.TrialFunction(V)
var_u = ufl.TestFunction(V)

F = ufl.Identity(len(u)) + ufl.grad(u)
C = ufl.variable(F.T*F)

Ic = ufl.tr(C)
IIc = 0.5*(ufl.tr(C)**2. - ufl.tr(C*C))
IIIc = ufl.det(C)
Ic_bar = IIIc**(-1./3.) * Ic
IIc_bar = IIIc**(-2./3.) * IIc

# dev Mooney-Rivlin + vol Ogden
Psi = 1. * (Ic_bar - 3.) + 1. * (IIc_bar - 3.) + (1./4.) * \
    (-2.*ufl.ln(ufl.sqrt(IIIc)) + ufl.sqrt(IIIc)**(2.) - 1.)
S = 2.*ufl.diff(Psi, C)

# inner virtual work
var_C = ufl.grad(var_u).T * F + F.T * ufl.grad(var_u)
dW_int = ufl.inner(S, 0.5*var_C)*ufl.dx

# jacobian - either via ufl's derivative or directly written as form
#jac = ufl.derivative(dW_int, u, du)

Cmat = 2.*ufl.diff(S, C)
i, j, k, l, m, n = ufl.indices(6)
Ctang = ufl.as_tensor(Cmat[i, j, k, l]*ufl.derivative(C, u, du)[k, l], (i, j))
jac = (ufl.inner(0.5*Ctang, 0.5*var_C) +
       ufl.inner(S, ufl.derivative(0.5*var_C, u, du)))*ufl.dx

forms = [dW_int, jac]

FFCx 0.5.0.post0:

time python3 -m ffcx mwe.py 

real	0m1.111s
user	0m1.099s
sys	0m0.012s

FFCx 0.6.0.dev0:

time python3 -m ffcx mwe5.py 

real	0m24.535s
user	0m22.797s
sys	0m1.728s

@marchirschvogel I definitely support @nate’s suggestion of filing an issue. Feel free to file one against FFCx.

Thanks for reporting. The regression has now been fixed in Use BaseArgument.__eq__ in Argument by michalhabera · Pull Request #147 · FEniCS/ufl · GitHub and is included in UFL release 2023.1.1.post0 (see Release 2023.1.1.post0 · FEniCS/ufl · GitHub).


Great, thanks! Everything is back to the usual speed! Is there any way of suppressing the FFCx compilation output? It seems to be new that it now prints several lines of output when compiling.

Passing form_compiler_options={"verbosity": 40} to dolfinx.fem.form should raise the logging level and silence those messages, see ffcx/options.py at 763c6becffbccb4420fca9d4cd4a71ae2a6d97a6 · FEniCS/ffcx · GitHub.
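
For the MWE above, that would look like this (a minimal sketch; 40 corresponds to logging.ERROR in Python's logging module, so INFO messages should be suppressed):

# raise the form compiler's log level so only errors are reported
opts = {"verbosity": 40}  # 40 == logging.ERROR
dW_int_form = fem.form(dW_int, form_compiler_options=opts)
jac_form = fem.form(jac, form_compiler_options=opts)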

Could you show an example of what output you are seeing?

Using the latest dolfinx docker image, I see the following output for the MWE I posted:

INFO:root:running build_ext
INFO:root:building 'libffcx_forms_5e4c718488748ef8750053c0dd6f510b8168fe35' extension
INFO:root:x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/local/lib/python3.10/dist-packages/ffcx/codegeneration -I/usr/include/python3.10 -c libffcx_forms_5e4c718488748ef8750053c0dd6f510b8168fe35.c -o ./libffcx_forms_5e4c718488748ef8750053c0dd6f510b8168fe35.o -O2
INFO:root:x86_64-linux-gnu-gcc -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 ./libffcx_forms_5e4c718488748ef8750053c0dd6f510b8168fe35.o -L/usr/lib/x86_64-linux-gnu -o ./libffcx_forms_5e4c718488748ef8750053c0dd6f510b8168fe35.cpython-310-x86_64-linux-gnu.so
INFO:root:running build_ext
INFO:root:building 'libffcx_forms_b93406e499dff507843cb5772247f3212eb1ac7d' extension
INFO:root:x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/local/lib/python3.10/dist-packages/ffcx/codegeneration -I/usr/include/python3.10 -c libffcx_forms_b93406e499dff507843cb5772247f3212eb1ac7d.c -o ./libffcx_forms_b93406e499dff507843cb5772247f3212eb1ac7d.o -O2
INFO:root:x86_64-linux-gnu-gcc -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 ./libffcx_forms_b93406e499dff507843cb5772247f3212eb1ac7d.o -L/usr/lib/x86_64-linux-gnu -o ./libffcx_forms_b93406e499dff507843cb5772247f3212eb1ac7d.cpython-310-x86_64-linux-gnu.so
Time needed: 2.2447 s (= 0.04 min)

This is very strange. It seems to happen when you upgrade setuptools from 65.5.1 to 65.6.0: with the newer version we get this output.
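
Until that is resolved, one possible workaround (untested, just a suggestion) is to pin setuptools to the last version that does not trigger the output:

python3 -m pip install "setuptools==65.5.1"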

Issue made at: FFCx logger leaking information with setuptools>65.5.1 · Issue #553 · FEniCS/ffcx · GitHub

I have also started getting these logs intermittently:

INFO:root:running build_ext
INFO:root:building 'libffcx_forms_1b2dbab4b21ad35dd15b8c346ccec06264618fd8' extension
INFO:root:x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/usr/local/lib/python3.10/dist-packages/ffcx/codegeneration -I/usr/include/python3.10 -c libffcx_forms_1b2dbab4b21ad35dd15b8c346ccec06264618fd8.c -o ./libffcx_forms_1b2dbab4b21ad35dd15b8c346ccec06264618fd8.o -O2 -g0
INFO:root:x86_64-linux-gnu-gcc -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 ./libffcx_forms_1b2dbab4b21ad35dd15b8c346ccec06264618fd8.o -L/usr/lib/x86_64-linux-gnu -o ./libffcx_forms_1b2dbab4b21ad35dd15b8c346ccec06264618fd8.cpython-310-x86_64-linux-gnu.so

It only started happening a couple of weeks ago and doesn’t get printed out all the time: sometimes between every line, sometimes not at all, changing between kernel restarts.
Is there a fix for this? Other than occasionally timing out, it doesn’t change the results of the solver.

This was due to a change in logging in CFFI and has been fixed on the main branch: FFCx logger leaking information with setuptools>65.5.1 · Issue #553 · FEniCS/ffcx · GitHub
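
Until that fix makes it into a release, a possible workaround is to raise the level of the root logger before the forms are compiled (a sketch, assuming the messages really go through the root logger, as the INFO:root: prefix suggests):

import logging

# silence the INFO-level build_ext messages emitted via the root logger
logging.getLogger().setLevel(logging.WARNING)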