Speed up repeated evaluation, derivative (and assembly) of form

vincent_scholz · March 1, 2023, 9:45am

Hello everyone,

I’d like to use FEniCS to calculate an integral over a (discretized) domain. The integral in question is the weak form of a PDE (e.g. a diffusion PDE for now), but I do not want to solve a finite element problem. I, instead, want to plug in (discretized) functions for what is usually a Test- and TrialFunction and calculate a single scalar value (i.e. residual) using the weak form. To reemphasize, I do not need a solver, simply the FEniCS function “assembly” to evaluate the integrals in the weak form. Also, I need the derivatives of the weak form w.r.t. the function I plug in.

Here is my problem: I need to evaluate this integral in the order of 10^5 - 10^6ish times, and this evaluation is the bottle neck of my project. I guess my implementation is suboptimal and I have no experience what so ever with speeding up things in FEniCS. I have a minimal working example attached.

Please consider the following two cases:

The form of my integral remains the same, but I need to plug in new numbers for all my (discretized) input functions,
The form of the weak form remains the same and some of the input functions remain constant, while I need to plug in new numbers for other input functions.

Is there any thing that can be done to speed up this process?

Thanks in advance for your help!
Best, Vincent

Here is the minimal working example:

import fenics as df
import numpy as np

#%% Setup
# Basic setup, where there may be multiple function spaces
mesh = df.UnitSquareMesh(8, 8)
V = df.FunctionSpace(mesh, 'CG', 1)
Vc = df.FunctionSpace(mesh, 'DG', 0)

# create some functions for my formula
u = df.Function(V)
v = df.Function(V)
k = df.Function(Vc)  # <-- (potentially) different function space

# initialize some random values
v.vector().set_local(np.random.rand(V.dim()))
k.vector().set_local(np.random.rand(Vc.dim()))

# some arbitrary load
x = df.SpatialCoordinate(mesh)
f_load = 1 * x[0] + 2 * x[1]

#%% Part I have to run multiple times
some_large_number = 50  # <-- This is usually about 10^5

for i in range(some_large_number):
    # calculate new values for my functions (here: just random)
    # case a) some functions remain constant (e.g. v, k), and some change (e.g. u), as shown in this minimal example
    # case b) all functions (u, v, k) change
    u.vector().set_local(np.random.rand(V.dim()))

    # evaluate my formula res(u, v, k), which returns a scalar (!) residual
    # >>> first call: 3e-4 s, successive calls: 2e-4 s
    res_form = k * df.inner(df.grad(u), df.grad(v)) * df.dx - f_load * v * df.dx
    # >>> first call: 1e-3 s, successive calls: 4e-4 s
    res = df.assemble(res_form)

    # calculate the respective derivatives (here: just u for clarity)
    # >>> first call: 1e-3 s, successive calls: 7e-4 s
    dres__du = np.asarray(df.assemble(df.derivative(res_form, u)))

dokken · March 1, 2023, 1:52pm

vincent_scholz:

for i in range(some_large_number):
    # calculate new values for my functions (here: just random)
    # case a) some functions remain constant (e.g. v, k), and some change (e.g. u), as shown in this minimal example
    # case b) all functions (u, v, k) change
    u.vector().set_local(np.random.rand(V.dim()))

    # evaluate my formula res(u, v, k), which returns a scalar (!) residual
    # >>> first call: 3e-4 s, successive calls: 2e-4 s
    res_form = k * df.inner(df.grad(u), df.grad(v)) * df.dx - f_load * v * df.dx
    # >>> first call: 1e-3 s, successive calls: 4e-4 s
    res = df.assemble(res_form)

    # calculate the respective derivatives (here: just u for clarity)
    # >>> first call: 1e-3 s, successive calls: 7e-4 s
    dres__du = np.asarray(df.assemble(df.derivative(res_form, u)))

This could be rewritten as:


res_form = k * df.inner(df.grad(u), df.grad(v)) * df.dx - f_load * v * df.dx
dres_du = df.derivative(res_form, u)
for i in range(some_large_number):
    u.vector().set_local(np.random.rand(V.dim()))
    res = df.assemble(res_form)
    dres__du = np.asarray(df.assemble(dres_du))

A second thing that could speed up your code is to move to DOLFINx, as there the compiled form can be stored outside of the loop, saving you some time looking up cached code.

vincent_scholz · March 21, 2023, 2:44pm

Thank you very much for your help, Dr. Dokken (@dokken).

Short summary of the speed up for the problem per call (for a large number of calls) using the tips provided:

a) From moving the calculation of the forms out of the loop:

form for residual: 2e-4 s >>> 0 s
form for derivative of residual: ??? >>> 0 s (I didn’t time this explicitly)

b) from moving to FEniCSX:

assembly of residual: 4e-4 s >>> 3e-5 s
assembly of derivative: 7e-4 s >>> 6e-5 s

For moving to FEniCSX, here is a minimal working example I used:

import dolfinx as dfx
from mpi4py import MPI
import ufl
import numpy as np

#%% Setup
# Basic setup, where there may be multiple function spaces
mesh = dfx.mesh.create_unit_square(MPI.COMM_WORLD, 8, 8)
V = dfx.fem.FunctionSpace(mesh, ('CG', 1))
Vc = dfx.fem.FunctionSpace(mesh, ('DG', 0))

dim_V = V.dofmap.index_map.size_global * V.dofmap.index_map_bs
dim_Vc = Vc.dofmap.index_map.size_global * Vc.dofmap.index_map_bs

# create some functions for my formula
u = dfx.fem.Function(V)
v = dfx.fem.Function(V)
k = dfx.fem.Function(Vc)  # <-- (potentially) different function space

# initialize some random values
v.x.array[:] = np.random.rand(dim_V)
k.x.array[:] = np.random.rand(dim_Vc)

# some arbitrary load
x = ufl.SpatialCoordinate(mesh)
f_load = 1.0 * x[0] + 2.0 * x[1]

#%% Part I have to run multiple times
some_large_number = 50  # <-- This is usually about 10^5

# evaluate my formula res(u, v, k), which returns a scalar (!) residual
# >>> first call: 3e-4 s, successive calls: 2e-4 s
res_ufl = k * ufl.inner(ufl.grad(u), ufl.grad(v)) * ufl.dx - f_load * v * ufl.dx
res_form = dfx.fem.form(res_ufl)

# derive form
dres__du_form = dfx.fem.form(ufl.derivative(res_ufl, u))

for i in range(some_large_number):
    # calculate new values for my functions (here: just random)
    # case a) some functions remain constant (e.g. v, k), and some change (e.g. u), as shown in this minimal example
    # case b) all functions (u, v, k) change
    u.x.array[:] = np.random.rand(dim_V)

    # >>> first call: 5e-5 s, successive calls: 3e-5 s
    res = dfx.fem.assemble_scalar(res_form)  #, op=MPI.SUM

    # calculate the respective derivatives (here: just u for clarity)
    # >>> first call: 2e-4 s, successive calls: 6e-5 s
    dres__du = dfx.fem.assemble_vector(dres__du_form).array

Topic		Replies	Views
Computational Time of Integrals	3	1037	September 4, 2019
How to improve performance on assemble problem	2	399	April 20, 2021
Assemble is very slow in FEniCS legacy and dolfin legacy variational formulation	1	56	November 16, 2024
Compute derivatives in FEniCSX	1	1080	June 1, 2021
Efficient way to sum up fem.function's General	5	158	April 22, 2024

Speed up repeated evaluation, derivative (and assembly) of form

Related topics