PETSc NonlinearProblem running slow

Hi, I have adapted the code from demo_elastodynamics.py in order to make a similar solver in dolfinx. However, it runs much slower than the legacy version.

In the current state of the code (see below), it takes 40 secs to run for a mesh with 500 cells and 50 timesteps. This is worrying because A) it is much slower than the legacy version of effectively the same code and B) eventually I want to multiply both of these parameters by 100.

I am new to FEniCS/dolfinx, but I assume a better problem/solver configuration is the solution? Any help on this would be appreciated. :slightly_smiling_face:

import numpy as np
import matplotlib.pyplot as plt
from mpi4py import MPI
from petsc4py import PETSc
from dolfinx import fem, mesh, default_scalar_type
from dolfinx.io import XDMFFile
from dolfinx import *
import ufl
import dolfinx.fem.petsc, dolfinx.nls.petsc
import time
start_time = time.time()

# Create mesh and function space
L = 1.0  # Length of the rod
nx, ny, nz = 20, 5, 5  # Number of elements in x, y, and z directions
domain = mesh.create_box(MPI.COMM_WORLD, [np.array([0, 0, 0]), np.array([L, 0.1, 0.1])], [nx, ny, nz], cell_type=mesh.CellType.hexahedron)
V = fem.VectorFunctionSpace(domain, ("CG", 1))

fdim = domain.topology.dim - 1

def clamped_boundary(x):
    return np.isclose(x[0], 0) #|np.isclose(x[0], L)

boundary_facets = mesh.locate_entities_boundary(domain, fdim, clamped_boundary)

u_D = np.array([0, 0, 0], dtype=default_scalar_type)
bc = fem.dirichletbc(u_D, fem.locate_dofs_topological(V, fdim, boundary_facets), V)

# Elasticity constants
E = 1.0e6
nu = 0.3
mu = E / (2.0 * (1 + nu))
lmbda = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))

# Time integration parameters
am = 0.2
af = 0.4
g = 0.5 + af - am
alpha_m = fem.Constant(domain, am)
alpha_f = fem.Constant(domain, af)
gamma = fem.Constant(domain, g)
beta = fem.Constant(domain, (g + 0.5)**2 / 4.0)

rho = 1.0
eta_m = 0.0
eta_k = 0.0

T = 1.0
Nsteps = 50
dt = fem.Constant(domain, T / Nsteps)

# Functions and fields
u_ = ufl.TestFunction(V)
u = fem.Function(V, name='Displacement')
u_old = fem.Function(V)
v_old = fem.Function(V)
a_old = fem.Function(V)

# External load
B = fem.Constant(domain, PETSc.ScalarType((0.0, 0.0, -1.0)))  # Applied load in the z-direction

# Measures
dx = ufl.Measure("dx", domain=domain)

# GA method functions
def sigma(r):
    return lmbda*ufl.nabla_div(r)*ufl.Identity(len(r)) + 2*mu*ufl.sym(ufl.grad(r))
    
def m(u,u_):
    return rho*ufl.inner(u,u_)*dx

def k(u,u_):
    return ufl.inner(sigma(u),ufl.grad(u_))*dx

def c(u,u_):
    return eta_m*m(u,u_) + eta_k*k(u,u_)

#update variables
def update_a(u, u_old, v_old, a_old, ufl_=True):
    if ufl_:
        dt_ = dt
        beta_ = beta
    else:
        dt_ = dt
        beta_ = beta.value
    return (u-u_old-dt_*v_old)/beta_/dt_**2 - (1-2*beta_)/2/beta_*a_old


def update_v(a,u_old,v_old,a_old,ufl_=True):
    if ufl_:
        dt_ = dt
        gamma_ = gamma
    else:
        dt_ = dt
        gamma_ = gamma.value
    return v_old + dt_*((1-gamma_)*a_old + gamma_*a)

def update_fields(u,u_old,v_old,a_old):
    u_vec,u0_vec = u.x.array[:],u_old.x.array[:]
    v0_vec,a0_vec = v_old.x.array[:],a_old.x.array[:]
    
    a_vec = update_a(u_vec,u0_vec,v0_vec,a0_vec,ufl_=False)
    v_vec = update_v(a_vec,u0_vec,v0_vec,a0_vec,ufl_=False)
    v_old.x.array[:] = v_vec
    a_old.x.array[:] = a_vec
    u_old.x.array[:] = u_vec
    
def avg(x_old,x_new,alpha):
    return alpha*x_old + (1-alpha)*x_new

a_new = update_a(u,u_old,v_old,a_old,ufl_=True)
v_new = update_v(a_new,u_old,v_old,a_old,ufl_=True)

# Problem setup
res =  m(avg(a_old,a_new,alpha_m),u_) + c(avg(v_old,v_new,alpha_f),u_) + k(avg(u_old,u,alpha_f),u_) - rho*ufl.inner(B,u_)*dx  #residual to solve
problem = fem.petsc.NonlinearProblem(res,u,[bc])

# Solver setup
solver = nls.petsc.NewtonSolver(MPI.COMM_WORLD, problem)
solver.atol = 1e-8
solver.rtol = 1e-8
solver.convergence_criterion = "incremental"

# Quantities to track
times = np.linspace(0, T, Nsteps + 1)

# Create file
with XDMFFile(domain.comm, "rod_dbc0.xdmf", "w") as xdmf_file:
    xdmf_file.write_mesh(domain)
    xdmf_file.write_function(u, 0)

# Write function at each time step
for i in range(Nsteps):
    t = times[i + 1]
    print("Time = %s" % t )
    num_its, converged = solver.solve(u)
    assert converged
    u.x.scatter_forward()

    # Update old fields with new quantities
    update_fields(u, u_old, v_old, a_old)

    # Write the function to the XDMF file
    xdmf_file.write_function(u, t)

print("--- %s seconds runtime ---" % (time.time() - start_time))

You should also add the code of the legacy implementation you adapted from the old demo.

I am assuming you are not running the file you linked, because

  • in demo_elastodynamics:
E  = 1000.0
nu = 0.3
  • in your snippet:
E = 1.0e6
nu = 0.3

Comparing the runtimes with material properties so different seems unfair to me.

The stress tensor (sigma) is linear in E, so I don’t think the magnitude of this constant should account for the large runtime difference I am seeing.

You have also not specified any solver options, and the defaults might differ between DOLFIN (legacy) and DOLFINx.
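
For reference, in DOLFINx the Krylov solver sitting inside the NewtonSolver can be tuned through PETSc options. A minimal sketch of the mechanism (the cg/gamg choices below are only an illustration, not a recommendation for this problem):

ksp = solver.krylov_solver
opts = PETSc.Options()
prefix = ksp.getOptionsPrefix()
opts[f"{prefix}ksp_type"] = "cg"    # example Krylov method
opts[f"{prefix}pc_type"] = "gamg"   # example preconditioner
ksp.setFromOptions()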

Note that almost all of the time is spent inside your function update_fields.
This is because of how you create a_vec and v_vec, which are not sensible objects to assign to a function.
Instead of working on the array data, you are assigning UFL objects to a numpy array: dt is still a fem.Constant in the ufl_=False branch, so the elementwise arithmetic builds UFL expressions instead of floats.
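
A rough illustration of the difference, reusing dt, u and v_old from your snippet (just a diagnostic, not something to add to the code):

bad = u.x.array - dt * v_old.x.array            # dt is a fem.Constant -> dtype=object, every entry a UFL expression
good = u.x.array - float(dt) * v_old.x.array    # plain float64 numpy arithmetic
print(bad.dtype, good.dtype)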

There are various things that you need to address carefully here (specifically, casting all of the constants to floats).
The following code takes 2.4 seconds to run:

import numpy as np
import matplotlib.pyplot as plt
from mpi4py import MPI
from petsc4py import PETSc
from dolfinx import fem, mesh, default_scalar_type
from dolfinx.io import XDMFFile
from dolfinx import *
import ufl
import dolfinx.fem.petsc
import dolfinx.nls.petsc
import time
start_time = time.time()

# Create mesh and function space
L = 1.0  # Length of the rod
nx, ny, nz = 20, 5, 5  # Number of elements in x, y, and z directions
domain = mesh.create_box(MPI.COMM_WORLD, [np.array([0, 0, 0]), np.array(
    [L, 0.1, 0.1])], [nx, ny, nz], cell_type=mesh.CellType.hexahedron)
V = fem.VectorFunctionSpace(domain, ("CG", 1))

fdim = domain.topology.dim - 1


def clamped_boundary(x):
    return np.isclose(x[0], 0)  # |np.isclose(x[0], L)


boundary_facets = mesh.locate_entities_boundary(domain, fdim, clamped_boundary)

u_D = np.array([0, 0, 0], dtype=default_scalar_type)
bc = fem.dirichletbc(u_D, fem.locate_dofs_topological(
    V, fdim, boundary_facets), V)

# Elasticity constants
E = 1.0e6
nu = 0.3
mu = E / (2.0 * (1 + nu))
lmbda = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))

# Time integration parameters
am = 0.2
af = 0.4
g = 0.5 + af - am
alpha_m = fem.Constant(domain, am)
alpha_f = fem.Constant(domain, af)
gamma = fem.Constant(domain, g)
beta = fem.Constant(domain, (g + 0.5)**2 / 4.0)

rho = 1.0
eta_m = 0.0
eta_k = 0.0

T = 1.0
Nsteps = 50
dt = fem.Constant(domain, T / Nsteps)

# Functions and fields
u_ = ufl.TestFunction(V)
u = fem.Function(V, name='Displacement')
u_old = fem.Function(V)
v_old = fem.Function(V)
a_old = fem.Function(V)

# External load
B = fem.Constant(domain, PETSc.ScalarType((0.0, 0.0, -1.0))
                 )  # Applied load in the z-direction

# Measures
dx = ufl.Measure("dx", domain=domain)

# GA method functions


def sigma(r):
    return lmbda*ufl.nabla_div(r)*ufl.Identity(len(r)) + 2*mu*ufl.sym(ufl.grad(r))


def m(u, u_):
    return rho*ufl.inner(u, u_)*dx


def k(u, u_):
    return ufl.inner(sigma(u), ufl.grad(u_))*dx


def c(u, u_):
    return eta_m*m(u, u_) + eta_k*k(u, u_)

# update variables


def update_a(u, u_old, v_old, a_old, ufl_=True):
    if ufl_:
        dt_ = dt
        beta_ = beta
        return (u-u_old-dt_*v_old)/beta_/dt_**2 - (1-2*beta_)/2/beta_*a_old

    else:
        dt_ = dt
        beta_ = beta.value
        return (u-u_old-float(dt_)*v_old)/float(beta_)/float(dt_)**2 - (1-2*float(beta_))/2/float(beta_)*a_old


def update_v(a, u_old, v_old, a_old, ufl_=True):
    if ufl_:
        return v_old + dt*((1-gamma)*a_old + gamma*a)
    else:
        return v_old + float(dt) * ((1 - float(gamma)) * a_old + float(gamma)*a)


def update_fields(u, u_old, v_old, a_old):
    u_vec, u0_vec = u.x.array[:], u_old.x.array[:]
    v0_vec, a0_vec = v_old.x.array[:], a_old.x.array[:]

    a_vec = update_a(u_vec, u0_vec, v0_vec, a0_vec, ufl_=False)
    v_vec = update_v(a_vec, u0_vec, v0_vec, a0_vec, ufl_=False)

    v_old.x.array[:] = v_vec
    a_old.x.array[:] = a_vec
    u_old.x.array[:] = u_vec


def avg(x_old, x_new, alpha):
    return alpha*x_old + (1-alpha)*x_new


a_new = update_a(u, u_old, v_old, a_old, ufl_=True)
v_new = update_v(a_new, u_old, v_old, a_old, ufl_=True)

# Problem setup
res = m(avg(a_old, a_new, alpha_m), u_) + c(avg(v_old, v_new, alpha_f), u_) + \
    k(avg(u_old, u, alpha_f), u_) - rho * \
    ufl.inner(B, u_)*dx  # residual to solve
problem = fem.petsc.NonlinearProblem(res, u, [bc])

# Solver setup
solver = nls.petsc.NewtonSolver(MPI.COMM_WORLD, problem)
solver.atol = 1e-8
solver.rtol = 1e-8
solver.convergence_criterion = "incremental"

# Quantities to track
times = np.linspace(0, T, Nsteps + 1)

# Create file
with XDMFFile(domain.comm, "rod_dbc0.xdmf", "w") as xdmf_file:
    xdmf_file.write_mesh(domain)
    xdmf_file.write_function(u, 0)

solve_time = 0
# Write function at each time step
for i in range(Nsteps):
    t = times[i + 1]
    print("Time = %s" % t)

    s = time.perf_counter()
    num_its, converged = solver.solve(u)
    e = time.perf_counter()
    solve_time += e-s
    assert converged
    u.x.scatter_forward()

    # Update old fields with new quantities
    update_fields(u, u_old, v_old, a_old)

    # Write the function to the XDMF file
    xdmf_file.write_function(u, t)

print("--- %s seconds runtime ---" % (time.time() - start_time))
print(solve_time)

Also note that the mesh uses hexahedra in your DOLFINx code, while the legacy code uses tetrahedra.


Thanks for this @dokken - this is really helpful. I will look into the solver options.

Just to clarify: with the fem.Constants cast to floats, the update functions now work on plain numpy arrays, which is the more sensible approach?

Also, my equation can be split into a bilinear and a linear form - would petsc.LinearProblem be expected to give better performance?

Yes. That is what you want to do. Doing the arithmetic with ufl objects turns the array into an array of ufl objects.

It all boils down to what options you set for the Krylov solver of the linearized problem, as it is the same problem that is formed inside the Newton solver.
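
If you do go the LinearProblem route, a rough sketch reusing the forms and names from the code above (the du/a_form/L_form names and the direct LU options are only illustrative):

du = ufl.TrialFunction(V)
a_new_lin = update_a(du, u_old, v_old, a_old, ufl_=True)
v_new_lin = update_v(a_new_lin, u_old, v_old, a_old, ufl_=True)
res_lin = m(avg(a_old, a_new_lin, alpha_m), u_) + c(avg(v_old, v_new_lin, alpha_f), u_) \
    + k(avg(u_old, du, alpha_f), u_) - rho*ufl.inner(B, u_)*dx
a_form, L_form = ufl.lhs(res_lin), ufl.rhs(res_lin)   # split into bilinear and linear parts
lin_problem = fem.petsc.LinearProblem(a_form, L_form, bcs=[bc], u=u,
                                      petsc_options={"ksp_type": "preonly", "pc_type": "lu"})
# In the time loop, lin_problem.solve() writes the result into u,
# so the rest of the loop (scatter_forward, update_fields) stays the same.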

Understood. Thanks again for the help!