Parallelizing Legacy Fenics Code

Here I’m using FEniCS 2019.1.0. For the MWE pasted below, executing the program with mpirun -np 4 python3 program.py is actually slower than running the program serially.

What do I have to change to see a meaningful speedup?

from __future__ import print_function
import fenics as fe
comm = fe.MPI.comm_world
rank = fe.MPI.rank(comm)

T = 2.0            # final time
num_steps = 1000     # number of time steps
dt = T / num_steps # time step size

# Create mesh and define function space
nx = ny = 30
mesh = fe.RectangleMesh(comm, fe.Point(-2, -2), fe.Point(2, 2), nx, ny)
V = fe.FunctionSpace(mesh, 'P', 1)

# Define boundary condition
def boundary(x, on_boundary):
    return on_boundary

bc = fe.DirichletBC(V, fe.Constant(0), boundary)

# Define initial value
u_0 = fe.Expression('exp(-a*pow(x[0], 2) - a*pow(x[1], 2))',
                    degree=2, a=5)
u_n = fe.interpolate(u_0, V)

# Define variational problem
u = fe.TrialFunction(V)
v = fe.TestFunction(V)
f = fe.Constant(0)

bilin = u*v*fe.dx + dt*fe.dot(fe.grad(u), fe.grad(v))*fe.dx
lin = (u_n + dt*f)*v*fe.dx

A = fe.assemble(bilin)
# (b is assembled inside the time loop, since the right-hand side
# depends on u_n and changes every step)

solver = fe.KrylovSolver("cg", "hypre_amg")
solver.set_operator(A)

outfile = fe.XDMFFile(comm, "solution.xdmf")
outfile.parameters["flush_output"] = True
outfile.parameters["functions_share_mesh"] = True
outfile.parameters["rewrite_function_mesh"] = False

# Time-stepping
u = fe.Function(V)
t = 0
for n in range(num_steps):

    # Update current time
    t += dt

    # Compute solution
    b = fe.assemble(lin)
    bc.apply(A, b)
    solver.solve(u.vector(), b)

    # Update previous solution
    u_n.assign(u)

    # Save to file
    outfile.write(u_n, t)

Your problem is so small that running in parallel with a partitioned mesh doesn’t make sense.

This has also been discussed at, for instance: Parallel slower runtime - #2 by dokken
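To get a feel for why such a small problem cannot scale, here is a rough strong-scaling model with hypothetical (not measured) timings: the per-rank compute shrinks with the number of processes, but a fixed MPI communication/latency overhead per step does not. Your 30×30 P1 mesh has only 961 degrees of freedom, so the overhead dominates:

```python
def step_time(n_dofs, n_procs, flop_time=1e-8, overhead=5e-4):
    """Model the time for one solver step (hypothetical constants):
    compute is divided among ranks, MPI overhead is fixed per step."""
    compute = n_dofs * flop_time / n_procs
    comm = overhead if n_procs > 1 else 0.0
    return compute + comm

# 30x30 P1 mesh: (30+1)*(30+1) = 961 dofs -- overhead swamps compute,
# so 4 ranks are predicted to be *slower* than 1.
small_speedup = step_time(961, 1) / step_time(961, 4)

# With ~1e6 dofs the compute term dominates and parallelism pays off.
large_speedup = step_time(1_000_000, 1) / step_time(1_000_000, 4)
```

With these (made-up) constants, small_speedup comes out well below 1 while large_speedup is above 1 — which is exactly the behaviour you observed. The crossover point depends on your machine, but the qualitative picture holds: you need on the order of tens of thousands of dofs per rank before mesh partitioning pays off.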

In particular, since your problem is this small, I would use a direct solver rather than “cg” with “hypre_amg”.
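A minimal sketch of that substitution, using the legacy DOLFIN API from your MWE (note: whether “mumps” is available depends on how your PETSc was built; “default” falls back to the built-in LU factorization):

```python
import fenics as fe

# Replace the Krylov solver with a direct (LU) solver. For ~1000 dofs
# a single factorization plus cheap back-substitutions per time step
# is far faster than iterating CG with an AMG preconditioner.
solver = fe.LUSolver("mumps")   # or "default" if MUMPS is not built in
solver.set_operator(A)

# The time loop stays exactly as in your MWE:
#     b = fe.assemble(lin)
#     bc.apply(A, b)
#     solver.solve(u.vector(), b)
```

Since A is constant in time, the factorization is reused across all 1000 steps, which is where the direct solver wins on problems of this size.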