Disable internal parallelization

Hi everybody,

For a performance test I need to disable the automatic parallelization that occurs while assembling and solving the large algebraic system in this MWE:

import fenics as fn
from dolfin import MPI

# Generate mesh
mesh = fn.UnitSquareMesh(MPI.comm_self, 800, 800)

# Set up and solve variational problem
V = fn.FunctionSpace(mesh, 'CG', 1)

u_D = fn.Expression("0", degree=0)
bc = fn.DirichletBC(V, u_D, 'on_boundary')

a = fn.Constant(1.0)
f = fn.Constant(1.0)

u = fn.TrialFunction(V)
v = fn.TestFunction(V)

lhs = fn.dot(a*fn.grad(u), fn.grad(v))*fn.dx
rhs = f*v*fn.dx

sol = fn.Function(V)
fn.solve(lhs==rhs, sol, bc)

# Compute H1 norm
H1_norm = fn.norm(sol, 'H1')

print(H1_norm)

I ran it with mpirun -np 1 python3 script.py to try to restrict it to one processor, but it didn’t work.

I came across this post from 2012 regarding the same issue, but have not found an answer yet.

I think I need to change my fn.solve parameters so that the solve runs serially instead of in parallel.
Can you help me do this?

Thank you very much.

Best regards,
Cedric

What do you mean by “it didn’t work”? If you see more than one process in use when calling mpirun -np 1, then there’s something very wrong with your MPI build.

Perhaps you’re running an archaic version of dolfin which had rudimentary support for concurrency? You can turn this off by setting the environment variable OMP_NUM_THREADS=1.
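
For example, something like this (just a rough sketch, assuming you’d rather set it from inside the script than export it in the shell before calling mpirun):

import os

# Must be set before dolfin (and the OpenMP/BLAS libraries it links against)
# is imported, otherwise the variable may be read too late.
os.environ["OMP_NUM_THREADS"] = "1"

import fenics as fn
from dolfin import MPI

# Quick sanity check: only one MPI rank should be in use
# (2019.1 exposes mpi4py communicators, as in your MWE).
print("MPI ranks:", MPI.comm_world.size)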

You are right, I only see one process when running the terminal command ‘top’.

By ‘didn’t work’ I mean that the %CPU shown by ‘top’ still spikes up to ~740%, which tells me that the computation is being distributed over all 8 cores of my laptop.

I am using dolfin 2019.1.

Unfortunately, setting the thread count to 1 with OMP_NUM_THREADS=1 did not change this behavior; the computation was still distributed over (all) cores.

Evidently there’s some eccentricity offered by your laptop / MPI build. This is not expected, nor can I reproduce it on any of my machines.

Thank you very much for your reply and for testing it on several machines.

Is this MPI build coupled to my dolfin installation? Could moving over from legacy fenics to fenicsx fix this issue?

“Could moving over from legacy fenics to fenicsx fix this issue?”

Very doubtful. I’d be inclined to write a basic test that just calls mpi4py and prints something based on COMM_SELF, and see if you observe the same issue. If that works as you expect, you could then try petsc4py.PETSc.Sys.Print using COMM_SELF and again see if anything untoward is going on.
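
Something along these lines should do as a first check (an untested sketch; run it the same way as your MWE, i.e. with mpirun -np 1):

from mpi4py import MPI

# COMM_SELF should always report exactly one rank per process.
comm = MPI.COMM_SELF
print("mpi4py COMM_SELF: rank", comm.Get_rank(), "of", comm.Get_size())

# If that behaves as expected, repeat the check through PETSc.
from petsc4py import PETSc

PETSc.Sys.Print("petsc4py COMM_SELF size:", PETSc.COMM_SELF.getSize(),
                comm=PETSc.COMM_SELF)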