Hey everyone,
I have a case where I want to solve a PDE for many (hundreds or thousands of) parameter variations. In my case I can't use MPI to parallelise the evaluation (a longer story), so I thought of using Python's multiprocessing functionality. The idea is to use a pool of processes where each process creates its own copy of the mesh, applies the parameters, defines the variational problem and solves it individually. In practice, though, this only works with some solvers, and I can't explain why.
To give an example, I wrote the short script below, which reproduces the behaviour. It solves a linear elasticity problem in which the modulus of elasticity is varied. The exact problem being solved is irrelevant here (the example actually evaluates a pointless quantity).
from dolfin import *
import multiprocessing
import numpy as np
# bottom boundary for Dirichlet BC
def bottom(x, on_boundary):
    return near(x[2], 0) and on_boundary
# top boundary for Dirichlet BC
def top(x, on_boundary):
    return near(x[2], 1) and on_boundary
# definitions of stress and strain
def eps(v):
    return sym(grad(v))

def sigma(v, mu, lmbda):
    return lmbda*tr(eps(v))*Identity(3) + 2*mu*eps(v)
def calc_disp(modulus):
    # create mesh of the unit cube
    mesh = UnitCubeMesh(15, 15, 15)
    # define function spaces
    V = VectorFunctionSpace(mesh, "CG", 1)
    v = TestFunction(V)
    du = TrialFunction(V)
    u = Function(V)
    # Lamé constants
    nu = 0.33
    mu = modulus/(2*(1 + nu))
    lmbda = modulus*nu/((1 + nu)*(1 - 2*nu))
    # variational problem
    psi = inner(sigma(du, mu, lmbda), eps(v)) * dx
    aM = lhs(psi)
    LM = rhs(psi)
    # BCs: clamp the bottom, prescribe a vertical displacement on top
    delta = 1e-1
    bcs = [DirichletBC(V, Constant((0., 0., 0.)), bottom),
           DirichletBC(V.sub(2), Constant(delta), top)]
    # solve the variational problem
    solve(aM == LM, u, bcs)  # hangs
    # solve(aM == LM, u, bcs, solver_parameters={"linear_solver": "bicgstab"})  # doesn't hang
    # return the x-displacement at a top corner
    return u([1., 1., 1.])[0]
# number of workers and moduli to loop over
nproc = 8
moduli = np.linspace(1, 10, nproc*20)

# start a pool with nproc workers and loop over all entries
if __name__ == "__main__":
    with multiprocessing.Pool(nproc) as pool:
        results = pool.map(calc_disp, moduli)
When I run this script, it starts off fine and solves a number of problems effortlessly. But then it suddenly hangs. I can see with the 'top' command that the worker processes no longer use any CPU.
Interestingly, switching from the default solver to bicgstab (see the commented-out line in the script) avoids the issue and the script runs through fine. I was also able to verify that when the processes hang, they indeed hang in the "solve" line.
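For reference, this is the pattern I use to switch solvers. The gmres/ilu combination below is only meant to illustrate the syntax (bicgstab is the only alternative I have actually verified against the hang); the two list_* calls print whichever methods the local DOLFIN build offers:

from dolfin import list_linear_solver_methods, list_krylov_solver_preconditioners

# print the solver and preconditioner names available in this DOLFIN build
list_linear_solver_methods()
list_krylov_solver_preconditioners()

# explicitly pick a Krylov solver and preconditioner (gmres/ilu is just
# an example pair, not one I have verified avoids the hang)
solve(aM == LM, u, bcs,
      solver_parameters={"linear_solver": "gmres",
                         "preconditioner": "ilu"})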
Also, the problem only occurs if the mesh is sufficiently large: with a 5x5x5 mesh instead of 15x15x15, nothing hangs.
For now, I can work with bicgstab. However, I'd be glad if anyone has an idea how to avoid this behaviour with other solvers. I'm aware that this is not quite the intended way to parallelise FEniCS, but it would be great if something like this worked, especially when there are many parameter variations to go through.
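One thing I haven't tried yet, in case it gives anyone an idea: multiprocessing forks the parent process on Linux by default, so the workers inherit whatever state the solver backend set up at import time. Forcing the "spawn" start method, so that each worker starts a fresh interpreter and re-imports the module, would look like the sketch below (I haven't verified that this avoids the hang):

import multiprocessing

if __name__ == "__main__":
    # "spawn" starts fresh interpreter processes instead of forking,
    # so the workers do not inherit solver state from the parent
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(nproc) as pool:
        results = pool.map(calc_disp, moduli)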
Thanks for your help in advance,
Andreas