How to solve in parallel for parametric study

One option is to use the MPI_COMM_SELF communicator instead of the default MPI_COMM_WORLD one (the latter is what gives you e.g. parallel assembly/solve when your program is run with mpirun -np X). The following illustrates the idea; which parameter each process uses is determined by its rank in the global communicator.

# search.py
from dolfin import *

def poisson(alpha, comm=mpi_comm_world()):
    # Solve -div(alpha*grad(u)) = 1 on the unit square with homogeneous
    # Dirichlet bcs; comm determines which processes share the solve
    mesh = UnitSquareMesh(comm, 256, 256)
    
    V = FunctionSpace(mesh, 'CG', 1)
    u, v = TrialFunction(V), TestFunction(V)
    bc = DirichletBC(V, Constant(0), 'on_boundary')

    a = inner(Constant(alpha)*grad(u), grad(v))*dx
    L = inner(Constant(1), v)*dx

    uh = Function(V)
    solve(a == L, uh, bc)
    info('alpha %g -> |uh|=%g' % (alpha, uh.vector().norm('l2')))
    File(comm, 'foo%g.pvd' % alpha) << uh
    
alphas = [3, 8, 9, 10]

assert len(alphas) >= MPI.size(mpi_comm_world()), 'Need one parameter per process'
# Get alpha based on global rank
my_alpha = alphas[MPI.rank(mpi_comm_world())]
poisson(my_alpha, mpi_comm_self())  # Do everything on THIS process
# run as `mpirun -np 4 python search.py`
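If you have more parameters than processes, the assert can be dropped in favour of a round-robin assignment, so each rank loops over its own subset of alphas. A minimal pure-Python sketch of the idea (the helper name my_params is made up for illustration):

```python
# Hypothetical helper: distribute parameters round-robin over MPI ranks,
# so len(alphas) need not match the number of processes.
def my_params(params, rank, size):
    """Return the subset of params that a given rank should process."""
    return params[rank::size]

alphas = [3, 8, 9, 10, 11, 12]
# With 4 ranks, rank 0 gets [3, 11], rank 1 gets [8, 12], etc.;
# each rank would then call poisson() for every alpha in its subset.
for rank in range(4):
    print(rank, my_params(alphas, rank, 4))
```

In the script above you would call my_params(alphas, MPI.rank(mpi_comm_world()), MPI.size(mpi_comm_world())) and loop over the result.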

An alternative is to use Python's multiprocessing module; in the script above, replace everything after the assert statement with

import multiprocessing

pool = multiprocessing.Pool(processes=4)
pool.map(poisson, alphas)
pool.close()
# Run as python search.py
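A nice side effect of pool.map is that it also collects return values in order, so if poisson were changed to return e.g. the solution norm, the whole sweep comes back as a list. A standalone sketch with a stand-in function (norm_for_alpha is hypothetical, replacing the PDE solve):

```python
import multiprocessing

def norm_for_alpha(alpha):
    # Stand-in for the solver; imagine it returned uh.vector().norm('l2')
    return alpha ** 2

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        # map preserves the order of the input parameters
        norms = pool.map(norm_for_alpha, [3, 8, 9, 10])
    print(norms)  # -> [9, 64, 81, 100]
```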

Timings on my machine (FEniCS 2017.2.0) are

  • 1.4s for time mpirun -np 1 python search.py (one alpha)
  • 1.9s for time mpirun -np 4 python search.py (all alphas, 4 CPUs)
  • 1.7s for time python search.py (all alphas, pool of 4 workers)