Gathering and Scattering during parallel runs

Hello,

I am trying to run FEniCS (2019.1) in parallel in a workflow that involves gathering the solution from all processes, post-processing the full solution, and scattering it back to each process. I am using demo_poisson as a starting example. I looked into dokken's post but couldn't quite understand it, nor was I able to make it work. Below is the example I am testing on; could anyone share some advice on this? Thank you!

import matplotlib.pyplot as plt
from dolfin import *
import numpy as np

# Mesh and MPI rank (these were missing; without them the script fails)
mesh = UnitSquareMesh(32, 32)
mpi_rank = MPI.comm_world.rank

fig = plt.figure(figsize=(6.4, 4.8), dpi=200)
plot(mesh, linewidth=0.1)
fig.savefig('mesh_mpirank_' + str(mpi_rank) + '.png')
plt.close(fig)

V = FunctionSpace(mesh, "Lagrange", 1)
def boundary(x):
    return x[0] < DOLFIN_EPS or x[0] > 1.0 - DOLFIN_EPS
u0 = Constant(0.0)
bc = DirichletBC(V, u0, boundary)
u = TrialFunction(V)
v = TestFunction(V)
f = Expression("10*exp(-(pow(x[0] - 0.5, 2) + pow(x[1] - 0.5, 2)) / 0.02)", degree=2)
g = Expression("sin(5*x[0])", degree=2)
a = inner(grad(u), grad(v))*dx
L = f*v*dx + g*v*ds
u = Function(V)
solve(a == L, u, bc)

fig = plt.figure(figsize=(6.4,4.8),dpi=200)
plot(u)
fig.savefig('u_mpirank_'+str(mpi_rank)+'.png')
plt.close(fig)



######  need to gather from all u to get u_full


###### perform some operations on u_full, let's say, u_full_new = sin(u_full) 


######  scatter u_full_new back to all u 
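The data flow behind the three commented steps can be sketched serially with NumPy. In an actual MPI run, `local_parts` would be the per-process arrays returned by `u.vector().get_local()`, collected on rank 0 with `comm.gather()`, and the pieces would be sent back with `comm.scatter()` followed by `u.vector().set_local(...)` and `u.vector().apply("insert")`. Here three fake "ranks" stand in so the logic can be shown in one process. Note that the gathered array is in process-local dof order, not global order; that is harmless for a pointwise operation like sin, but anything ordering-sensitive would also need the dofmap.

```python
import numpy as np

# Fake per-rank local dof arrays, standing in for what comm.gather()
# would collect from u.vector().get_local() on each process.
local_parts = [np.array([0.0, 0.5]), np.array([1.0]), np.array([1.5, 2.0])]

# --- "gather": on rank 0, concatenate the pieces into the full array ---
sizes = [len(p) for p in local_parts]
u_full = np.concatenate(local_parts)

# --- operate on the full solution ---
u_full_new = np.sin(u_full)

# --- "scatter": split back into per-rank pieces of the original sizes;
#     comm.scatter() would return each piece to its owning process ---
offsets = np.cumsum(sizes)[:-1]
new_parts = np.split(u_full_new, offsets)

# A pointwise operation commutes with the split, so each rank gets
# exactly sin() of its own local values.
for old, new in zip(local_parts, new_parts):
    assert np.allclose(new, np.sin(old))
```

This is only the array bookkeeping; the MPI calls named in the comments are what the real parallel script would use.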


What is the motivation for doing such an operation on a single process, rather than on the distributed mesh?

For instance, mapping u -> sin(u) should be done via either

  1. a projection, or
  2. an interpolation.

Are there other operations you have in mind that have to be done on a single process?

Hello dokken,

It is admittedly silly to compute sin(u) on the full solution; I came up with it only for this test problem. I need to understand how gathering and scattering work in FEniCS, so that I can nest a parallelized FEniCS script inside a serial optimization code.

Why not split it into multiple scripts:

  1. Run parallel FEniCS and write the solution as a checkpoint.
  2. Run the serial optimization algorithm, reading in the checkpoint. Save the serial output as a checkpoint.
  3. Run the parallel algorithm on the checkpoint from step 2.

These three steps can even be written in a single script, with appropriate MPI barriers, so you don't need to do any special communication.

If you don't want to do this, could you go into more detail about what the serial algorithm should do? Should it use DOLFIN in serial, or does it just work on a global version of the DOLFIN function array (i.e. directly on the dof values)?

The actual FEniCS script that needs to run in parallel involves solving an FE problem and computing the cost function, gradient, and Hessian, analytically and sometimes with sampling. This information is passed to an optimizer, which passes back a new set of candidate parameters, and the loop iterates. The optimizer is from scipy and is independent of FEniCS.

Pyadjoint (dolfin-adjoint) has implemented an interface with scipy that works in parallel. See for instance
http://www.dolfin-adjoint.org/en/latest/documentation/stokes-bc-control/stokes-bc-control.html
which uses the scipy interface of pyadjoint: pyadjoint/optimization.py at master · dolfin-adjoint/pyadjoint · GitHub
If you want to do such calculations by hand, I would suggest looking at how it is done in pyadjoint.