Parallel for-loop with multiple solve-calls, MPI

I’ve read that post carefully and tried mesh = UnitSquareMesh(MPI.comm_self, ...) as well. I now understand that this is the main point of running my code in parallel, but it didn’t make any difference to the error I was getting, which is why I excluded it from the MWE.

But what I didn’t know is that we can create new dolfin objects from the old ones not directly but through an array, and this way avoid the pickle error.
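
For completeness, here is a minimal sketch of that array round-trip as I understand it (the mesh, the space and the names u_old/u_new are just placeholders): the coefficient array from vector().get_local() is plain numpy data, so only that array needs to be passed around, and a fresh Function is rebuilt from it afterwards:

from dolfin import *

mesh = UnitSquareMesh(MPI.comm_self, 8, 8)
V = FunctionSpace(mesh, 'CG', 1)

u_old = interpolate(Expression('x[0]*x[1]', degree=2), V)
coeffs = u_old.vector().get_local()       # plain numpy array, safe to copy/communicate

u_new = Function(V)                       # fresh dolfin object on the receiving side
u_new.vector()[:] = coeffs                # filled with the values from the array
print(assemble((u_new - u_old)**2*dx))    # ~0: the round-trip preserves the function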

Thank you very much for your help!

The following code shows how to deal with more parameters than there are available processes. I scatter all parameters across the processes, solve the PDEs in a for-loop on each process, and gather the solutions at the end:

from dolfin import *
import numpy as np

comm = MPI.comm_world
size = comm.Get_size() 
rank = comm.Get_rank()

def poisson(alpha):
    mesh = UnitSquareMesh(MPI.comm_self, 256, 256)
    V = FunctionSpace(mesh, 'CG', 1)
    u, v = TrialFunction(V), TestFunction(V)
    bc = DirichletBC(V, Constant(0), 'on_boundary')
    a = inner(Constant(alpha)*grad(u), grad(v))*dx
    L = inner(Constant(1), v)*dx
    uh = Function(V)
    solve(a == L, uh, bc)
    return uh 

PDEsnumb = 100
numDataPerRank = int(PDEsnumb/size)

alpha = None

if rank == 0:
    # here should be an np.array of parameters;
    # an array of ones is used to show that assembling the solutions works
    alpha = np.ones(PDEsnumb)

my_alpha = np.empty(numDataPerRank, dtype='d')
comm.Scatter(alpha, my_alpha, root=0)

u_sol_single = poisson(my_alpha[0])
u_sol_function_space = u_sol_single.function_space()
u_sol_array =  u_sol_single.vector().get_local()
u_sol_length = len(u_sol_array) 

u_sol = u_sol_array

for i in range(1, numDataPerRank):
    u_sol = np.concatenate((u_sol, poisson(my_alpha[i]).vector().get_local()))
    
u_sol_gath = None

if rank == 0:
    u_sol_gath = np.empty(PDEsnumb*u_sol_length, dtype='d') 

comm.Gather(u_sol, u_sol_gath, root=0)

if rank == 0:
    u_output = []

    for i in range(PDEsnumb):
        u_output.append(Function(u_sol_function_space)) 
        u_output[i].vector()[:] = u_sol_gath[i*u_sol_length : i*u_sol_length + u_sol_length]
    
    k = u_output[0]

    for i in u_output:
        # this shows that the solutions for identical parameters are equal
        print(assemble(((k-i)**2)*dx))
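
For reference, the script is meant to be launched with a process count that divides PDEsnumb evenly, otherwise the Scatter/Gather buffer sizes don’t match, e.g. (the script name is just a placeholder):

mpirun -np 4 python3 scatter_poisson.py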

Unfortunately, this gathering procedure wasn’t actually necessary for the optimization, because in time-distributed control we have a while loop for the time-stepping, which must be computed in serial, and inside this loop we have the for-loop that runs in parallel. I.e. it would have to look something like this:

if rank == 0:
    # some code
    while t <= T:
        # the code that must run on all processes,
        # i.e. it can't be inside the 'if rank == 0' statement
        t += dt

So it seems impossible with MPI.

Another idea is to compute the forward model completely in parallel and optimize in serial: I scatter the PDE parameters and broadcast the controls, compute the forward model on each process, and then gather the objective function values on a single process and run the optimization there.
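
To make the question concrete, here is a minimal sketch of the communication pattern I mean, not the real time-dependent forward model: local_objective is a hypothetical placeholder that reuses the poisson() helper from the script above, and the parameter count is assumed to be divisible by the number of processes.

from dolfin import *
import numpy as np

comm = MPI.comm_world
size = comm.Get_size()
rank = comm.Get_rank()

def local_objective(alpha, control):
    # hypothetical placeholder for the real forward model + objective evaluation
    uh = poisson(alpha)                              # poisson() as defined above
    return assemble((uh - Constant(control))**2*dx)

n_params = 8                                         # assumed divisible by size
params = np.ones(n_params) if rank == 0 else None    # stands in for the real parameters

my_params = np.empty(n_params // size, dtype='d')
comm.Scatter(params, my_params, root=0)              # distribute the PDE parameters

control = comm.bcast(1.0 if rank == 0 else None, root=0)  # broadcast the current control

# each rank sums the objective contributions of its own PDEs ...
local_J = sum(local_objective(a, control) for a in my_params)

# ... and rank 0 receives the total; this is where the serial optimizer would run
J = comm.reduce(local_J, root=0)
if rank == 0:
    print('objective value:', J)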

Do you think this can work out? Can dolfin-adjoint record such a forward model and use it further?

I have already spent a lot of time trying to implement this, and there are still a lot of open questions, so I’m no longer sure it’s worth spending more time on MPI without knowing that the idea is right.