How to use mpirun to run code that is already developed in multiprocessing style

Hello,
I am developing a decoupled method for a multi-physics problem. In this algorithm, two problems on different domains (meshes) have to be solved in parallel within each time step. My approach is therefore to use the multiprocessing library in Python, building two child processes that solve the two problems at the same time. This seems to work successfully.
Now that a 3D problem with a complex geometry and a fine mesh has to be solved, I want to apply a domain decomposition method inside each subproblem (subprocess) to accelerate it. I know that one can call mpirun -np n python codename.py to run a DDM in parallel. Nevertheless, it does not work for my code: one process never receives the message from the other process and therefore gets stuck.
I really hope to find out how to fix this. Your guidance and assistance would be greatly appreciated!

Attached is the pseudocode.

def A_Solver(x, pipe):

    conn1, conn2 = pipe
    """
    1. load meshA.xdmf
    2. boundary mark
    3. define different kinds of function spaces
    4. define variational form
    5. define boundary condition & initial condition
    """
    # time iteration
    for n in range(step):
        t += kdt

        # Send solutions for present time step to Process B 
        send_data = u_fem.vector().get_local()
        conn1.send(send_data)

        # Receive solution for previous step from Process B
        recv_datas = conn1.recv()
        B_old.vector()[:] = recv_datas
        B_old.set_allow_extrapolation(True)

        # Solve problem A
        solve(a_f == L_f, A_present, bc_f, solver_parameters=parameters2)

        # update function
        A_old.assign(A_present)

The function B_Solver is defined in a similar way and is omitted here. The main process is attached below:

from fenics import *
import numpy as np
import multiprocessing as mp

if __name__ == '__main__':

    conn1, conn2 = mp.Pipe()

    p_A = mp.Process(target=A_Solver, args=(10, (conn1, conn2)))
    p_B = mp.Process(target=B_Solver, args=(10, (conn1, conn2)))
    
    p_A.start()
    print("A Process Start!")

    p_B.start()
    print("B Process Start!")

    conn1.close()
    conn2.close()

    p_A.join()
    p_B.join()  

There is nothing we can do unless you post the real code. Please simplify it as much as you can.

Thanks for your reply!
Here is a toy model for a 2D coupled Stokes-Darcy problem with two domains, a Stokes domain and a Darcy domain. By decoupling the problem, one can split it into a Stokes problem and a Darcy problem, using the interface values at time step n-1 to compute the values at time step n in parallel.

First, we define two solvers, Stokes_Solver and Darcy_Solver. We use a Pipe from multiprocessing, Python's process-based parallelism library, so that the two processes can exchange data. Each solver becomes a complete solver once it has received the interface data from the other side.
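
To make the per-time-step communication pattern explicit before showing the full solvers, here is a stripped-down sketch of the handshake (plain Python, no FEniCS; in this sketch each process keeps only the pipe end it actually uses, and stokes_side/darcy_side are just placeholders for the real solvers):

import multiprocessing as mp

def stokes_side(conn):
    u = [0.0]                       # placeholder for the interface data
    for n in range(3):
        conn.send(u)                # send current interface values first ...
        phi = conn.recv()           # ... then wait for the other side
        u = [v + 1.0 for v in phi]  # "solve" and update

def darcy_side(conn):
    phi = [0.0]
    for n in range(3):
        u = conn.recv()             # receive first ...
        conn.send(phi)              # ... then send, so the two sides interlock
        phi = [v + 1.0 for v in u]

if __name__ == "__main__":
    conn1, conn2 = mp.Pipe()
    p_a = mp.Process(target=stokes_side, args=(conn1,))
    p_b = mp.Process(target=darcy_side, args=(conn2,))
    p_a.start(); p_b.start()
    p_a.join(); p_b.join()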

import mshr
import multiprocessing as mp
from fenics import *

def Stokes_Solver(x, pipe):
    
    conn1, conn2 = pipe
    
    # Stokes and Darcy mesh
    domain_f = mshr.Rectangle(Point(0.0, 0.0), Point(6.0, 1.0))
    domain_d = mshr.Rectangle(Point(1.0, 0.0), Point(3.0, 0.5))
    mesh_f = mshr.generate_mesh(domain_f - domain_d, 200)
    mesh_d = mshr.generate_mesh(domain_d, 100)
    
    x_f = SpatialCoordinate(mesh_f)
    n_f = FacetNormal(mesh_f)
    tor_f = as_vector((-n_f[1], n_f[0]))
        
    ########################### Boundary Mark ###########################
    # Mark Rule: Bottom: 1 || Right: 2 || Top:3 || Left: 4 || Interface: 5 #

    ############## Stokes Domain ##############
    class Bottom_F(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[1], 0.0)

    class Right_F(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[0], 6.0)
        
    class Left_F(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[0], 0.0)
        
    class Top_F(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[1], 1.0)
        
    class Interface_F(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and (near(x[0], 1.0) or near(x[0], 3.0) or near(x[1], 0.5))

    bottom_f    = Bottom_F()
    right_f     = Right_F()
    left_f      = Left_F()
    top_f       = Top_F()
    interface_f = Interface_F()
    
    # Mark Boundary
    bmf_f = MeshFunction("size_t", mesh_f, mesh_f.topology().dim()-1, 0)
    interface_f.mark(bmf_f, 5)
    bottom_f.mark(bmf_f, 1)
    right_f.mark(bmf_f, 2)
    top_f.mark(bmf_f, 3)
    left_f.mark(bmf_f, 4)
    # Boundary Integration Operator
    ds_f = Measure("ds", domain=mesh_f, subdomain_data=bmf_f)
    
    ######################################################################
    ########################### Function Space ###########################
    
    ################## Stokes Function Space ##################
    u_elem = VectorElement("CG", mesh_f.ufl_cell(), 2)
    p_elem = FiniteElement("CG", mesh_f.ufl_cell(), 1)
    elem_f = MixedElement([u_elem, p_elem])
    V_f    = FunctionSpace(mesh_f, elem_f)
    
    u, p           = TrialFunctions(V_f)
    v, q           = TestFunctions(V_f)
    Stokes_old     = Function(V_f)
    u_old, p_old   = split(Stokes_old)
    Stokes_present = Function(V_f)
    u_fem, p_fem   = Stokes_present.split(True)
    
    ################## Darcy Function Space ##################
    phi_elem  = FiniteElement("CG", mesh_d.ufl_cell(), 2)
    V_d       = FunctionSpace(mesh_d, phi_elem)
    phi_old   = Function(V_d)
    
    ##################### Boundary Conditions #####################
    # Stokes equation
    u_in_bc     = DirichletBC(V_f.sub(0), Constant((5.0, 0.0)), bmf_f, 4)
    u_bottom_bc = DirichletBC(V_f.sub(0), Constant((0.0, 0.0)), bmf_f, 1)
    u_top_bc    = DirichletBC(V_f.sub(0), Constant((0.0, 0.0)), bmf_f, 3)
    p_right_bc  = DirichletBC(V_f.sub(1), Constant(0.0), bmf_f, 2)
    
    bc_f        = [u_in_bc, u_bottom_bc, u_top_bc, p_right_bc]
    
    def epsilon(eta):
        # Strain Rate Tensor
        return sym(grad(eta))
    
    # Source Term
    f_f = Constant((0.0, 0.0))
    
    dt = Constant(0.01)
    
    ####################### Variational Form ######################
    a_f = dot(u, v)*dx \
        + 2*dt*inner(epsilon(u), epsilon(v))*dx \
        - dt*p*div(v)*dx \
        + dt*q*div(u)*dx \
        + dt*dot(u, tor_f)*dot(v, tor_f)*ds_f(5) \
        + dt*dot(u, n_f)*dot(v, n_f)*ds_f(5)
        
    L_f = dt*dot(f_f, v)*dx \
        + dot(u_old, v)*dx \
        + dt*dot(u_old, n_f)*dot(v, n_f)*ds_f(5) \
        - dt*phi_old*dot(v, n_f)*ds_f(5)
        
    ######################### Time Iteration ########################
    for n in range(10000): 
        
        # Send solutions for present time step to Process Darcy
        send_data = u_fem.vector().get_local()
        conn1.send(send_data)

        # Receive Solution for previous step from Process Darcy
        recv_datas = conn1.recv()
        phi_old.vector()[:] = recv_datas
        phi_old.set_allow_extrapolation(True)
        
        # Solve the Stokes equation
        solve(a_f == L_f, Stokes_present, bc_f)
        u_fem, p_fem = Stokes_present.split(True)
        
        # Update function
        Stokes_old.assign(Stokes_present)

def Darcy_Solver(x, pipe):
    
    conn1, conn2 = pipe
    
    # Stokes and Darcy mesh
    domain_f = mshr.Rectangle(Point(0.0, 0.0), Point(6.0, 1.0))
    domain_d = mshr.Rectangle(Point(1.0, 0.0), Point(3.0, 0.5))
    mesh_f = mshr.generate_mesh(domain_f - domain_d, 200)
    mesh_d = mshr.generate_mesh(domain_d, 100)
    
    x_d = SpatialCoordinate(mesh_d)
    n_d = FacetNormal(mesh_d)
    tor_d = as_vector((-n_d[1], n_d[0]))
    
    ########################### Boundary Mark ###########################
    # Mark Rule: Bottom: 1 || Right: 2 || Top:3 || Left: 4 || Interface: 5 #
    ############## Biot Domain ##############
    class Top_D(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[1], 0.5)
    
    class Right_D(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[0], 3.0)
        
    class Left_D(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[0], 1.0)
        
    class Bottom_D(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and near(x[1], 0.0)
        
    class Interface_D(SubDomain):
        def inside(self, x, on_boundary):
            return on_boundary and (near(x[1], 0.5) or near(x[0], 1.0) or near(x[0], 3.0)) 
    
    top_d       = Top_D()
    right_d     = Right_D()
    left_d      = Left_D()
    bottom_d    = Bottom_D()
    interface_d = Interface_D()
    
    # Mark Boundary
    bmf_d = MeshFunction("size_t", mesh_d, mesh_d.topology().dim()-1, 0)
    interface_d.mark(bmf_d, 5)
    bottom_d.mark(bmf_d, 1)
    # Boundary Integration Operator
    ds_d = Measure("ds", domain=mesh_d, subdomain_data=bmf_d)
    
    ######################################################################
    ########################### Function Space ###########################
    
    ################## Darcy Function Space ##################
    phi_elem = FiniteElement("CG", mesh_d.ufl_cell(), 2)
    V_d      = FunctionSpace(mesh_d, phi_elem)
    
    phi         = TrialFunction(V_d)
    psi         = TestFunction(V_d)
    phi_old     = Function(V_d)
    phi_present = Function(V_d)
    
    ################## Stokes Function Space ##################
    u_elem     = VectorElement("CG", mesh_f.ufl_cell(), 2)
    p_elem     = FiniteElement("CG", mesh_f.ufl_cell(), 1)
    elem_f     = MixedElement([u_elem, p_elem])
    V_f        = FunctionSpace(mesh_f, elem_f)
    Stokes_old = Function(V_f)
    u_old, _   = Stokes_old.split(True)
    
    ##################### Boundary Conditions #####################
    phi_bottom_bc = DirichletBC(V_d, Constant(0.0), bmf_d, 1)
    bc_d = [phi_bottom_bc]
    
    def epsilon(eta):
        # Strain Rate Tensor
        return sym(grad(eta))
    
    # Source Term
    f_d = Constant(0.0)
    
    dt = Constant(0.01)
    
    ####################### Variational Form ######################
    a_d = phi*psi*dx \
        + dt*dot(grad(phi), grad(psi))*dx \
        + dt*phi*psi*ds_d(5)
         
    L_d = dt*f_d*psi*dx \
        + phi_old*psi*dx \
        - dt*dot(u_old, n_d)*psi*ds_d(5) \
        + dt*phi_old*psi*ds_d(5)
         
    ############################ Time Iteration ###########################
    for n in range(10000):
        
        # Receive Solution for previous step from Process Stokes 
        recv_datas = conn2.recv()
        u_old.vector()[:] = recv_datas
        u_old.set_allow_extrapolation(True)

        # Send solutions for present time step to Process Stokes 
        send_datas = phi_present.vector().get_local()
        conn2.send(send_datas)
        
        # Solve the Darcy equation
        solve(a_d == L_d, phi_present, bc_d)
        
        # update function
        phi_old.assign(phi_present)
        
        print(f"------- Solving step {n} -------")

Then we define the main process as below. In it, we create two subprocesses that call the solvers for the Stokes problem and the Darcy problem.

if __name__ == '__main__':
    conn1, conn2 = mp.Pipe()

    p_Stokes = mp.Process(target=Stokes_Solver, args=(10, (conn1, conn2)))
    p_Darcy  = mp.Process(target=Darcy_Solver, args=(10, (conn1, conn2)))
    
    p_Stokes.start()
    print("Stokes Process Start!")

    p_Darcy.start()
    print("Darcy Process Start!")

    conn1.close()
    conn2.close()

    p_Stokes.join()
    p_Darcy.join()  

This does work when we call python demo.py. The issue is that for 3D problems each subproblem is solved slowly, so we would like to apply a domain decomposition method within each domain (process). However, it fails when we call mpirun -np 20 python demo.py.
Is it possible to run a DDM inside each process? Your assistance and guidance would be greatly appreciated!

First of all, using mshr to generate meshes inside a parallel (mpirun) code is not recommended.
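
For instance, the meshes could be generated once in a small serial script and stored as XDMF files (a rough sketch assuming legacy FEniCS/dolfin with mshr installed; the file names mesh_stokes.xdmf and mesh_darcy.xdmf are arbitrary):

# generate_meshes.py -- run once in serial: python generate_meshes.py
import mshr
from dolfin import Point, XDMFFile

# Same geometry as in the toy model above
domain_f = mshr.Rectangle(Point(0.0, 0.0), Point(6.0, 1.0))
domain_d = mshr.Rectangle(Point(1.0, 0.0), Point(3.0, 0.5))
mesh_f = mshr.generate_mesh(domain_f - domain_d, 200)  # Stokes mesh
mesh_d = mshr.generate_mesh(domain_d, 100)             # Darcy mesh

# Store the meshes so that the parallel run only has to read them
xdmf_f = XDMFFile("mesh_stokes.xdmf")
xdmf_f.write(mesh_f)
xdmf_f.close()

xdmf_d = XDMFFile("mesh_darcy.xdmf")
xdmf_d.write(mesh_d)
xdmf_d.close()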

To get anywhere near what you would like to do, I would split the MPI communicator in two (see the sketch at the end of this reply).

But then you should also create your mesh ahead of time and read it in with XDMFFile for efficiency.
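
A minimal sketch of what this could look like, assuming mpi4py and a legacy dolfin version (2018.1 or later) whose Mesh and XDMFFile constructors accept an mpi4py communicator, and using the pre-generated XDMF meshes from the script above; splitting the ranks into a Stokes half and a Darcy half is just one possible choice:

# demo_mpi.py -- launched with, e.g., mpirun -np 20 python demo_mpi.py
from mpi4py import MPI
from dolfin import Mesh, XDMFFile

world = MPI.COMM_WORLD
rank = world.Get_rank()

# Split the world communicator into two groups, one per subproblem.
# Here the first half of the ranks handles Stokes, the second half Darcy.
color = 0 if rank < world.Get_size() // 2 else 1
sub_comm = world.Split(color, rank)

if color == 0:
    # Every rank in this group reads the Stokes mesh on the sub-communicator,
    # so the domain decomposition stays inside the group.
    mesh_f = Mesh(sub_comm)
    XDMFFile(sub_comm, "mesh_stokes.xdmf").read(mesh_f)
    # ... set up and solve the Stokes problem on sub_comm ...
else:
    mesh_d = Mesh(sub_comm)
    XDMFFile(sub_comm, "mesh_darcy.xdmf").read(mesh_d)
    # ... set up and solve the Darcy problem on sub_comm ...

# The interface data would then be exchanged between the two groups with MPI
# (for example between the root ranks of the two groups via `world`), instead
# of the multiprocessing pipes used in the serial version.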