Gathering a function on the root process (MPI): issue with a mixed element

Hi,

I am using FEniCSx (DOLFINx) v0.7.3 to solve a simple non-linear PDE consisting of two distinct fields on the same 1D domain.

In serial (i.e. mpirun -n 1) everything works fine. However, when I try to run in parallel I have trouble gathering all the data on one process in order to write it to a file, etc.

With msh = create_interval(...), V = functionspace(...) and u = Function(V) (see the minimal example below), the issues are:

  1. I know how to get the number of dofs in u using num_dofs_local = V.dofmap.index_map.size_local * V.dofmap.index_map_bs. But when collecting the data I want to know how many points are on the mesh (which should be the same for each field). Currently I am using num_points_local = int(np.ceil(num_dofs_local/2)). Is there a better way?

  2. Gathering the two fields independently using msh.comm.gather(u.sub(0).x.array[:num_points_local]) returns incorrect data - that is, it returns a mix of data points from both u.sub(0) and u.sub(1). This is illustrated in the first set of output below, which should show sequential positive numbers for u.sub(0) and sequential negative numbers for u.sub(1). I know this lacks the collapse() call (see point 3), but it does run under mpirun with n > 1, so I use it as a reference.

  3. If I instead use (what I believe to be the correct form) msh.comm.gather(u.sub(0).collapse().x.array[:num_points_local]), then I get the expected result for mpirun -n 1 only. If I run with more than one process, the code hangs at this line.

What am I doing wrong?

The minimal code to reproduce my issue is below:

from mpi4py import MPI
import dolfinx
from dolfinx.fem import Function, functionspace
from dolfinx.mesh import create_interval
from basix.ufl import (element, mixed_element)
import numpy as np
import sys

comm = MPI.COMM_WORLD

def printf(s):
    print(f"Rank {comm.rank}: {s}")
    sys.stdout.flush()

###### FUNCTION SPACE ######
    
L = 10
mesh_size = 10
x_left = 0.0
x_right = L

msh = create_interval(comm, points=(x_left, x_right), nx=mesh_size, ghost_mode=dolfinx.cpp.mesh.GhostMode.shared_facet)
P1 = element("Lagrange", msh.basix_cell(), 1)
V = functionspace(msh, mixed_element([P1, P1]))

u = Function(V)  # current solution
u_prev = Function(V)  # previous solution (unused in this minimal example)

###### INITIAL CONDITIONS ######
# designed to be increasing positive values for u.sub(0) and decreasing negative values for u.sub(1)

class InitialTestState():
    def __init__(self):
        return
    def f0(self,x):
        values = np.zeros((1, x.shape[1]))
        for i,v in enumerate(x[0]):
            values[0][i] = v
        return values
    def f1(self,x):
        values = np.zeros((1, x.shape[1]))
        for i,v in enumerate(x[0]):
            values[0][i] = -v
        return values

initial_condition = InitialTestState()
u.sub(0).interpolate(initial_condition.f0)
u.sub(1).interpolate(initial_condition.f1)
u.x.scatter_forward()  

###### GATHER DATA ##########

num_dofs_local = V.dofmap.index_map.size_local * V.dofmap.index_map_bs
num_points_local = int(np.ceil(num_dofs_local/2)) ## <----- better way?
global_vals0 = msh.comm.gather(u.sub(0).collapse().x.array[:num_points_local])
global_vals1 = msh.comm.gather(u.sub(1).collapse().x.array[:num_points_local])
global_x = msh.comm.gather(msh.geometry.x[:num_points_local,0])


if comm.rank == 0:

    print("\nUnexpected result, but runs for all n \n\n")

    # gives wrong data: should be 0,1,2,3, and 0,-1,-2,-3,-4 or similar
    print("Local values:")
    printf(u.sub(0).x.array[:num_points_local])
    printf(u.sub(1).x.array[:num_points_local])
    print("Collected values:")
    root_vals0 = np.concatenate(global_vals0)
    printf(root_vals0)
    root_vals1 = np.concatenate(global_vals1)
    printf(root_vals1)

    print("\n\nExpected result: only works for n=1 \n\n")

    # gives correct data but hangs for n > 1 in mpirun -n $n minimal_example.py
    print("Local values:")
    printf(u.sub(0).collapse().x.array[:num_points_local])
    printf(u.sub(1).collapse().x.array[:num_points_local])
    print("Collected values:")
    root_vals0 = np.concatenate(global_vals0)
    printf(root_vals0)
    root_vals1 = np.concatenate(global_vals1)
    printf(root_vals1)

Thanks,

R.

You cannot collapse a distributed function space on rank 0 only; collapse() is a collective operation, so every rank has to call it.

Please move u.sub(0).collapse() outside the if statement.
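
For instance, here is a minimal sketch of the corrected gather pattern, reusing the names (u, msh, comm, printf, np) from your script - the collapse happens on every rank, and only the concatenation is done on rank 0:

# Sketch: collapse on every rank, then gather the owned values on rank 0
u0 = u.sub(0).collapse()   # collective call - every rank must execute it
V0 = u0.function_space
n0_local = V0.dofmap.index_map.size_local * V0.dofmap.index_map_bs

gathered0 = msh.comm.gather(u0.x.array[:n0_local], root=0)

if comm.rank == 0:
    printf(np.concatenate(gathered0))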

What do you mean by “how many points are on the mesh”?
Do you mean how many dofs there are in each sub space?

To do so, collapse each space with V_0, _ = V.sub(0).collapse() and then call V_0.dofmap.index_map.size_local * V_0.dofmap.index_map_bs.
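
In code, a short sketch using the V from your example:

# Sketch: collapse the sub space (a collective call, so do it on every rank)
V_0, _ = V.sub(0).collapse()
num_dofs_local_0 = V_0.dofmap.index_map.size_local * V_0.dofmap.index_map_bs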

Ah, OK, thank you. I had no idea that collapse was collective in this way. I did look at the documentation for clues - in general, since all the parallelism is hidden, it might be helpful if the docs stated which operations are meant to be called on all ranks. It simply didn’t occur to me that you couldn’t do this on a single process. I appreciate this use might be an edge case though (I was only using it for debugging).

Yes, exactly. Thank you again. It seems much more obvious in hindsight :slight_smile: