Hello everyone,
When I create a toy high-order Lagrange function space, I find that my code gets significantly slower as the number of MPI processes increases. The same thing does not happen for a low-order function space. Here is my script:
import numpy as np
from dolfinx import mesh, fem
from mpi4py import MPI
import time
comm = MPI.COMM_WORLD
L, W, H = 10, 10, 10
NX, NY, NZ = 100, 100, 100
points = [np.array([0, 0, 0]), np.array([L, W, H])]
start = time.perf_counter()
domain = mesh.create_box(
    comm,
    points,
    [NX, NY, NZ],
    cell_type=mesh.CellType.hexahedron,
    ghost_mode=mesh.GhostMode.shared_facet,
)
end_0 = time.perf_counter()
V = fem.functionspace(domain, ("Lagrange", 1))
end_1 = time.perf_counter()
owned_dof_num = V.dofmap.index_map.size_local
ghost_dof_num = V.dofmap.index_map.num_ghosts
starts = comm.gather(start, root=0)
end_0s = comm.gather(end_0, root=0)
end_1s = comm.gather(end_1, root=0)
owned_dof_nums = comm.gather(owned_dof_num, root=0)
ghost_dof_nums = comm.gather(ghost_dof_num, root=0)
if comm.rank == 0:
    print(
        f"average owned dofs {sum(owned_dof_nums) / comm.size} "
        f"average ghost dofs {sum(ghost_dof_nums) / comm.size} "
        f"average meshing time {(sum(end_0s) - sum(starts)) / comm.size} "
        f"average functionspace time {(sum(end_1s) - sum(end_0s)) / comm.size}"
    )
When the order is 1, the results are:
$ mpirun -np 1 python try.py
average owned dofs 1030301.0 average ghost dofs 0.0 average meshing time 3.988608295097947 average functionspace time 0.12935276422649622
$ mpirun -np 2 python try.py
average owned dofs 515150.5 average ghost dofs 15301.5 average meshing time 3.443682523444295 average functionspace time 0.07836870476603508
$ mpirun -np 4 python try.py
average owned dofs 257575.25 average ghost dofs 17590.5 average meshing time 1.7969944467768073 average functionspace time 0.04848852753639221
$ mpirun -np 8 python try.py
average owned dofs 128787.625 average ghost dofs 13583.125 average meshing time 0.9688909612596035 average functionspace time 0.03316149767488241
Everything looks fine. However, when the Lagrange order is set to 4 (I also coarsen the mesh to keep the number of dofs at the same order of magnitude and avoid memory-related effects):
NX, NY, NZ = 30, 30, 30
...
V = fem.functionspace(domain, ("Lagrange", 4))
The results are:
$ mpirun -np 1 python try.py
average owned dofs 1771561.0 average ghost dofs 0.0 average meshing time 0.07663028221577406 average functionspace time 0.1555685205385089
$ mpirun -np 2 python try.py
average owned dofs 885780.5 average ghost dofs 65884.5 average meshing time 0.07683719042688608 average functionspace time 5.366741458885372
$ mpirun -np 4 python try.py
average owned dofs 442890.25 average ghost dofs 68716.75 average meshing time 0.07153452932834625 average functionspace time 9.33089439664036
$ mpirun -np 8 python try.py
average owned dofs 221445.125 average ghost dofs 54979.875 average meshing time 0.061317757703363895 average functionspace time 14.544806653633714
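As a sanity check, the single-process totals in both runs match the closed-form dof count for a continuous degree-p Lagrange space on a hexahedral box mesh (standard tensor-product counting):

```python
# Closed-form dof count for a continuous degree-p Lagrange space on an
# NX x NY x NZ hexahedral box mesh: (p*NX + 1)(p*NY + 1)(p*NZ + 1).
def lagrange_dofs(nx, ny, nz, p):
    return (p * nx + 1) * (p * ny + 1) * (p * nz + 1)

print(lagrange_dofs(100, 100, 100, 1))  # 1030301, matches the np=1 order-1 run
print(lagrange_dofs(30, 30, 30, 4))     # 1771561, matches the np=1 order-4 run
```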
I think this is partly due to the increase in ghost dofs in the high-order function space. However, should the time overhead be this large?
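For what it's worth, from the averages above the ghost-to-owned ratio at 8 ranks only rises from roughly 10% to roughly 25%, while the function-space construction time grows by a few hundred times, so the ghost growth alone does not seem to account for the slowdown:

```python
# Ghost-to-owned dof ratios at np=8, taken from the averages printed above.
ratio_p1 = 13583.125 / 128787.625
ratio_p4 = 54979.875 / 221445.125
print(f"order 1: {ratio_p1:.1%}, order 4: {ratio_p4:.1%}")

# The function-space time grows far faster than the ghost count.
slowdown = 14.544806653633714 / 0.03316149767488241
print(f"order-4 vs order-1 functionspace time at np=8: {slowdown:.0f}x")
```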