I am trying to run the demo code for the incompressible Navier-Stokes equations from Bitbucket in parallel with MPI.
However, according to the timing summary, there is no improvement in computational time as the number of processors increases. It seems as if the code is simply executed several times independently. The timing summary is shown in the figure below.
FEniCS is installed via Docker and is the latest stable version. My computer has 8 cores in total. Did I misuse the MPI command, or do I need to add some extra code to parallelize the demo?
Could you please point me to some resources or tips that would help me solve this problem?
The reason the code does not speed up is that the problem is very small (1000 DOFs in the velocity space and 100 in the pressure space). Running code in parallel pays off for large problems, where the mesh is partitioned and distributed over several processes. For a problem as small as this, the communication overhead costs roughly as much time as the partitioning saves.
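To see just how small the problem is, and how it is split across processes, you can print the global DOF counts and each rank's local share. Here is a minimal sketch, assuming the legacy DOLFIN Python API (as shipped in recent FEniCS Docker images) and Taylor-Hood spaces; the mesh and space definitions below are illustrative, not copied from the demo itself:

from dolfin import *

# Build spaces comparable to the demo (Taylor-Hood assumed here)
mesh = UnitSquareMesh(16, 16)
V = VectorFunctionSpace(mesh, "P", 2)   # velocity
Q = FunctionSpace(mesh, "P", 1)         # pressure

# Global problem size (identical on every rank)
if MPI.rank(MPI.comm_world) == 0:
    print("velocity dofs:", V.dim(), "pressure dofs:", Q.dim())

# Dofs owned by this rank -- a rough measure of the local work
first, last = V.dofmap().ownership_range()
print("rank", MPI.rank(MPI.comm_world), "owns", last - first, "velocity dofs")

Run with mpirun and more processes, and you will see the per-rank share shrink accordingly.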
This can be illustrated by refining the mesh:
for i in range(2):
    mesh = refine(mesh)
which will yield the following output:
fenics@3d1c51f37d8c:/root/shared/navier-stokes$ time sudo mpirun -n 1 python3 demo_navier-stokes.py
15170 1937
real 0m21.718s
user 1m20.987s
sys 4m13.590s
fenics@3d1c51f37d8c:/root/shared/navier-stokes$ time sudo mpirun -n 2 python3 demo_navier-stokes.py
real 0m10.502s
user 0m20.699s
sys 0m1.968s
fenics@3d1c51f37d8c:/root/shared/navier-stokes$ time sudo mpirun -n 4 python3 demo_navier-stokes.py
real 0m7.383s
user 0m28.660s
sys 0m2.509s
As you can observe here, going from one to two processes gives a significant speedup. However, going from 2 to 4 processes, the runtime decreases further but is not halved: as the number of dofs per process shrinks, communication takes up a larger share of the total time.
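Plugging the wall-clock ("real") times above into the usual definitions of speedup (T1/Tn) and parallel efficiency (T1/(n*Tn)) makes this drop-off explicit; a quick sketch using the numbers from the runs above:

# "real" times in seconds, taken from the timings above
timings = {1: 21.718, 2: 10.502, 4: 7.383}
t1 = timings[1]
for n, t in sorted(timings.items()):
    print(f"n={n}: speedup={t1 / t:.2f}, efficiency={t1 / (n * t):.0%}")

The efficiency is close to 100% at two processes but drops to roughly 75% at four, which is exactly the communication overhead described above.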