Code not running in parallel

I am trying to run code in parallel, but it seems it is just running the same code 4 times. The output is

Hello I am process 0
Hello I am process 0
Hello I am process 0
Hello I am process 0

Please find an MWE below:

from dolfin import *
from mpi4py import MPI as mpi

# Rank of this process in the global communicator
comm = mpi.COMM_WORLD
ip = comm.Get_rank()
print("Hello I am process ", ip)

# Build a small mesh distributed over MPI.comm_world and a function space on it
mesh = UnitIntervalMesh(MPI.comm_world, 10)
V = VectorFunctionSpace(mesh, 'CG', 1)
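
A quick way to tell whether this is a real serial run or an MPI mismatch is to also print the communicator size; if every rank reports a size of 1, the mpirun used to launch the script and the MPI library that mpi4py/dolfin are linked against do not match. A minimal diagnostic sketch (not part of the original MWE):

# Extension of the MWE: print rank and size of the global communicator.
# Four lines of "rank 0 of 1" indicate an MPI implementation mismatch;
# ranks 0..3 "of 4" indicate a proper parallel run.
from mpi4py import MPI as mpi

comm = mpi.COMM_WORLD
print("Hello I am process", comm.Get_rank(), "of", comm.Get_size())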

I cannot reproduce this behavior using the docker image:

docker run -it -v $(pwd):/home/fenics/shared -w /home/fenics/shared --rm  quay.io/fenicsproject/dev:latest

which gives the output

fenics@ae1d6668589f:~/shared$ mpirun -n 4 python3 test_mpi.py 
Hello I am process  0
Calling FFC just-in-time (JIT) compiler, this may take some time.
Calling FFC just-in-time (JIT) compiler, this may take some time.
Hello I am process  1
Hello I am process  2
Hello I am process  3

Therefore, for anyone to help you, you need to supply more information about how you installed FEniCS, what system you are running on, etc. Follow the guidelines in: Read before posting: How do I get my question answered?
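
For example, something along these lines prints the basic version and platform information worth including (just a sketch; attribute names assumed for a standard dolfin/mpi4py installation):

# Print version/platform info to include in a question (diagnostic sketch).
import platform
import dolfin
import mpi4py

print("dolfin :", dolfin.__version__)
print("mpi4py :", mpi4py.__version__)
print("system :", platform.platform())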

I have installed FEniCS (dolfin version 2019.2.0.dev0) on Ubuntu using these commands:

sudo apt-get install software-properties-common
sudo add-apt-repository ppa:fenics-packages/fenics
sudo apt-get update
sudo apt-get install --no-install-recommends fenics

Try removing the --no-install-recommends flag when installing. I cannot reproduce your error with ubuntu:20.04:

docker run -it -v $(pwd):/home/fenics/shared -w /home/fenics/shared --rm  ubuntu:20.04
apt-get install software-properties-common
add-apt-repository ppa:fenics-packages/fenics
apt-get update
apt-get install fenics
root@62a587c0e67c:/home/fenics/shared# mpirun -n 4 --allow-run-as-root python3 test_mpi.py
Hello I am process  0
Hello I am process  1
Hello I am process  2
Hello I am process  3
Calling FFC just-in-time (JIT) compiler, this may take some time.
Calling FFC just-in-time (JIT) compiler, this may take some time.

What version of Ubuntu are you using?

I am using Ubuntu 18.04.4

Are you using mpich or openmpi on your local system?
See: parallel execution does not work (anymore?) - FEniCS Q&A
If I am not mistaken, fenics uses mpich by default.
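
One way to check which implementation mpi4py was actually built against is to print the MPI library version string, e.g. with this small sketch:

# The library version string names the implementation:
# it starts with "Open MPI" or "MPICH".
from mpi4py import MPI

print(MPI.Get_library_version())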

Did you try removing the --no-install-recommends flag?

If I am not mistaken, fenics uses mpich by default.

fenics itself runs with mpiexec (or mpirun) and builds with mpic++; the command names themselves are independent of the MPI implementation. That is, both OpenMPI and MPICH provide their own version of mpirun, and /usr/bin/mpirun is a symlink to the preferred version configured for the given system. fenics will build against whichever MPI implementation has been configured.

On default Debian (and Ubuntu) systems, openmpi is configured as the preferred MPI alternative.
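
You can check which mpirun is picked up on a given system, and which implementation the symlink resolves to, with something like the following sketch (paths will vary between distributions):

# Show the mpirun found on PATH and the real binary it resolves to
# (on Debian/Ubuntu the symlink goes through the alternatives system).
import os
import shutil

mpirun = shutil.which("mpirun")
print("mpirun on PATH:", mpirun)
if mpirun:
    print("resolves to   :", os.path.realpath(mpirun))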

If a local system administrator has switched the preferred MPI to mpich, that could cause problems, since the entire numerical library stack would then need to be recompiled against mpich. It's too much work to maintain library packages built against, and available for, both MPI implementations, so Debian doesn't do that (it has only been done for some low-level libraries like scalapack). The pkgconfig file dolfin.pc references the specific MPI implementation that dolfin was built against.
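
For instance, asking pkg-config for the dolfin flags and filtering for MPI entries shows which implementation dolfin was built against (a sketch, assuming dolfin.pc is on the pkg-config search path):

# Print the MPI-related compile/link flags recorded in dolfin.pc.
import subprocess

flags = subprocess.run(
    ["pkg-config", "--cflags", "--libs", "dolfin"],
    capture_output=True, text=True, check=True,
).stdout
print([f for f in flags.split() if "mpi" in f.lower()])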


That said, the dolfin build scripts (FindMPI.cmake) do have a bias towards mpich in the sense of searching for libmpich.so if libmpi.so can’t be found.

I had the same problem, but when running in parallel on a cluster. In my case the issue was an mpich version conflict: since I had to run through a Singularity container, mpich had to be loaded before the parallel command. Since you are running on a personal Linux computer, the preferred MPI implementation is OpenMPI, as dparsons mentioned, and it should work fine. My guess is that you either have version conflicts or installation problems.
On a different note, while I had the same issue, the performance was still improved. True, the communicators are spitting out nonsense, but I am sure the work was still being distributed and the performance was better. I also understand that it is a drag when it comes to monitoring how well your code is actually performing.