Multi-node job does not run in parallel

I am trying to run my job on multiple nodes, but when I look at the output it seems the job is not running in parallel.

This is my job submission script:

#!/bin/bash
#SBATCH --job-name=gather
#SBATCH -N 4                    # 4 nodes
#SBATCH --ntasks-per-node=1     # 1 MPI rank per node
#SBATCH --cpus-per-task=1
#SBATCH --exclusive
#SBATCH --switches=1            # place all nodes on a single switch
#SBATCH --time=14-00:00:00
#SBATCH --partition=normal

module load python-3.9.6-gcc-8.4.1-2yf35k6

echo "Job $SLURM_JOB_ID running on SLURM NODELIST: $SLURM_NODELIST"

mpirun -n 4 singularity exec /project/user/fenics/fenics.simg python3 ./run_test.py  

I am getting output like this:

Job 30456205 running on SLURM NODELIST: rome[067-070]
Rank 0: 8128311 vertices (local)
Rank 0: 8128311 vertices (local)
Rank 0: 8128311 vertices (local)
Rank 0: 8128311 vertices (local)
Solving linear variational problem.
Solving linear variational problem.
Solving linear variational problem.
Solving linear variational problem.

It seems like it is not distributing the mesh across the nodes: every process reports rank 0, so each copy appears to be running as an independent serial job rather than as one rank of a shared MPI communicator. If I instead launch MPI inside the container, with

singularity exec -e /project/user/fenics/fenics.simg mpirun -n 4 python3 ./run_test.py > output_test

then the mesh does get distributed, but the job runs on only one node. The other nodes, although shown as allocated, remain idle.
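
To separate the launch problem from FEniCS itself, a minimal rank check can be swapped in for run_test.py. This is just a sketch (rank_check.py is a hypothetical name, and it assumes mpi4py is installed inside the fenics.simg image):

# rank_check.py -- minimal MPI sanity check (hypothetical helper;
# assumes mpi4py is available inside the container image)
from mpi4py import MPI

comm = MPI.COMM_WORLD
# A working multi-node launch should print four distinct ranks on four
# different hosts; a broken one prints "rank 0 of 1" from every copy.
print(f"rank {comm.Get_rank()} of {comm.Get_size()} on {MPI.Get_processor_name()}")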