SLEPc solver running in parallel

If you’re not getting the expected parallel speed-up, the BLAS configuration can sometimes be the cause. There are a few BLAS questions to consider:

  1. The default BLAS implementation is generic, not tuned to any particular processor, and therefore relatively slow. Specialized implementations such as OpenBLAS give better performance (this applies to LAPACK as well).
  2. For BLAS parallelization you want a BLAS configuration that uses threads (pthreads or OpenMP), cf. Multicore and Ubuntu - #8 by dparsons. This particularly applies when running as a single-process job (see the next point).
  3. DOLFIN (and PETSc/SLEPc) are MPI software. MPI is a different kind of parallelization from threading, and it’s hard to make the two techniques cooperate effectively at the same time. If you’re running as an MPI job, you might get better performance by disabling BLAS threading, i.e. by setting the environment variable OMP_NUM_THREADS=1, cf. Multicore and Ubuntu - #7 by dparsons.
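
As a rough illustration of point 3, here is a minimal Python sketch that pins BLAS to one thread per process when the script appears to be running under MPI with more than one rank. The helper names (`mpi_world_size`, `limit_blas_threads`) are made up for this example. It assumes the launcher exports `OMPI_COMM_WORLD_SIZE` (Open MPI) or `PMI_SIZE` (MPICH/Hydra), and that setting `OPENBLAS_NUM_THREADS` alongside `OMP_NUM_THREADS` is useful for an OpenBLAS build; other launchers or BLAS libraries may need different variables.

```python
import os

# Assumption: these variables are exported by the MPI launcher and hold the
# number of ranks (Open MPI's mpirun and MPICH's Hydra, respectively).
MPI_SIZE_VARS = ("OMPI_COMM_WORLD_SIZE", "PMI_SIZE")


def mpi_world_size(env):
    """Return the number of MPI ranks guessed from the environment, or 1."""
    for name in MPI_SIZE_VARS:
        value = env.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return 1


def limit_blas_threads(env=os.environ):
    """Default BLAS threading to one thread per process under multi-rank MPI.

    An explicit setting made by the user is left untouched (setdefault).
    Returns the resulting OMP_NUM_THREADS value, or None if it is unset.
    """
    if mpi_world_size(env) > 1:
        env.setdefault("OMP_NUM_THREADS", "1")
        env.setdefault("OPENBLAS_NUM_THREADS", "1")
    return env.get("OMP_NUM_THREADS")


# Call this before importing dolfin, petsc4py or numpy, since BLAS libraries
# typically read these variables when they are first loaded.
limit_blas_threads()
```

In practice it is often simpler to set the variable in the shell when launching the job; the sketch is only meant for cases where the same script is run both as a single process (threads wanted) and under MPI (threads not wanted).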