Encountered PETSc error while running with MPI

Karl_Banana · May 23, 2021, 9:50am

Hi, I encountered the following error while running multilevel Monte Carlo sampling with the DOLFIN C++ interface. Each MPI process runs its own single-process PETScKrylovSolver using MUMPS and repeatedly solves for samples with different coefficients.

Occasionally, however, I hit the error below. I have not been able to reproduce it: when I run the solver alone on one process for the particular coefficient that failed, I get a good solution.

I’m clueless now and not even sure where to look for more information about what is going on. I tried printing the KSP solver information, but everything seems fine before the error. Any recommendations on how to debug this are appreciated.

Process 0: *** Warning: Verbose output for PETScKrylovSolver not implemented, calling PETSc KSPView directly.
KSP Object: 1 MPI processes
  type: preonly
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using NONE norm type for convergence test
PC Object: 1 MPI processes
  type: cholesky
    out-of-place factorization
    tolerance for zero pivot 2.22045e-14
    matrix ordering: natural
    factor fill ratio given 0., needed 0.
      Factored matrix follows:
        Mat Object: 1 MPI processes
          type: mumps
          rows=240, cols=240
          package used to perform factorization: mumps
          total: nonzeros=3765, allocated nonzeros=3765
          total number of mallocs used during MatSetValues calls =0
            MUMPS run parameters:
              SYM (matrix type):                   2 
              PAR (host participation):            1 
              ICNTL(1) (output for error):         6 
              ICNTL(2) (output of diagnostic msg): 0 
              ICNTL(3) (output for global info):   0 
              ICNTL(4) (level of printing):        0 
              ICNTL(5) (input mat struct):         0 
              ICNTL(6) (matrix prescaling):        7 
              ICNTL(7) (sequential matrix ordering):2 
              ICNTL(8) (scaling strategy):        77 
              ICNTL(10) (max num of refinements):  0 
              ICNTL(11) (error analysis):          0 
              ICNTL(12) (efficiency control):                         0 
              ICNTL(13) (efficiency control):                         1 
              ICNTL(14) (percentage of estimated workspace increase): 20 
              ICNTL(18) (input mat struct):                           0 
              ICNTL(19) (Schur complement info):                      0 
              ICNTL(20) (rhs sparse pattern):                         0 
              ICNTL(21) (solution struct):                            0 
              ICNTL(22) (in-core/out-of-core facility):               0 
              ICNTL(23) (max size of memory can be allocated locally):0 
              ICNTL(24) (detection of null pivot rows):               1 
              ICNTL(25) (computation of a null space basis):          0 
              ICNTL(26) (Schur options for rhs or solution):          0 
              ICNTL(27) (experimental parameter):                     -32 
              ICNTL(28) (use parallel or sequential ordering):        1 
              ICNTL(29) (parallel ordering):                          0 
              ICNTL(30) (user-specified set of entries in inv(A)):    0 
              ICNTL(31) (factors is discarded in the solve phase):    0 
              ICNTL(33) (compute determinant):                        0 
              ICNTL(35) (activate BLR based factorization):           0 
              CNTL(1) (relative pivoting threshold):      0.01 
              CNTL(2) (stopping criterion of refinement): 1.49012e-08 
              CNTL(3) (absolute pivoting threshold):      1e-08 
              CNTL(4) (value of static pivoting):         -1. 
              CNTL(5) (fixation for null pivots):         0. 
              CNTL(7) (dropping parameter for BLR):       0. 
              RINFO(1) (local estimated flops for the elimination after analysis): 
                [0] 72793. 
terminate called after throwing an instance of 'std::runtime_error'
  what():  

*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
***     fenics-support@googlegroups.com
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error:   Unable to solve linear system using PETSc Krylov solver.
*** Reason:  Solution failed to converge in 0 iterations (PETSc reason DIVERGED_PC_FAILED, residual norm ||r|| = 0.000000e+00).
*** Where:   This error was encountered inside PETScKrylovSolver.cpp.
*** Process: 1
*** 
*** DOLFIN version: 2019.1.0
*** Git changeset:  74d7efe1e84d65e9433fd96c50f1d278fa3e3f3f
*** -------------------------------------------------------------------------

dokken · May 23, 2021, 8:25pm

I think this could be related to any of the suggestions in this post:
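
Separately, since the failure does not reproduce in isolation, it may help to record which rank, sample index, and coefficient were active when the solve threw, instead of letting the `std::runtime_error` terminate the process. Below is a minimal sketch in plain C++, not DOLFIN code. `FailedSample`, `run_samples`, and the `solve` callback are hypothetical names for illustration, and the coefficient is assumed to be a single `double`; in the real code the callback would wrap the PETScKrylovSolver call.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Record of one failed sample, with enough information to replay it later.
// (Hypothetical type; assumes the coefficient is a single double.)
struct FailedSample
{
  int rank;            // MPI rank that ran the sample
  std::size_t index;   // position of the sample in this rank's sequence
  double coefficient;  // coefficient value being solved for
  std::string message; // what() of the exception that was thrown
};

// Call `solve` once per coefficient. A std::runtime_error (the exception type
// reported in the log above) is caught and recorded, so the remaining samples
// on this rank still run.
std::vector<FailedSample>
run_samples(int rank, const std::vector<double>& coefficients,
            const std::function<void(double)>& solve)
{
  std::vector<FailedSample> failures;
  for (std::size_t i = 0; i < coefficients.size(); ++i)
  {
    try
    {
      solve(coefficients[i]);
    }
    catch (const std::runtime_error& e)
    {
      failures.push_back({rank, i, coefficients[i], e.what()});
      // Print with 17 significant digits so the double can be replayed exactly.
      std::cerr << "rank " << rank << ": sample " << i << " failed, coefficient = "
                << std::setprecision(17) << coefficients[i] << "\n";
    }
  }
  return failures;
}
```

The sample index is recorded as well as the coefficient because a failure that does not reproduce with the coefficient alone may depend on what the process did in earlier samples. Replaying the same sequence on the same rank up to that index is one way to check this.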
