Hi, I encountered the following error while running multilevel Monte Carlo sampling with the DOLFIN C++ interface. Each MPI process runs its own single-process PETScKrylovSolver using MUMPS and repeatedly solves with different sampled coefficients.
Occasionally, however, a solve fails with the error below. I have not been able to reproduce it: when I run the solver on its own in a single process with the particular coefficient that failed, I get a good solution.
I'm at a loss, and not even sure where to look for more information about what is going on. I printed the KSP solver information, but everything seems fine right up to the error. Any recommendations on how to debug this would be appreciated.
Process 0: *** Warning: Verbose output for PETScKrylovSolver not implemented, calling PETSc KSPView directly.
KSP Object: 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: 1 MPI processes
type: cholesky
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0., needed 0.
Factored matrix follows:
Mat Object: 1 MPI processes
type: mumps
rows=240, cols=240
package used to perform factorization: mumps
total: nonzeros=3765, allocated nonzeros=3765
total number of mallocs used during MatSetValues calls =0
MUMPS run parameters:
SYM (matrix type): 2
PAR (host participation): 1
ICNTL(1) (output for error): 6
ICNTL(2) (output of diagnostic msg): 0
ICNTL(3) (output for global info): 0
ICNTL(4) (level of printing): 0
ICNTL(5) (input mat struct): 0
ICNTL(6) (matrix prescaling): 7
ICNTL(7) (sequential matrix ordering):2
ICNTL(8) (scaling strategy): 77
ICNTL(10) (max num of refinements): 0
ICNTL(11) (error analysis): 0
ICNTL(12) (efficiency control): 0
ICNTL(13) (efficiency control): 1
ICNTL(14) (percentage of estimated workspace increase): 20
ICNTL(18) (input mat struct): 0
ICNTL(19) (Schur complement info): 0
ICNTL(20) (rhs sparse pattern): 0
ICNTL(21) (solution struct): 0
ICNTL(22) (in-core/out-of-core facility): 0
ICNTL(23) (max size of memory can be allocated locally):0
ICNTL(24) (detection of null pivot rows): 1
ICNTL(25) (computation of a null space basis): 0
ICNTL(26) (Schur options for rhs or solution): 0
ICNTL(27) (experimental parameter): -32
ICNTL(28) (use parallel or sequential ordering): 1
ICNTL(29) (parallel ordering): 0
ICNTL(30) (user-specified set of entries in inv(A)): 0
ICNTL(31) (factors is discarded in the solve phase): 0
ICNTL(33) (compute determinant): 0
ICNTL(35) (activate BLR based factorization): 0
CNTL(1) (relative pivoting threshold): 0.01
CNTL(2) (stopping criterion of refinement): 1.49012e-08
CNTL(3) (absolute pivoting threshold): 1e-08
CNTL(4) (value of static pivoting): -1.
CNTL(5) (fixation for null pivots): 0.
CNTL(7) (dropping parameter for BLR): 0.
RINFO(1) (local estimated flops for the elimination after analysis):
[0] 72793.
terminate called after throwing an instance of 'std::runtime_error'
what():
*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
*** fenics-support@googlegroups.com
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error: Unable to solve linear system using PETSc Krylov solver.
*** Reason: Solution failed to converge in 0 iterations (PETSc reason DIVERGED_PC_FAILED, residual norm ||r|| = 0.000000e+00).
*** Where: This error was encountered inside PETScKrylovSolver.cpp.
*** Process: 1
***
*** DOLFIN version: 2019.1.0
*** Git changeset: 74d7efe1e84d65e9433fd96c50f1d278fa3e3f3f
*** -------------------------------------------------------------------------
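
Since the log shows the failure surfacing as an uncaught `std::runtime_error`, one possible way to gather more information is to catch it per sample instead of letting the process terminate. The following is only a minimal sketch of that idea, not my actual code: `run_samples`, `FailedSample`, and the `solve_sample` callback are hypothetical names, and `solve_sample` stands in for whatever assembles and solves the system for sample `i`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Record of one failed sample, kept so the case can be inspected or replayed.
struct FailedSample {
  std::size_t index;    // position of the sample in the loop
  std::string message;  // what() text of the caught exception
};

// Runs n_samples solves and collects failures instead of terminating.
// solve_sample is a hypothetical stand-in for the real assemble-and-solve
// step; it is assumed to throw std::runtime_error on failure, as in the log.
std::vector<FailedSample> run_samples(
    std::size_t n_samples,
    const std::function<void(std::size_t)>& solve_sample) {
  assert(solve_sample && "solve_sample must be a callable");
  std::vector<FailedSample> failures;
  for (std::size_t i = 0; i < n_samples; ++i) {
    try {
      solve_sample(i);
    } catch (const std::runtime_error& e) {
      // Keep going, but remember which sample failed and why.
      failures.push_back({i, e.what()});
    }
  }
  return failures;
}
```

Inside the `catch` branch one could additionally write out the coefficient, matrix, and right-hand side for the failing sample, so that exactly the same system can be solved again in isolation.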