Robustness issue: two identical runs behave differently

Hi all,

I’ve run into a situation where running the same code twice produces different results. Here are some browser-based logs:
At the top of those logs you can also download the detailed Dolfinx log files.

Upon inspection of the Dolfinx logs, you’ll find that the first few timesteps are identical across the two runs: they produce exactly the same residuals throughout their Newton iterations. Then, at timestamp 2024-04-19 11:40:03.470, the residual suddenly jumps to -nan in one of the two runs.

Is this familiar behavior to someone?

A little about the set-up:

  • Running in a podman container
  • Dolfinx version 0.7.3
  • MPI with 4 cores
  • Using dolfinx_mpc
  • Reading an initial condition from an earlier run via adios4dolfinx (although the problem only occurs after a handful of steps)
  • GMRES solver with some dedicated preconditioning (also tried CG; same issue)

I don’t think I can produce an MWE, since the behavior is inconsistent and the source of the problem is exactly what I’m trying to figure out. I could make the repository public shortly, though.

I know I’m not giving you much to go on, but I’m hoping someone can give me some general tips on how to figure out what might be causing this regardless.
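For what it’s worth, one general mechanism I’m aware of that can make two MPI runs diverge is that parallel reductions don’t guarantee a fixed summation order, while floating-point addition is not associative. A pure-Python sketch of the effect (no MPI involved; the numbers are contrived for illustration):

```python
# Floating-point addition is not associative, so the order in which
# per-rank contributions are combined can change the result.
vals = [1e16, 1.0, -1e16, 1.0]

left_to_right = sum(vals)                             # 1e16 absorbs a 1.0 term before it cancels
pairwise = (vals[0] + vals[2]) + (vals[1] + vals[3])  # cancellation happens first, both 1.0 terms survive

print(left_to_right)  # -> 1.0
print(pairwise)       # -> 2.0
```

That said, since the first timesteps here match bitwise, the reduction order itself seems reproducible, so this alone may not explain the jump to -nan.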


Without an MWE it will be very difficult to help. Some things to check:

  • GMRES may be initialised with a random initial guess.
  • Your preconditioner matters: which one are you employing?
  • Do you still get inconsistencies with a direct solver? E.g. MUMPS.
  • Are you sure you’re scattering forward any shared values when necessary? E.g. when you read in your initial condition?
  • Is your partitioner distributing cells consistently?
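For the direct-solver check, a minimal sketch of how one could switch to MUMPS through PETSc options (a fragment, not a full program: `ksp` stands for the PETSc KSP handle of your solver, e.g. obtained from the Newton solver in dolfinx):

```python
from petsc4py import PETSc

# `ksp` is assumed to be your existing PETSc KSP handle.
opts = PETSc.Options()
prefix = ksp.getOptionsPrefix() or ""
opts[f"{prefix}ksp_type"] = "preonly"                 # no Krylov iterations, just apply the PC
opts[f"{prefix}pc_type"] = "lu"                       # direct LU factorisation
opts[f"{prefix}pc_factor_mat_solver_type"] = "mumps"  # use MUMPS as the factorisation backend
ksp.setFromOptions()
```

If the inconsistency disappears with a direct solver, that points towards the iterative solver or preconditioner rather than the assembly.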

Thanks for the suggestions. Those are some good ideas. I’d like to dig into the details, but am not quite sure how to test some of the points you are suggesting.

  • How do I check whether GMRES is starting with a random guess?
  • I’ll check out some direct solvers.
  • How do I check cell partitioning?
  • For reading the initial condition, I simply use adios4dolfinx’s functionality:
import adios4dolfinx as a4dx

# W being the mixed solution space
_, WV_map = W.sub(0).collapse()  # DOF map from W to V
a4dx.read_function(u_output, initial_condition, "BP4")  # Only the state of u matters
u_prev.x.array[WV_map] = u_output.x.array[:]  # Move to solution vector in W

Should that take care of scattering shared values?
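In case it doesn’t, my understanding is that the manual assignment into `u_prev.x.array` only writes the entries addressed on each rank, so an explicit forward scatter afterwards would make the ghost values consistent. A sketch, assuming dolfinx’s `scatter_forward` on the underlying vector and the same names as above:

```python
# After copying the values into the mixed-space vector, update ghosts:
u_prev.x.array[WV_map] = u_output.x.array[:]
u_prev.x.scatter_forward()  # push owned values to ghost entries on neighbouring ranks
```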

  • For preconditioning:
# W being the mixed solution space
_, WQ_map = W.sub(1).collapse() # DOF map from W to Q
_, WV_map = W.sub(0).collapse() # DOF map from W to V

# Set up the linear solver
ksp = newton_solver.krylov_solver # Handle for PETSc interaction
pc = ksp.getPC() # Preconditioning object
pc.setType("fieldsplit") # Different preconditioning for different blocks

# Split fields for applying different preconditioners
ISu = PETSc.IS().createGeneral(WV_map, domain.comm)
ISp = PETSc.IS().createGeneral(WQ_map, domain.comm)

# Set preconditioner options
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_pc_side"] = "right" # Use right preconditioning: ( K M^-1 ) M x = b. This preserves the residual.
opts[f"{option_prefix}pc_fieldsplit_type"] = "schur" # Based on schur complement of K: possible since only two blocks
opts[f"{option_prefix}pc_fieldsplit_schur_fact_type"] = "upper" # Different flavors: diag, lower, upper or full
opts[f"{option_prefix}pc_fieldsplit_schur_precondition"] = "selfp" # Build the Schur-block preconditioner from A10 diag(A00)^-1 A01
opts[f"{option_prefix}fieldsplit_u_ksp_type"] = "preonly" # Apply the preconditioner once; no inner Krylov solve
opts[f"{option_prefix}fieldsplit_u_pc_type"] = "jacobi" # Jacobi (diagonals) preconditioning for u-block
opts[f"{option_prefix}fieldsplit_p_ksp_type"] = "preonly" # Apply the preconditioner once; no inner Krylov solve
opts[f"{option_prefix}fieldsplit_p_pc_type"] = "hypre" # Hypre (algebraic multigrid) preconditioning for p-block
opts[f"{option_prefix}fieldsplit_p_pc_hypre_type"] = "boomeramg" # Choice of implementation
opts[f"{option_prefix}fieldsplit_p_pc_hypre_boomeramg_coarsen_type"] = "pmis" # Grid coarsening strategy
opts[f"{option_prefix}fieldsplit_p_pc_hypre_boomeramg_interp_type"] = "FF1" # Projection on to coarser grid
opts[f"{option_prefix}fieldsplit_p_pc_hypre_boomeramg_strong_threshold"] = "0.5" # Coarsening limit
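For completeness, the index sets above only take effect once they are attached to the fieldsplit PC under the names the options refer to ("u" and "p"), and the options have to be pulled in by the KSP. The relevant calls in my setup are roughly:

```python
# Attach the splits under the names used in the options above,
# then let the KSP pick up the option database entries.
pc.setFieldSplitIS(("u", ISu), ("p", ISp))
ksp.setFromOptions()
```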