Incompressible (Steady) Navier-Stokes equations with FEniCS - Preconditioners

I have a (rather general) question regarding the solution of the incompressible, steady state Navier Stokes equations with FEniCS.
Just to be on the same page, when I talk about these equations I am referring to

-\mu \Delta u + \rho (u \cdot \nabla) u + \nabla p = f, \\ \nabla \cdot u = 0,

which is solved for (u,p) (velocity and pressure).

What I am looking for is advice or general tips for solving three-dimensional problems. There, I have faced several difficulties when solving the corresponding linearized systems (arising either from Newton’s method or from an Oseen iteration). Basically, I would like to have a (somewhat) scalable solver for these equations.
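Just for clarity, by an Oseen iteration I mean the usual Picard-type fixed-point iteration where the convecting velocity is frozen at the previous iterate u_k, i.e. in every step one solves

-\mu \Delta u_{k+1} + \rho (u_k \cdot \nabla) u_{k+1} + \nabla p_{k+1} = f, \\ \nabla \cdot u_{k+1} = 0,

so that each step again requires the solution of a linear saddle-point system of Oseen type.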

I have already consulted quite a number of books, including e.g. “Elman, Silvester, Wathen - Finite Elements and Fast Iterative Solvers: with Applications in Incompressible Fluid Dynamics” and “John - Finite Element Methods for Incompressible Flow Problems”, so I am quite familiar with the topic.

What I am struggling with is the saddle-point nature of the problem together with the convection term. Let me explain. For the discretization, I either employ LBB-stable Taylor-Hood elements or stabilized equal-order elements (P1-P1). I then have issues solving the resulting systems iteratively. As the system is nonsymmetric, I employ a restarted GMRES method. Without preconditioning, the convergence of course deteriorates, and, as usual, a good preconditioner is required to solve the (linearized) systems efficiently.
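Just to make the setting concrete, here is a minimal sketch of the discretization I have in mind (legacy FEniCS/DOLFIN syntax; the geometry, material parameters and boundary conditions are only placeholders):

from dolfin import *

# Placeholder geometry; the real problems are complex 3D domains
mesh = UnitCubeMesh(16, 16, 16)

# Taylor-Hood (P2-P1) mixed element
V_el = VectorElement("Lagrange", mesh.ufl_cell(), 2)
Q_el = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
W = FunctionSpace(mesh, MixedElement([V_el, Q_el]))

w = Function(W)
u, p = split(w)
v, q = TestFunctions(W)

mu = Constant(1.0e-2)            # viscosity (placeholder value)
rho = Constant(1.0)              # density (placeholder value)
f = Constant((0.0, 0.0, 0.0))    # body force

# Steady incompressible Navier-Stokes residual
F = (
    mu * inner(grad(u), grad(v)) * dx
    + rho * inner(dot(grad(u), u), v) * dx
    - p * div(v) * dx
    - q * div(u) * dx
    - inner(f, v) * dx
)

bcs = []  # Dirichlet boundary conditions omitted in this sketch

# Newton's method on the coupled system; the linear systems arising here
# are the ones I am trying to precondition
solve(F == 0, w, bcs, J=derivative(F, w))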

So far, I have been trying to use PETSc’s fieldsplit preconditioner to derive preconditioners based on the Schur complement of the problem. There is quite a nice discussion of this in the Firedrake documentation, which I will link here: Preconditioning saddle-point systems (Firedrake documentation).
With this, I have ended up using the following preconditioner options in PETSc:

ksp_options = {
    # outer flexible GMRES on the full coupled system
    "ksp_type": "fgmres",
    "ksp_pc_side": "right",
    "ksp_gmres_restart": 100,
    "ksp_gmres_cgs_refinement_type": "refine_ifneeded",
    "ksp_max_it": 1000,
    "ksp_rtol": 1e-10,
    "ksp_atol": 1e-30,
    # Schur-complement fieldsplit with full block factorization,
    # Schur complement approximated via "selfp"
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",
    "pc_fieldsplit_schur_precondition": "selfp",
    # velocity block: a single BoomerAMG application
    "fieldsplit_0_ksp_type": "preonly",
    "fieldsplit_0_pc_type": "hypre",
    "fieldsplit_0_pc_hypre_type": "boomeramg",
    "fieldsplit_0_pc_hypre_boomeramg_relax_weight_all": 0.0,
    # pressure (Schur complement) block: a single BoomerAMG application
    "fieldsplit_1_ksp_type": "preonly",
    "fieldsplit_1_pc_type": "hypre",
    "fieldsplit_1_pc_hypre_type": "boomeramg",
    "fieldsplit_1_pc_hypre_boomeramg_relax_weight_all": 0.0,
}
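For completeness, this is roughly how I attach these options to the assembled mixed system (a sketch using legacy DOLFIN with petsc4py; W, A and b denote the mixed space, the assembled system matrix and the right-hand side, and the assembly itself is omitted):

from petsc4py import PETSc

# push the options into PETSc's global options database
opts = PETSc.Options()
for key, val in ksp_options.items():
    opts[key] = val

ksp = PETSc.KSP().create(mesh.mpi_comm())
ksp.setOperators(A.mat())    # A: dolfin.PETScMatrix of the linearized system
ksp.setFromOptions()

# tell fieldsplit which dofs belong to velocity (field 0) and pressure (field 1);
# in parallel, every rank passes the dofs it owns
pc = ksp.getPC()
is_u = PETSc.IS().createGeneral(W.sub(0).dofmap().dofs(), comm=mesh.mpi_comm())
is_p = PETSc.IS().createGeneral(W.sub(1).dofmap().dofs(), comm=mesh.mpi_comm())
pc.setFieldSplitIS(("0", is_u), ("1", is_p))

x = A.mat().createVecRight()
ksp.solve(b.vec(), x)        # b: dolfin.PETScVector of the right-hand side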

For the linear Stokes system, using these options is totally fine and works well (as does preconditioning with the pressure mass matrix). However, once I consider the Navier-Stokes equations, the convective term leads to a deterioration of the performance. I assume that this comes from the “selfp” preconditioning option for the Schur complement, where B\, \text{diag}(A)^{-1} B^T is used to approximate the Schur complement operator, and, of course, \text{diag}(A)^{-1} is a bad approximation of A^{-1} when the convective term becomes large. Moreover, the approach shows a clear mesh dependence: when the problem size roughly doubles, the number of iterations of the outer “fgmres” solver also doubles. I would hope to find an approach which scales better.
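Just to make explicit what the fieldsplit options are doing (as far as I understand the PETSc documentation): writing the linearized system with velocity block A and divergence block B, the “full” Schur factorization used above is

K = \begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix} = \begin{pmatrix} I & 0 \\ B A^{-1} & I \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} I & A^{-1} B^T \\ 0 & I \end{pmatrix}, \qquad S = -B A^{-1} B^T,

and “selfp” replaces the exact Schur complement S by S_p = -B\, \text{diag}(A)^{-1} B^T, which is the matrix BoomerAMG is then applied to. Once A is dominated by convection, \text{diag}(A)^{-1} carries little information about A^{-1}, which is consistent with the degradation I observe.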

Does anyone have some hints or strategies that could be used to improve the linear solver’s convergence and to make it somewhat scalable? And also, are there any guidelines, papers, or rules of thumb on how the tolerances of the inner solvers for the Schur complement could be chosen? At the moment, I mostly employ the above approach of only applying the preconditioners for “fieldsplit_0” and “fieldsplit_1”, but, of course, iterative solvers could be used there, too.
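To make the last point concrete, one option I have in mind is to replace the “preonly” inner solves by a few GMRES iterations with a loose relative tolerance, keeping the flexible FGMRES on the outside (the numbers below are only placeholders, not a recommendation):

inner_solver_options = {
    "fieldsplit_0_ksp_type": "gmres",
    "fieldsplit_0_ksp_rtol": 1e-2,
    "fieldsplit_0_ksp_max_it": 20,
    "fieldsplit_1_ksp_type": "gmres",
    "fieldsplit_1_ksp_rtol": 1e-2,
    "fieldsplit_1_ksp_max_it": 20,
}
ksp_options.update(inner_solver_options)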

Here are some remarks on what else I have tried for solving the equations (and the reasons why these approaches do not work so well for me).

  • Using the LSC preconditioner in PETSc - From the literature, this should work and scale well, but I have only observed subpar performance (the options I tried are sketched after this list). There were some changes to LSC with PETSc 3.20, but I did not find that they improved on any of the issues I was having.
  • Using a pressure-convection-diffusion type preconditioner (Elman, Silvester, Wathen) - I am aware of the package fenapack, which implements these preconditioners. However, I had trouble reproducing their scalability for my test problems. Moreover, the approach is not purely algebraic, which I would like to avoid.
  • Using other multigrid packages. So far I have relied on HYPRE’s BoomerAMG, which gave me really great results for the Poisson and convection-diffusion equations (there, the preconditioners scaled perfectly and gave textbook performance). I found PETSc’s “gamg” algebraic multigrid to be significantly slower.
  • Using pseudo time-stepping to compute the steady-state limit. The time discretization makes the system better behaved due to the presence of a velocity mass matrix (after discretization). However, to make use of this, the time steps I could use were restrictively small. I mainly used the implicit Euler method for this due to its stability.
  • Using decoupled approaches for the pseudo time stepping (mainly pressure correction schemes). These suffered from the fact that they are not stable for \Delta t \to \infty, so that again a time step restriction applied, rendering the methods ineffective.
  • So far, the best alternative I came up with is using a pseudo-compressibility approach with pseudo time stepping for the steady state. This replaces the continuity equation with a term of the form \varepsilon \partial_t p + \nabla \cdot u = 0, so that in the steady-state limit the incompressible solution is recovered (see e.g. https://doi.org/10.1137/140975231). This approach can be used to decouple velocity and pressure; however, a term similar to a grad-div stabilization is then added to the momentum equation, which scales with \frac{1}{\varepsilon} and makes the iterative solution of this system hard for \varepsilon \to 0 (see the short derivation after this list). Hence, either the time step cannot be chosen arbitrarily large, or \varepsilon has to scale with \Delta t, so that a large time step forces a large \varepsilon and, in turn, many pseudo-time iterations until the incompressible steady state is reached.
  • Finally, I have noticed that stabilized formulations (using e.g. SUPG or PSPG) seem to work significantly better with the AMG preconditioners and improve their performance. However, they do not automatically make the solvers scalable.
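Regarding the LSC attempt from the first bullet point above, this is roughly the set of options I tried (a sketch with the same fieldsplit layout as before; the inner preconditioner choices are placeholders):

lsc_options = {
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",
    # hand the (unassembled) Schur complement itself to the field 1 PC ...
    "pc_fieldsplit_schur_precondition": "self",
    "fieldsplit_0_ksp_type": "preonly",
    "fieldsplit_0_pc_type": "hypre",
    "fieldsplit_0_pc_hypre_type": "boomeramg",
    # ... and approximate its inverse with least-squares commutators
    "fieldsplit_1_ksp_type": "preonly",
    "fieldsplit_1_pc_type": "lsc",
    "fieldsplit_1_lsc_pc_type": "hypre",
    "fieldsplit_1_lsc_pc_hypre_type": "boomeramg",
}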
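And to clarify where the grad-div-like term in the pseudo-compressibility bullet point comes from: discretizing \varepsilon \partial_t p + \nabla \cdot u = 0 with the implicit Euler method gives

\varepsilon \frac{p^{n+1} - p^n}{\Delta t} + \nabla \cdot u^{n+1} = 0 \quad \Longleftrightarrow \quad p^{n+1} = p^n - \frac{\Delta t}{\varepsilon} \nabla \cdot u^{n+1},

so eliminating p^{n+1} from the momentum equation leaves the known \nabla p^n on the right-hand side and adds -\frac{\Delta t}{\varepsilon} \nabla (\nabla \cdot u^{n+1}) to the operator acting on u^{n+1}, i.e. a grad-div-type term whose coefficient blows up for \varepsilon \to 0 (or, for fixed \varepsilon, grows with \Delta t).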

I know that solving the Navier-Stokes equations is hard, but I still assume that there are some tricks / methods that I might have missed and which could perform significantly better. So I am thankful for any input on this problem and grateful for any hints on how you solve these equations for complex 3D problems with FEniCS.

Thanks a lot in advance,
Sebastian