How to choose the optimal solver for a PDE problem?

Hi everyone,
I’m bringing up a challenging but pragmatic question: how do you choose the best solver for a given problem? I’m asking at least within the FEniCS framework and its default linear algebra backend, PETSc.

As a beginner, I’m asking this because I often spend a lot of time finding the optimal solver for my problems (given the mesh, the set of PDEs, the parameters, the discretization, and so on) through trial and error. What I learned from basic theory (like this book) is the general rule of thumb:

  • direct solvers (e.g. lu, mumps) → problems on small/medium 2D domains
  • iterative solvers (e.g. gmres) → problems on medium/large 2D and 3D domains

So, given the size and dimension of the problem, I simply try all the available solvers of the appropriate type (direct/iterative) and select the fastest among those that converge.
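
To make the approach concrete, a minimal sketch of such a loop might look like the following, assuming legacy FEniCS (dolfin) with the PETSc backend; the Poisson problem and the candidate solver list are only placeholders for your own setup.

```python
# Minimal trial-and-error sketch, assuming legacy FEniCS (dolfin) + PETSc.
# The Poisson problem and the candidate list below are placeholders.
import time
from dolfin import *

mesh = UnitSquareMesh(64, 64)
V = FunctionSpace(mesh, "Lagrange", 1)
u_trial, v = TrialFunction(V), TestFunction(V)
a = inner(grad(u_trial), grad(v)) * dx
L = Constant(1.0) * v * dx
bc = DirichletBC(V, Constant(0.0), "on_boundary")

# Print what the installed backend actually offers.
list_linear_solver_methods()
list_krylov_solver_preconditioners()

timings = {}
for method in ["lu", "mumps", "cg", "gmres"]:
    params = {"linear_solver": method}
    if method in ("cg", "gmres"):
        params["preconditioner"] = "amg"  # algebraic multigrid, if available
    u = Function(V)
    t0 = time.perf_counter()
    try:
        solve(a == L, u, bc, solver_parameters=params)
        timings[method] = time.perf_counter() - t0
    except RuntimeError:
        timings[method] = None  # did not converge, or solver not installed
print(timings)
```

Which candidate wins obviously depends on the mesh size, the dimension, and which backends your PETSc was built with.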

This approach also seems to be supported by Matthew Knepley (a PETSc developer), who in this talk clearly states that you can never know in advance which solver is best for your problem.

Still, I was wondering if this is the most efficient way. So, rephrasing my main question as more specific (and more tractable) subquestions:

  • Do you think trial and error is the best way to find the optimal solver for a problem?
  • Are there valuable rules of thumb that I can use to select the optimal solver a priori or, at least, select a subset in advance?
  • Are there any references you’d recommend to me about this topic?

Thank you very much.

This is a difficult question to answer objectively for the reasons outlined in Matt’s talk that you’ve cited. I’ll write my opinion here based on my past work as a consumer of computational linear algebra research and not a contributor.

I’ll consider two specific approaches here: direct (sparse factorisation) and iterative (preconditioned Krylov subspace methods), each in the context of the d-dimensional finite element discretisation of the Laplace operator defined on a mesh yielding n degrees of freedom:

  • Direct solution (e.g. mumps, superlu_dist):

    • Complexity \mathcal{O}(n^{\frac{3}{2}}) in 2D and \mathcal{O}(n^{2}) in 3D.
    • Robust. Unless your system is singular, you will arrive at the solution.
    • Precise. The solution of the linear system is computed to machine precision.
    • Fast (for small problems). The constant in the complexity estimate is typically smaller than for iterative solvers, so a small problem will likely be solved faster with a direct solver than with an iterative one.
    • “Black box”. Very few parameters need to be tuned to arrive at a solution.
  • Algebraic multigrid preconditioned Krylov iterative method:

    • Complexity \mathcal{O}(n).
    • The correct algorithm to employ depends on your discretised operator: is it Hermitian? Is it positive-definite, semi-definite, or indefinite?
    • Designing a preconditioner can be difficult. Are there orders-of-magnitude differences in the problem’s material coefficients? How does one capture geometric effects arising in algebraic systems? How best to coarsen and refine the problem for a hierarchy of grids?
    • Convergence rate depends on mesh quality.
    • Many parameters to consider: rate of coarsening, partitioning, which solvers to employ on which levels, the cycle pattern to be employed… (a few of these knobs appear in the sketch after this list).
    • The computed solution is only an approximation of the true solution of the linear system. By solving the linear system inexactly you introduce a new numerical error, in addition to the finite element approximation error. Typically, more computational effort is required for a more accurate solution of the linear system.
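
As a concrete illustration of the two options, here is a hedged sketch using legacy FEniCS (dolfin) with the PETSc backend; the Poisson forms are placeholders, and names such as mumps and amg assume the corresponding packages were built into your PETSc installation.

```python
# Sketch of the two approaches, assuming legacy FEniCS (dolfin) + PETSc.
from dolfin import *

mesh = UnitCubeMesh(16, 16, 16)
V = FunctionSpace(mesh, "Lagrange", 1)
u_trial, v = TrialFunction(V), TestFunction(V)
a = inner(grad(u_trial), grad(v)) * dx
L = Constant(1.0) * v * dx
bc = DirichletBC(V, Constant(0.0), "on_boundary")

# 1) Direct factorisation: essentially a black box (assumes MUMPS is installed).
u_direct = Function(V)
solve(a == L, u_direct, bc, solver_parameters={"linear_solver": "mumps"})

# 2) AMG-preconditioned Krylov method. The Krylov method should match the
#    operator: "cg" for symmetric positive-definite systems (such as this
#    Laplacian), "minres" for symmetric indefinite, "gmres" for nonsymmetric.
A, b = assemble_system(a, L, bc)  # keeps the system symmetric after applying BCs
solver = KrylovSolver("cg", "amg")
solver.parameters["relative_tolerance"] = 1e-8  # accuracy of the inexact solve
solver.parameters["maximum_iterations"] = 1000
solver.parameters["monitor_convergence"] = True

u_iter = Function(V)
solver.solve(A, u_iter.vector(), b)
# Coarsening rate, cycle type, smoothers per level, etc. are further tunable
# through the PETSc/Hypre options database rather than through dolfin itself.
```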

With the pros and cons of the above methods in place, what should we use in modern finite element analysis? Clearly direct solvers have earned their popularity: they are fast and existing packages are easy to use. But there’s a problem…

Although a solver may perform exceptionally well for a problem with n \sim 10^5 degrees of freedom, what if our scientific application demands the fidelity only offered by a problem with n \geq 10^9 degrees of freedom? For these large problems, even if the growth constant is small, the growth rates of \mathcal{O}(n^\frac{3}{2}) and \mathcal{O}(n^2) cannot be considered merely a disadvantage. They ensure that computing the solution of a large system with a direct method is impossible in a reasonable time.
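
To put numbers on this (ignoring the constants): increasing n from 10^5 to 10^9 multiplies the problem size by 10^4, but the work of a direct solve grows by

\frac{(10^9)^{3/2}}{(10^5)^{3/2}} = 10^6 \ \text{(2D)} \qquad \text{and} \qquad \frac{(10^9)^{2}}{(10^5)^{2}} = 10^8 \ \text{(3D)},

so a factorisation that took seconds at n \sim 10^5 would take on the order of weeks to years at n = 10^9, whereas an \mathcal{O}(n) method needs only about 10^4 times the work.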

Therefore I’m not overly concerned with the marginal performance improvements offered by spending hours tweaking iterative/direct solver schemes by trial and error. What is crucially important in modern computational simulations is scaling: one must be very careful that the relationship between problem size and time-to-solution is optimal (i.e. linear). We therefore rely on iterative methods, and are humbled by the work of the very talented numericists who develop them in the literature.

Dear Nate,
Thank you for sharing this excellent piece of information. In a few lines you summarized what I could not find in days of Googling. I hope this can also help others who may feel confused by the complexity of solvers and their behavior.
