For scalable nonlinear solvers, you could take a look at this paper as a starting point. In general it can pin down to be highly specific to the problem you are solving.
Newton combined with a proper choice of linear solver (at each Newton-iteration) and a preconditioner should usually give you out of the box good performance for many. PETScSNES is also a good place to start looking out for solvers available directly through PETSc. You could also take a look at Non linear solver to program the solver by yourself.
https://github.com/FEniCS/dolfinx/blob/master/python/test/unit/nls/test_newton.py and a corresponding dolfin-equivalent Set krylov linear solver paramters in newton solver are also good pointers on how to setup your problem/solver and preconditioner.
My personal experience: Newton with a line search and a good preconditioner (e.g., AMG for hyperelasticity) should be pretty robust and scale well for larger problems. You can use CG as the Krylov solver if your system is SPD. Direct solvers (even MUMPS, SUPERLU_DIST) will run out of memory for very large problems, and therefore unsuitable. For many problems, however, Newton is not a good choice and you have to go with a gradient-descent approach which has the disadvantage of being horribly slow.
Yet another alternative for Newton: http://www.lmm.jussieu.fr/~corrado/articles/cm-a16-FarMau.pdf, for variational phase-field, which is also a highly nonlinear and non-convex problem.