I’m currently solving the Stokes problem and observing a convergence rate of around 3 in the L2 norm for the velocity. Given that I’m using P2 for velocity and P1 for pressure, what is the optimal convergence rate for the Stokes problem? Shouldn’t I expect a rate closer to 2, given the use of P2 for velocity?

Given a Stokes problem with a sufficiently smooth analytical solution and material coefficients, discretised on a mesh of size $h$, the standard Taylor-Hood element of degree $\ell$ in the velocity approximation and $\ell-1$ in the pressure approximation would yield expected approximation error convergence rates of:

$$
\Vert u - u_h \Vert_{L^2} = \mathcal{O}(h^{\ell+1}), \qquad
\Vert u - u_h \Vert_{H^1} = \mathcal{O}(h^{\ell}), \qquad
\Vert p - p_h \Vert_{L^2} = \mathcal{O}(h^{\ell}).
$$
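
For the P2-P1 pair in the question, $\ell = 2$, so a rate of about 3 for the velocity in the L2 norm is the expected optimal rate. The rate of 2 applies to the velocity in the H1 norm and to the pressure in the L2 norm. If it helps, here is a minimal sketch of how one might estimate observed rates from errors on successive meshes. The function name `observed_rates` and the data are hypothetical: the errors are synthetic values behaving like $C h^3$, not results from an actual Stokes solve.

```python
import math


def observed_rates(hs, errors):
    """Estimate the convergence rate between each pair of consecutive meshes.

    Assumes error ~ C * h^r, so r = log(e_coarse / e_fine) / log(h_coarse / h_fine).
    """
    rates = []
    for i in range(len(hs) - 1):
        h_coarse, h_fine = hs[i], hs[i + 1]
        e_coarse, e_fine = errors[i], errors[i + 1]
        rates.append(math.log(e_coarse / e_fine) / math.log(h_coarse / h_fine))
    return rates


# Hypothetical mesh sizes and synthetic L2 velocity errors of the form C * h^3.
hs = [1 / 8, 1 / 16, 1 / 32, 1 / 64]
errors = [0.5 * h**3 for h in hs]

rates = observed_rates(hs, errors)
print(rates)  # each entry should be approximately 3 for this synthetic data
```

In practice you would replace `errors` with the norms of `u - u_h` computed on each mesh. The estimated rates only approach the theoretical values once the mesh is fine enough to be in the asymptotic regime.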

Thank you for your response. Do you have any articles or resources that can provide detailed explanations regarding why the convergence rates should be close to 2 or 3, as you mentioned?

As @dokken has already mentioned, this result follows from standard a priori error analysis. You should be able to find it in any standard introductory FEM text or set of lecture notes.