NonlinearProblem nan tolerance

Ignatii · January 17, 2025, 1:10pm

Dear community,

I am currently working on Cahn-Hilliard simulations, using the DOLFINx Python demo as the base code. My primary modification involves changing the equation for f.
During the simulation, I encountered an issue with NaN values in the convergence rate at certain steps:

[dolfinx] [info] Newton iteration 1: r (abs) = nan (tol = 1e-10), r (rel) = nan (tol = 1e-06)

Initially, I suspected the error might be caused by negative arguments in the logarithmic function within f. To address this, I attempted the solution described in Square Root and Natural Logarithm not Working. However, even after implementing this fix, the code still failed.

The error consistently occurs at t=210. For reference, I used the latest version of the library available on Conda (v0.9.0). Below, I have attached the code for further investigation.Thank in advance for any help.

Code example:

from petsc4py import PETSc
import dolfinx
from mpi4py import MPI
import numpy as np
import ufl
from basix.ufl import element, mixed_element
from dolfinx import default_real_type, log
from dolfinx.fem import Function, functionspace
from dolfinx.fem.petsc import NonlinearProblem
from dolfinx.mesh import CellType, create_rectangle
from dolfinx.nls.petsc import NewtonSolver
from ufl import dx, grad, inner
import ufl.algebra

log.set_output_file("log.txt")


# system parameters
temp = 483
chi = 0.008756
N1 = 8500
N2 = 4600
phi1_init = 0.2
Rg = 1.13 * 10e-8
grad_coef = (chi/3) ** 0.5
M = 1

# Master parameters:
dt = 10
final_time = 3000
theta = 0.5 
mesh_resolution = 80
domain_size = 20
EPS = 1e-15


dtype = np.float64 # force typing for mesh and finite elements
msh = create_rectangle(MPI.COMM_WORLD, [(0.0, 0.0), (domain_size, domain_size)], [mesh_resolution, mesh_resolution], CellType.triangle, dtype=np.real(dtype(0)).dtype)
P1 = element("Lagrange", msh.basix_cell(), 1, dtype=np.real(dtype(0)).dtype)
ME = functionspace(msh, mixed_element([P1, P1]))

q, v = ufl.TestFunctions(ME)
u = Function(ME)
u0 = Function(ME)

phi1, mu = ufl.split(u)
phi1_0, mu_0 = ufl.split(u0)

u.x.array[:] = 0.0
rng = np.random.default_rng(13)
u.sub(0).interpolate(lambda x: phi1_init + 0.02 * (0.5 - rng.random(x.shape[1])))
u.x.scatter_forward()

phi1 = ufl.variable(phi1)
f = phi1 / N1 * ufl.ln(phi1 + EPS) + (1 - phi1) / N2 * ufl.ln(1 - phi1 + EPS) + chi * phi1 * (1 - phi1)
dfdc = ufl.diff(f, phi1)
mu_mid = (1.0 - theta) * mu_0 + theta * mu

# Weak statement of the equations
F0 = inner(phi1, q) * dx - inner(phi1_0, q) * dx + dt * M * inner(grad(mu_mid), grad(q)) * dx
F1 = inner(mu, v) * dx - inner(dfdc, v) * dx - grad_coef ** 2 * inner(grad(phi1), grad(v)) * dx
F = F0 + F1

# Create nonlinear problem and Newton solver
problem = NonlinearProblem(F, u)
solver = NewtonSolver(MPI.COMM_WORLD, problem)
solver.convergence_criterion = "residual"
solver.rtol = 1e-6
solver.max_it = 50
solver.error_on_nonconvergence = True
solver.relaxation_parameter = 1

ksp = solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "preonly"
opts[f"{option_prefix}pc_type"] = "lu"
ksp.setFromOptions()

# Step in time
t = 0
V0, dofs = ME.sub(0).collapse()
phi1 = u.sub(0)
u0.x.array[:] = u.x.array
log.set_log_level(log.LogLevel.INFO)
while t < final_time:
    t += dt
    r = solver.solve(u)
    print(f"t' {t}: num iterations: {r[0]}")
    u0.x.array[:] = u.x.array

dokken · January 17, 2025, 2:11pm

Have you looked at the solution of the problem prior to the crash? How does the evolution look like?

Ignatii · January 17, 2025, 2:17pm

At t=200 it looks very ordinary.
ParaView:

Ignatii · January 17, 2025, 2:30pm

Also, if I set the parameters N1 and N2 to 500, the simulation seems to run fine. Is it possible that the error occurs due to too small values of function f caused by large values of N1 and N2?

dokken · January 17, 2025, 2:43pm

As you have changed f (and maybe something more), I don’t know if the method is stable or not.
Do you have your f from a specific paper?
Are there other modifications that you have made.

Ignatii · January 17, 2025, 3:23pm

This is the Flory-Huggins equation for calculating thermodynamic activity. It is highly common in polymer chemistry and there are many articles in peer-reviewed journals on the joint use of the Cahn-Hilliard and Flory-Huggins equation. For example here the authors apply the same equation for f using FEniCS version 2019.1.0. If I use similar inputs to those in the article (low N values), everything works fine.

I also tried using a custom newton solver following the example in the documentation, but only introduced a hard check for non-negativity of new values for both fields (mu and phi) and a minimum allowed value for phi. This simulation works and does not crash (with high N), but I cannot achieve high convergence values.

dokken · January 17, 2025, 3:48pm

I tried to further debug your problem with the solver options:

ksp = solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "preonly"
opts[f"{option_prefix}pc_type"] = "lu"
opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
opts[f"{option_prefix}ksp_error_if_not_converged"] = True

ksp.setFromOptions()

which crashes at the same point (I also tested, “superlu” and “petsc” as “pc_factor_mat_solver_type”), with no success. To me it seems like the problem becomes singular for the parameters that you have used above. Maybe someone else has a good idea on how to proceed further.

Ignatii · January 17, 2025, 5:31pm

Thank you for your help.
In case of using a custom solver the error is avoided but convergence quickly becomes around 1E-3 after t=200. I realize this is a very rough approach but I think it confirms that the error is with negative phi field values. I might be wrong as I am not experienced enough in FEM. The code for the solver completely repeats the example from the documentation except for the part of value checking (lines 121-127). I also tried dynamic step for newton solver (commented lines 129-130) It improved the situation a bit but didn’t fix it.

Code for custom newton:

from petsc4py import PETSc
import dolfinx
from mpi4py import MPI
import numpy as np
import ufl
from basix.ufl import element, mixed_element
from dolfinx import default_real_type, log, plot
from dolfinx.fem import Function, functionspace
from dolfinx.fem.petsc import NonlinearProblem
from dolfinx.mesh import CellType, create_unit_square, create_rectangle
from dolfinx.nls.petsc import NewtonSolver
from ufl import dx, grad, inner
import ufl.algebra

log.set_output_file("log.txt")


# system parameters
chi = 0.008756
N1 = 8500
N2 = 4600
phi1_init = 0.2
Rg = 1.13 * 10e-8
grad_coef = (chi/3) ** 0.5
M = 1
grad_coef = (chi/3) ** 0.5

# Master parameters:
dt = 10
final_time = 3000
theta = 0.5 
mesh_resolution = 80
domain_size = 20
EPS = 1e-15
decay = 0.002
min_val = 1e-15
max_val = 0.999999999999


#main code
dtype = np.float64
msh = create_rectangle(MPI.COMM_WORLD, [(0.0, 0.0), (domain_size, domain_size)], [mesh_resolution, mesh_resolution], CellType.triangle, dtype=np.real(dtype(0)).dtype)
P1 = element("Lagrange", msh.basix_cell(), 1, dtype=np.real(dtype(0)).dtype)
ME = functionspace(msh, mixed_element([P1, P1]))

# Test functions
q, v = ufl.TestFunctions(ME) # q is for Phi, v is for Mu

u = Function(ME)  # current solution
u0 = Function(ME)  # solution from previous converged step

# Split mixed functions
phi1, mu = ufl.split(u)
phi1_0, mu_0 = ufl.split(u0)

u.x.array[:] = 0.0

rng = np.random.default_rng(13)
u.sub(0).interpolate(lambda x: phi1_init + 0.02 * (0.5 - rng.random(x.shape[1])))
u.x.scatter_forward()

# Compute the chemical potential df/dc
phi1 = ufl.variable(phi1)
f = phi1 / N1 * ufl.ln(phi1 + EPS) + (1 - phi1) / N2 * ufl.ln(1 - phi1 + EPS) + chi * phi1 * (1 - phi1)
dfdc = ufl.diff(f, phi1)
mu_mid = (1.0 - theta) * mu_0 + theta * mu

# Weak statement of the equations
F0 = inner(phi1, q) * dx - inner(phi1_0, q) * dx + dt * M * inner(grad(mu_mid), grad(q)) * dx
F1 = inner(mu, v) * dx - inner(dfdc, v) * dx - grad_coef ** 2 * inner(grad(phi1), grad(v)) * dx
F = F0 + F1

residual = dolfinx.fem.form(F)
J = ufl.derivative(F, u)
jacobian = dolfinx.fem.form(J)

A = dolfinx.fem.petsc.create_matrix(jacobian)
L = dolfinx.fem.petsc.create_vector(residual)

#solver
solver = PETSc.KSP().create(msh.comm)
solver.setOperators(A)


# Get the sub-space for c and the corresponding dofs in the mixed space
# vector
V0, dofs = ME.sub(0).collapse()

phi1 = u.sub(0)
mu = u.sub(1)
u0.x.array[:] = u.x.array
t = 0
du = dolfinx.fem.Function(ME)
while t < final_time:
    it = 0
    h = 1
    for it in range(50):
        F0 = inner(phi1, q) * dx - inner(phi1_0, q) * dx + dt * M * inner(grad(mu_mid), grad(q)) * dx
        F1 = inner(mu, v) * dx - inner(dfdc, v) * dx - grad_coef ** 2 * inner(grad(phi1), grad(v)) * dx
        F = F0 + F1
        
        # Assemble Jacobian and residual
        with L.localForm() as loc_L:
            loc_L.set(0)
        A.zeroEntries()
        dolfinx.fem.petsc.assemble_matrix(A, jacobian)
        A.assemble()
        dolfinx.fem.petsc.assemble_vector(L, residual)
        L.ghostUpdate(addv=PETSc.InsertMode.ADD_VALUES, mode=PETSc.ScatterMode.REVERSE)

        # Scale residual by -1
        L.scale(-1)
        L.ghostUpdate(addv=PETSc.InsertMode.INSERT_VALUES, mode=PETSc.ScatterMode.FORWARD)

        # Solve linear problem
        solver.solve(L, du.x.petsc_vec)
        du.x.scatter_forward()

        # new values check
        for i in range(len(u.x.array[:])):
            if (u.x.array[:][i] + du.x.array[i]) <= min_val:
                du.x.array[i] = 0
                u.x.array[:][i] = min_val
            if (u.x.array[:][i] + du.x.array[i]) >= max_val:
                du.x.array[i] = 0
                u.x.array[:][i] = max_val

        # if it > 5:
        #     h /= (1 + decay * it)     
        u.x.array[:] += h * du.x.array

        # Compute norm of update
        correction_norm = du.x.petsc_vec.norm(0)
        
        if correction_norm < 1e-10:
            break
    
    t += dt
    print(f"t' {t}: num iterations: {it} Correction norm: {correction_norm}")
    
    u0.x.array[:] = u.x.array

Ignatii · January 21, 2025, 1:16pm

Little update:
I tried to implement a fix from this topic for phi1 in [0, 1]:

f = ufl.conditional(
    ufl.gt(phi1, 0), ufl.conditional(
        ufl.gt(phi1, 1), 0, phi1 / N1 * ufl.ln(phi1) + (1 - phi1) / N2 * ufl.ln(1 - phi1 ) + chi * phi1 * (1 - phi1)
        ), 
    0
    )

After this fix it stopped crushing, but now I have large negative values in phi1:

After that I tried to make a fix for phi:

phi1 = ufl.variable(
    ufl.conditional(ufl.gt(phi1, min_val), ufl.conditional(ufl.gt(phi1, 1), max_val, phi1), min_val)
    )

With conditional present at the same time for phi1 and f, the crashes came up again. So by now the only working solution is manually check values:

        for i in range(len(u.x.array[:])):
            if (u.x.array[:][i] + du.x.array[i]) <= EPS:
                du.x.array[i] = 0
                u.x.array[:][i] = EPS
            if (u.x.array[:][i] + du.x.array[i]) >= max_val:
                du.x.array[i] = 0
                u.x.array[:][i] = max_val

But in my opinion ignoring updates for finite elements looks like violations of the mass conservation law.

Stein · January 21, 2025, 4:23pm

I agree that somehow capping the values will break phase conservation.

Have you tried decreasing the timestep when the Newton solver has issues? The Cahn-Hilliard equations exhibit three different temporal regimes as it forms the interfaces (medium, fast, slow). Given the non-linearity in f there is no guarantee that the time-stepper is stable, especially in that ‘fast’ region.

A second thing, that is a mathematical curiosity but actually important, is that you’re pairing the wrong test and trial functions. The test-function associated with the phase-transport equation should actually be the one in the space of the chemical potential variable, and the testfunction associated with the ‘auxiliary equations’ (i.e., the definition of the chemical potenial equation) should be the one associated with with the phase-variable.

This is important in the proof of the energy-dissipation characteristics of the equations (= stability), and, in particular, does have an effect on the discrete system of equations through the imposition of boundary conditions (meaning to say it is not something that can be ignored).

Looking at the dolfinx tutorial, it seems like the authors were also not aware of this

Ignatii · January 21, 2025, 10:00pm

Thanks for your answer.

Looks like dynamic or smaller time step should help. As expected, the error decreases with decreasing time step. Unfortunately I can’t test on really low values as we have the calculation server under maintenance right now. In addition, the selection of the optimal ratio of grid resolution to domainsize for this problem also affects convergence. Maybe the convergence will also increase with the use of finite elements of order 2, but again there are no computational possibilities to check this in practice on a good mesh. Since very poor phase miscibility (phi ->0 and phi ->1) is expected for this problem, second order finite elements may be helpful.
You mean satisfying the Babuška–Brezzi criterion? I guess I didn’t see it when I checked the equation. Based on a small test it seems to improve the stability in my case somewhat.

So final working solution (but with bad convergence):

from petsc4py import PETSc
import dolfinx
from mpi4py import MPI
import numpy as np
import ufl
from basix.ufl import element, mixed_element
from dolfinx import default_real_type, log
from dolfinx.fem import Function, functionspace
from dolfinx.fem.petsc import NonlinearProblem
from dolfinx.mesh import CellType, create_rectangle
from dolfinx.nls.petsc import NewtonSolver
from ufl import dx, grad, inner
import ufl.algebra

log.set_output_file("log.txt")


# system parameters
temp = 483
chi = 0.008756
N1 = 8500
N2 = 4600
phi1_init = 0.2
Rg = 1.13 * 10e-8
grad_coef = (chi/3) ** 0.5
M = 1

# Master parameters:
dt = 10
final_time = 3000
theta = 0.5 
mesh_resolution = 80
domain_size = 20
EPS = 1e-15


dtype = np.float64 # force typing for mesh and finite elements
msh = create_rectangle(MPI.COMM_WORLD, [(0.0, 0.0), (domain_size, domain_size)], [mesh_resolution, mesh_resolution], CellType.triangle, dtype=np.real(dtype(0)).dtype)
P1 = element("Lagrange", msh.basix_cell(), 1, dtype=np.real(dtype(0)).dtype)
ME = functionspace(msh, mixed_element([P1, P1]))

q, v = ufl.TestFunctions(ME)
u = Function(ME)
u0 = Function(ME)

phi1, mu = ufl.split(u)
phi1_0, mu_0 = ufl.split(u0)

u.x.array[:] = 0.0
rng = np.random.default_rng(13)
u.sub(0).interpolate(lambda x: phi1_init + 0.02 * (0.5 - rng.random(x.shape[1])))
u.x.scatter_forward()

phi1 = ufl.variable(phi1)
f = ufl.conditional(
    ufl.gt(phi1, 0), ufl.conditional(
        ufl.gt(phi1, 1), 0, phi1 / N1 * ufl.ln(phi1) + (1 - phi1) / N2 * ufl.ln(1 - phi1 ) + chi * phi1 * (1 - phi1)
        ), 
    0
    )

dfdc = ufl.diff(f, phi1)
mu_mid = (1.0 - theta) * mu_0 + theta * mu

# Weak statement of the equations
F0 = inner(phi1, v) * dx - inner(phi1_0, v) * dx + dt * M * inner(grad(mu_mid), grad(v)) * dx
F1 = inner(mu, q) * dx - inner(dfdc, q) * dx - grad_coef ** 2 * inner(grad(phi1), grad(q)) * dx
F = F0 + F1

# Create nonlinear problem and Newton solver
problem = NonlinearProblem(F, u)
solver = NewtonSolver(MPI.COMM_WORLD, problem)
solver.convergence_criterion = "residual"
solver.rtol = 1e-6
solver.max_it = 50
solver.error_on_nonconvergence = False
solver.relaxation_parameter = 1

ksp = solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "preonly"
opts[f"{option_prefix}pc_type"] = "lu"
ksp.setFromOptions()

# Step in time
t = 0
V0, dofs = ME.sub(0).collapse()
phi1 = u.sub(0)
u0.x.array[:] = u.x.array
log.set_log_level(log.LogLevel.INFO)
while t < final_time:
    t += dt
    r = solver.solve(u)
    print(f"t' {t}: num iterations: {r[0]}")
    u0.x.array[:] = u.x.array

Good luck to researchers in the field of polymer chemistry. I don’t know if the topics are indexed here, but I will try to add keywords for search: fenics, cahn hilliard, flory huggins, polymer.

Stein · January 22, 2025, 8:14am

2.: Maybe this is also important for inf-sup stability, but that’s not quite what I meant. In terms of proving stability for non-linear PDEs, an energy-dissipation is typically the best you can do. To prove this for Cahn-Hilliard, you would need to pick q=d/dt phi and v=mu (in the notation of your last code snippet). That is only possible if q lives in the discrete space of phi and v in the discrete space of mu.
If you’re interested, then take a look at the appendix in Redirecting . Specifically section A2 (equations A9-A11).

Topic		Replies	Views
Unexpected Error/Deviation from Dolfinx simulation General	2	272	July 14, 2023
Newton solver residual returns NaN after 2 iterations Legacy DOLFIN	4	344	February 16, 2024
Residual is nan General dolfinx	2	45	March 6, 2025
Writing XDMF file leads to inf & nan values dolfinx	4	281	March 6, 2024
Solving non-newtonian fluid	2	1359	May 8, 2019

NonlinearProblem nan tolerance

Related topics