Hi everyone,
I am developing a FEM solution of the parabolic 2D two-temperature model (TTM) composed of a system of second-order non-linear coupled PDEs (cf. eq.5 here with an additional lattice heat flux term under Fourier’s law).
The Lipschitz continuous domain is \Omega = (0, x_w) \times (0, z_t) \subset \mathbb{R}_+^2 and \partial \Omega = \Gamma_D with x_w and z_t denoting the film’s width and thickness, respectively.
The key challenge is that C_e, G, and k_e are strongly dependent on the electron temperature T_e and thereafter we are tasked to handle these non-linearities. Specifically, C_e and G are fitted by a piecewise polynomial, which in our temperature range, C_e is linear and G is constant. k_e(T_e, T_l) = k_{e0}\cdot(B_l T_e)/(A_eT_e^2 + B_lT_l).
Note that the volumetric heat source, S = \alpha(1-R)I_{ulp}, is also temperature dependent from the absorption \alpha(T_e,T_l), and reflectivity R(T_e,T_l), which follow the Drude-Lorentz model. I_{ulp}(x, z, t; T_e, T_l) = G(t)\exp\{-(x^2 + z^2)/R_0^2\}\exp\{-\alpha(T_e,T_l)(z-z_s)\} is the ultrashot laser pulse (ULP) intensity expressed as a multiplication of a temporal Gaussian envelope, spatial Gaussian kernel and Beer-Lambert decay. I describe the simplified set-up here to have a grasp on the problem and stiffness/instability issues that might arise.
In the dimensionless notation we have (\tau = t/T, \ \vec{\xi} = (\xi_x, \xi_z) = (x/L_x, z/L_z), \ S^* = s):
where \Omega \mapsto (0, x_w/L_x) \times (0, z_t/L_z) \equiv (0, \xi_w) \times (0, \xi_t).
Below I provide the variational formulation/problem for completeness and to ensure its validity.
Find (\theta_e, \theta_l) \in V \times V such that for all \vec{v} = (v_e,v_l) \in V'^2,
- Electron residual R_e(\theta_e, \theta_l; v_e) equals zero:
- Lattice residual R_l(\theta_e, \theta_l; v_l) equals zero:
The implementation follows highly the Cahn-Hilliard equation example due to its commonalities with the current system.
I use
fenics-dolfinx=0.8.0
which supports complex PETSc (for future coupling with Maxwell-Heaviside equations),petsc4py=3.21.1=complex
,petsc=3.21.6=complex
. Pyvista is installed (pyvista=0.44.1
,pyvistaqt=0.11.2
) but not used here.
Below are the necessary reproducible parts of the code. Assume the retrieved physical quantities and scaling are properly obtained and applied/handled. I fallback to the case of C_e = \gamma T_e, k_e = k_{e0}, and G= G_0 for debugging (constant diffusivity), with the constants provided in code. This version is called linear TTM in literature, despite the heat source holding non-linearities.
# necessary imports here. Note that PETSc.ScalarType is numpy.complex128
# arguments of ScalingFactors subclassed by TTM2DNonLinear
L_scale=1e-6, # 1000 nm, characteristic length
t_scale=1e-12, # 1 ps, characteristic time
Te_scale=1000.0, # 1000 K, characteristic temperature
S_scale=1e16 # 1e16 W/m^3, characteristic source
# let object sim_2d_nonlinear, instantiated by TTM2DNonLinear
Ce_func = sim_2d_nonlinear.Ce_func
G_func = sim_2d_nonlinear.G_func
ke_func = sim_2d_nonlinear.ke_func
S_func = sim_2d_nonlinear.S_func
xi_w = width/L_scale
xi_t = thickness/L_scale
Nx, Nz = 101, 101
msh = create_rectangle(MPI.COMM_WORLD, ((0,0), (xi_w, xi_t)), (Nx, Nz), CellType.triangle)
P1 = element("Lagrange", msh.basix_cell(), 1)
ME = functionspace(msh, mixed_element([P1, P1]))
# ---------- test / trial / solution functions ----------
(ve, vl) = ufl.TestFunctions(ME)
w = Function(ME) # current (theta_e, theta_l)
w0 = Function(ME) # previous timestep (theta_e^n, theta_l^n)
theta_e, theta_l = ufl.split(w)
theta_e0, theta_l0 = ufl.split(w0)
# initialize previous time step to scaled initial temperature
theta_init = sim_2d_nonlinear.sim_params.initial_temp / sim_2d_nonlinear.scaling_factors.Te_scale
w0.sub(0).interpolate(lambda x: np.full(x.shape[1], theta_init))
w0.sub(1).interpolate(lambda x: np.full(x.shape[1], theta_init))
w0.x.scatter_forward()
# ---------- scalar FE space for coefficient fields ----------
V = functionspace(msh, P1)
Ce_fun = Function(V) # C_e^*(theta_e)
ke_fun = Function(V) # k_e^*(theta_e, theta_l)
G_fun = Function(V) # G^*(theta_e)
S_fun = Function(V) # s(...)
# ---- Extraction + coefficient evaluation + assignment ----
# Use the collapsed dof maps
V0, dofs_e = ME.sub(0).collapse()
V1, dofs_l = ME.sub(1).collapse()
# scalar DOF count from V, coords shape: (n_scalar_dofs, msh.geometry.dim)
coords = np.asarray(V.tabulate_dof_coordinates())
if coords.ndim == 1:
coords = coords.reshape((-1, msh.geometry.dim))
if coords.shape[1] > msh.geometry.dim: # this holds in our case
coords_temp = coords[:, :msh.geometry.dim]
n_scalar_dofs = coords.shape[0]
# subfunction views
theta_e_sub = w0.sub(0)
theta_l_sub = w0.sub(1)
# get_scalar_nodal_array() routine that returns a 1D numpy array of scalar nodal values
# aligned with V ordering (length n_scalar_dofs).
# Handles two cases:
# 1) subfun.x.array already has length n_scalar_dofs -> return that.
# 2) subfun.x.array is mixed-length -> extract using dofs_map from mixed_array.
# extract scalar-ordered nodal temperature arrays
#(both shape (10404,) as n_scalar_dofs = 10404)
theta_e_vals = get_scalar_nodal_array(theta_e_sub, w0.x.array, dofs_e, name="theta_e")
theta_l_vals = get_scalar_nodal_array(theta_l_sub, w0.x.array, dofs_l, name="theta_l")
# real parts for coefficient evaluation
theta_e_real = np.real(theta_e_vals)
theta_l_real = np.real(theta_l_vals)
# coords aligned with scalar V ordering
coords_scaled = (coords[:, 0] / sim_2d_nonlinear.scaling_factors.L_scale,
coords[:, 1] / sim_2d_nonlinear.scaling_factors.L_scale)
### FALLBACK TO LINEAR IMPLEMENTATION ###
# linear Ce (Ce = gamma * Te) constant ke and G
gamma = 68.0 # J / m^3 / K^2
ke0 = 318.0 # W / (m K)
G0 = 2.1e16 # W / (m^3 K)
# theta_e_vals, theta_l_vals are the dimensionless nodal arrays
scaler = sim_2d_nonlinear.scaler
Te_phys = scaler.unscale_T(theta_e_vals)
Tl_phys = scaler.unscale_T(theta_l_vals)
Ce_phys = gamma * Te_phys
Ce_vals = np.asarray(scaler.C(Ce_phys), dtype=PETSc.ScalarType).reshape(-1)
ke_vals = np.full(n_scalar_dofs, scaler.k(ke0), dtype=PETSc.ScalarType).reshape(-1)
G_vals = np.full(n_scalar_dofs, scaler.G(G0), dtype=PETSc.ScalarType).reshape(-1)
t_scaled = 0.0
S_vals = np.asarray(sim_2d_nonlinear.S_func(coords_scaled, t_scaled,
theta_e_vals, theta_l_vals), dtype=PETSc.ScalarType).reshape(-1)
# Assign into scalar coefficient Functions
Ce_fun.x.array[:] = Ce_vals
Ce_fun.x.scatter_forward()
G_fun.x.array[:] = G_vals
G_fun.x.scatter_forward()
ke_fun.x.array[:] = ke_vals
ke_fun.x.scatter_forward()
S_fun.x.array[:] = S_vals
S_fun.x.scatter_forward()
In the \mu-weighted mid-values build, I assumed backward Euler for better stability.
mu = 1.0
theta_e_mid = (1.0 - mu) * theta_e0 + mu * theta_e
theta_l_mid = (1.0 - mu) * theta_l0 + mu * theta_l
invCe = PETSc.ScalarType(1.0) / Ce_fun # UFL Function
grad_invCe = ufl.grad(invCe) # ∇(1/Ce^*)
Nt = 1_001
T_phys = float(sim_2d_nonlinear.sim_params.evolution_time)
dt_phys = T_phys / (Nt - 1)
dt = scaler.t(dt_phys)
Here, I have strong doubts on the weak residuals’ construction. The combination with inner
I tried in previous versions didn’t work.
# ---------- lattice diffusion coefficient (constant scaled k_l^*/C_l^*) ----------
props = sim_2d_nonlinear.material_handler.get_scaled_properties(sim_2d_nonlinear.scaler)
kl_over_Cl = float(props["kl_star"] / props["Cl_star"])
Cl = PETSc.ScalarType(props["Cl_star"])
ve_c, vl_c = ufl.conj(ve), ufl.conj(vl)
# Electron residual R_e (test function ve)
F_e = (
(theta_e - theta_e0) / dt * ve_c * ufl.dx
+ (ke_fun * invCe) * ufl.dot(ufl.grad(theta_e_mid), ufl.grad(ve_c)) * ufl.dx
+ ke_fun * ve_c * ufl.dot(ufl.grad(theta_e_mid), grad_invCe) * ufl.dx
+ (G_fun * invCe) * (theta_e_mid - theta_l_mid) * ve_c * ufl.dx
- (S_fun * invCe) * ve_c * ufl.dx
)
# Lattice residual R_l (test function vl)
F_l = (
(theta_l - theta_l0) / dt * vl_c * ufl.dx
+ kl_over_Cl * ufl.dot(ufl.grad(theta_l_mid), ufl.grad(vl_c)) * ufl.dx
- (G_fun / Cl) * (theta_e_mid - theta_l_mid) * vl_c * ufl.dx
)
F = F_e + F_l
# locate dofs geometrically with tolerance 1e-12 for safety.
def on_boundary(x: np.array) -> bool: pass
# Boundary value (real, assuming dolfinx will convert to complex if needed)
theta_D = float(sim_2d_nonlinear.sim_params.initial_temp / sim_2d_nonlinear.scaling_factors.Te_scale)
# function interpolation method for BCs
g_e = Function(V0)
g_l = Function(V1)
# Interpolate the constant boundary value
g_e.interpolate(lambda x: np.full(x.shape[1], theta_D))
g_l.interpolate(lambda x: np.full(x.shape[1], theta_D))
# Locate DOFs using only the collapsed spaces (avoid mesh mismatch)
dofs_e_collapsed = locate_dofs_geometrical(V0, on_boundary)
dofs_l_collapsed = locate_dofs_geometrical(V1, on_boundary)
# Define boundary conditions
bc_e = dirichletbc(g_e, dofs_e_collapsed)
bc_l = dirichletbc(g_l, dofs_l_collapsed)
bcs = [bc_e, bc_l]
Configuration of linear (KSP) solver options and MUMPS selection.
sys, opts = PETSc.Sys(), PETSc.Options()
# Global KSP/PC options
opts["ksp_type"], opts["pc_type"] = "preonly", "lu"
opts["pc_factor_mat_solver_type"] = "mumps"
opts["pc_factor_mumps_icntl_14"] = "50"
Configuring SNES options in PETSc options DB (optional helper wrapper). Alternative snes_linesearch_type
tested: "l2"
, "basic"
.
Also tested with line-search control opts["snes_linesearch_maxstep"] = 1.0
and trust-region method ("newtontr"
to snes_type
with snes_tr_tol
of "1e-12"
).
def configure_snes_options(prefix="ttm_"):
opts = PETSc.Options()
# SNES nonlinear solver options
opts[f"{prefix}snes_type"] = "newtonls"
opts["snes_linesearch_type"] = "bt"
opts[f"{prefix}snes_max_it"] = "50"
opts[f"{prefix}snes_atol"] = "1e-10"
opts[f"{prefix}snes_rtol"] = "0"
stol_val = str(np.sqrt(np.finfo(default_real_type).eps) * 1e-2)
opts[f"{prefix}snes_stol"] = stol_val
# Monitoring options
opts[f"{prefix}snes_monitor"] = ""
opts[f"{prefix}snes_converged_reason"] = ""
return opts
_ = configure_snes_options("ttm_")
# Build Jacobian, create NonlinearProblem and NewtonSolver
dw = ufl.TrialFunction(ME)
J = ufl.derivative(F, w, dw)
problem = NonlinearProblem(F, w, bcs, J)
solver = NewtonSolver(MPI.COMM_WORLD, problem)
# --- Enhanced SNES configuration ---
solver.convergence_criterion = "incremental"
# relative tol on increment (stol-like)
solver.rtol = np.sqrt(np.finfo(default_real_type).eps) * 1e-2
solver.atol = 1e-10 # Absolute tolerance
solver.max_it = 50 # Maximum iterations
solver.report = True
# Apply PETSc options to the KSP
ksp = solver.krylov_solver
ksp.setFromOptions()
# Helper: extract scalar-ordered nodal arrays from mixed w.
# Used on the nonlinear case.
def mixed_to_scalar_arrays(mixed_vec, dofs_map, n_scalar_dofs):
"""Return a numpy array length n_scalar_dofs containing values for the collapsed scalar space ordering."""
pass
Finally, the TTM time-stepping loop handling the simplified TTM. I kept the comments showing how I am approaching the nonlinear problem.
# create reusable buffers
G_buf = np.empty(n_scalar_dofs, dtype=PETSc.ScalarType)
ke_buf = np.empty(n_scalar_dofs, dtype=PETSc.ScalarType)
Ce_buf = np.empty(n_scalar_dofs, dtype=PETSc.ScalarType)
S_buf = np.empty(n_scalar_dofs, dtype=PETSc.ScalarType)
# Assign into dolfinx Functions (initial state)
G_buf[:] = np.asarray(G_vals, dtype=PETSc.ScalarType)
G_fun.x.array[:] = G_buf
G_fun.x.scatter_forward()
ke_buf[:] = np.asarray(ke_vals, dtype=PETSc.ScalarType)
ke_fun.x.array[:] = ke_buf
ke_fun.x.scatter_forward()
# Time-stepping loop
t_phys = 0.0
t_scaled = sim_2d_nonlinear.scaler.t(t_phys)
step = 0
start_time = time.time()
I evaluate Ce
implicitly (at w0
) and treat it as a constant coefficient during the current Newton solve to keep the general variational form and Jacobian unchanged for the nonlinear case.
for step in range(1, Nt):
# advance physical time
t_phys = step * dt_phys
t_scaled = sim_2d_nonlinear.scaler.t(t_phys)
# initial guess = previous converged solution
w.x.array[:] = w0.x.array[:]
w.x.scatter_forward()
# -------------------------
# Semi-implicit: update Ce from previous timestep (w0)
# -------------------------
theta_e_prev = mixed_to_scalar_arrays(w0.x.array, dofs_e, n_scalar_dofs)
Te_phys_prev = scaler.unscale_T(theta_e_prev)
Ce_phys_vals = gamma * Te_phys_prev
Ce_buf[:] = np.asarray(scaler.C(Ce_phys_vals), dtype=PETSc.ScalarType)
# push to dolfinx Function
Ce_fun.x.array[:] = Ce_buf; Ce_fun.x.scatter_forward()
# Reassign constants ke and G to Functions:
G_fun.x.array[:] = G_buf; G_fun.x.scatter_forward()
ke_fun.x.array[:] = ke_buf; ke_fun.x.scatter_forward()
# -------------------------
# NONLINEAR CASE
# # compute theta_e_vals, theta_l_vals (real numpy arrays)
# theta_e_vals = mixed_to_scalar_arrays(w.x.array, dofs_e, n_scalar_dofs)
# theta_l_vals = mixed_to_scalar_arrays(w.x.array, dofs_l, n_scalar_dofs)
# # compute coefficient values as float64 arrays (vectorized)
# Ce_vals_f = np.asarray(sim_2d_nonlinear.Ce_func(theta_e_vals), dtype=float)
# G_vals_f = np.asarray(sim_2d_nonlinear.G_func(theta_e_vals), dtype=float)
# ke_vals_f = np.asarray(sim_2d_nonlinear.ke_func(theta_e_vals, theta_l_vals), dtype=float)
S_vals_f = np.asarray(sim_2d_nonlinear.S_func(coords_scaled, t_scaled, theta_e_vals, theta_l_vals), dtype=float)
# # in-place cast into PETSc dtype buffers
# Ce_buf[:] = Ce_vals_f.astype(PETSc.ScalarType)
# G_buf[:] = G_vals_f.astype(PETSc.ScalarType)
# ke_buf[:] = ke_vals_f.astype(PETSc.ScalarType)
S_buf[:] = S_vals_f.astype(PETSc.ScalarType)
# # assign into dolfinx Functions and scatter once each
# Ce_fun.x.array[:] = Ce_buf
# Ce_fun.x.scatter_forward()
# G_fun.x.array[:] = G_buf
# G_fun.x.scatter_forward()
# ke_fun.x.array[:] = ke_buf
# ke_fun.x.scatter_forward()
S_fun.x.array[:] = S_buf; S_fun.x.scatter_forward()
# ---- Solve nonlinear system for this time-step ----
try:
num_it, converged = solver.solve(w)
except Exception as e:
print(f"Newton solver threw an exception at step {step}: {e}")
converged = False
num_it = -1
if not converged:
print(f"WARNING: Newton did not converge at step {step}, time {t_phys:.3e}. Aborting time stepping.")
break
print(f"Step {step:4d}/{Nt-1:4d} t={t_phys:.3e} Newton its={num_it}")
# ---- Update previous solution and write outputs ----
w0.x.array[:] = w.x.array[:]
w0.x.scatter_forward()
end_time = time.time()
print(f"Time-stepping finished. Steps run: {step}. Wall time: {end_time-start_time:.2f}s")
Notice that I have uncommented the heat source term S_*
, since it is not constant.
The indicative results from the log file are presented below for the current settings:
(iter:1) Newton iteration 2: r (abs) = 2.51453e-23 (tol = 1e-10) r (rel) = 1(tol = 1.49012e-10)
...
(iter:37) Newton iteration 2: r (abs) = 7.16731e-17 (...) r (rel) = 1.00101(...)
(iter:38) Newton iteration 2: r (abs) = 1.43472e-13 (...) r (rel) = 43.4519(...)
(iter:39) Newton iteration 2: r (abs) = 2.75733e-10 (...) r (rel) = 44.0977(...)
Newton iteration 3: r (abs) = 1.21123e-08 (...) r (rel) = 1937.1(...)
...
Newton iteration 49: r (abs) = 1.62576e+113 (...) r (rel) = 2.60007e+124(...)
Newton iteration 50: r (abs) = 7.74975e+115 (...) r (rel) = 1.23941e+127(...)
Up to the 38th iteration, the Newton solver finished in 2 iterations and 2 linear solver iterations.
The 37th iteration corresponds to .37 ps, and the 38th to .38 ps. Note that the pulse duration (t_p) is set to .3 ps and the maximum of the gaussian envelope is at .9 ps (3 \cdot t_p), which where the heat deposition is maximized.
I would highly appreciate it if you verify the above implementation and provide any feedback on the development (even redundant steps) or even questions on specific features/components.
I might transition to Picard’s iteration method (similar to this post) if the Newton solver does not converge despite the modifications or is highly sensitive to material (metal) selection. I haven’t tried the "residual"
SNES convergence criterion and any fieldsplit preconditioner under Nitche’s method (see this post for instance).
The fully nonlinear implementation requires the below cell before the (commented hereafter) fallback assignment:
Ce_vals = np.asarray(sim_2d_nonlinear.Ce_func(theta_e_vals), dtype=PETSc.ScalarType).reshape(-1)
G_vals = np.asarray(sim_2d_nonlinear.G_func(theta_e_vals), dtype=PETSc.ScalarType).reshape(-1)
ke_vals = np.asarray(sim_2d_nonlinear.ke_func(theta_e_vals, theta_l_vals), dtype=PETSc.ScalarType).reshape(-1)
t_scaled = 0.0
S_vals = np.asarray(sim_2d_nonlinear.S_func(coords_scaled, t_scaled,
theta_e_vals, theta_l_vals), dtype=PETSc.ScalarType).reshape(-1)
which I hope to employ after addressing the instabilities on the presented simplification.
Similarly, following the general route, I get divergence at the 38th iteration. Quite close to the former case.
I apologize for the lengthy response, but I deem useful the above context for detecting the root cause of the instability.