Error when trying to solve complex eigenvalue problem in parallel

Hello everyone,

When trying to solve a complex eigenvalue problem with the latest stable version of dolfinx, I ran into problems in parallel: EPS.solve() either never converges or throws a 'double free or corruption (out)' error.

To make sure this was not related to my code, I tried to run one of the demos from the dolfinx GitHub repository: dolfinx/python/demo/demo_waveguide/demo_half_loaded_waveguide.py (main branch of FEniCS/dolfinx).

First, I tried the real build of petsc and slepc with dolfinx 0.7.3, installed in a conda environment via conda install -c conda-forge fenics-dolfinx. With this build, I was able to run the demo both in serial and in parallel.

Then, I tried a complex build: conda install -c conda-forge fenics-dolfinx petsc=*=*complex* slepc=*=*complex*
With this build, however, the demo runs fine on 1 process, but when run in parallel I get the same problem as with my own code: when EPS.solve() is called, it either never returns or throws a 'double free or corruption (out)' error.
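For reference, a minimal sanity check that the complex build is actually the one being picked up (it relies only on petsc4py's ScalarType and dolfinx's default_scalar_type):

import numpy as np
import dolfinx
from petsc4py import PETSc

# Both should report complex128 with the complex build
print(PETSc.ScalarType)
print(dolfinx.default_scalar_type)
assert np.issubdtype(PETSc.ScalarType, np.complexfloating)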
I was wondering if anyone could reproduce this. Maybe there is something needed to make the demo compatible with the complex build that I am also missing in my own code.
Thanks in advance!

After investigating, I believe this error is linked to MUMPS: when I set the factorization solver using

eps = SLEPc.EPS().create(msh.comm)
st = eps.getST()
K = st.getKSP()
K.setType(PETSc.KSP.Type.PREONLY)
PC = K.getPC()
PC.setType("lu")
PC.setFactorSolverType("mumps")
PC.setFromOptions()

I get the same problem even when running the demo on just 1 process with the complex build (which was working fine before, since the default factorization solver in serial is not MUMPS): it either never returns or throws 'double free or corruption (out)'. Here is an MWE that is basically taken from the demo:

import sys

from mpi4py import MPI

import numpy as np

import ufl
from basix.ufl import element, mixed_element
from dolfinx import default_scalar_type, fem, io, plot
from dolfinx.fem.petsc import assemble_matrix
from dolfinx.mesh import (CellType, create_rectangle, exterior_facet_indices,
                          locate_entities)

try:
    import pyvista
    have_pyvista = True
except ModuleNotFoundError:
    print("pyvista and pyvistaqt are required to visualise the solution")
    have_pyvista = False

try:
    from slepc4py import SLEPc
except ModuleNotFoundError:
    print("slepc4py is required for this demo")
    sys.exit(0)

from petsc4py import PETSc

w = 1
h = 0.45 * w
d = 0.5 * h
nx = 300
ny = int(0.4 * nx)

msh = create_rectangle(MPI.COMM_WORLD, np.array([[0, 0], [w, h]]), np.array([nx, ny]), CellType.quadrilateral)
msh.topology.create_connectivity(msh.topology.dim - 1, msh.topology.dim)

eps_v = 1
eps_d = 2.45


def Omega_d(x):
    return x[1] <= d


def Omega_v(x):
    return x[1] >= d


D = fem.functionspace(msh, ("DQ", 0))
eps = fem.Function(D)

cells_v = locate_entities(msh, msh.topology.dim, Omega_v)
cells_d = locate_entities(msh, msh.topology.dim, Omega_d)

eps.x.array[cells_d] = np.full_like(cells_d, eps_d, dtype=default_scalar_type)
eps.x.array[cells_v] = np.full_like(cells_v, eps_v, dtype=default_scalar_type)

degree = 1
RTCE = element("RTCE", msh.basix_cell(), degree)
Q = element("Lagrange", msh.basix_cell(), degree)
V = fem.functionspace(msh, mixed_element([RTCE, Q]))


lmbd0 = h / 0.2
k0 = 2 * np.pi / lmbd0

et, ez = ufl.TrialFunctions(V)
vt, vz = ufl.TestFunctions(V)

a_tt = (ufl.inner(ufl.curl(et), ufl.curl(vt)) - (k0**2) * eps * ufl.inner(et, vt)) * ufl.dx
b_tt = ufl.inner(et, vt) * ufl.dx
b_tz = ufl.inner(et, ufl.grad(vz)) * ufl.dx
b_zt = ufl.inner(ufl.grad(ez), vt) * ufl.dx
b_zz = (ufl.inner(ufl.grad(ez), ufl.grad(vz)) - (k0**2) * eps * ufl.inner(ez, vz)) * ufl.dx

a = fem.form(a_tt)
b = fem.form(b_tt + b_tz + b_zt + b_zz)

bc_facets = exterior_facet_indices(msh.topology)
bc_dofs = fem.locate_dofs_topological(V, msh.topology.dim - 1, bc_facets)
u_bc = fem.Function(V)
with u_bc.vector.localForm() as loc:
    loc.set(0)
bc = fem.dirichletbc(u_bc, bc_dofs)

### Solve the problem in SLEPc

A = assemble_matrix(a, bcs=[bc])
A.assemble()
B = assemble_matrix(b, bcs=[bc])
B.assemble()


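# Note: this rebinds the name `eps` (previously the permittivity Function).
# The forms and matrices above are already assembled, so this is harmless here.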
eps = SLEPc.EPS().create(msh.comm)
eps.setOperators(A, B)

eps.setProblemType(SLEPc.EPS.ProblemType.GNHEP)

tol = 1e-9
eps.setTolerances(tol=tol)


eps.setType(SLEPc.EPS.Type.KRYLOVSCHUR)

# Get ST context from eps
st = eps.getST()

# Set shift-and-invert transformation
st.setType(SLEPc.ST.Type.SINVERT)
eps.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_REAL)

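# Shift-and-invert target: the eigenvalues are -kz**2, so this targets kz close to 0.5 * k0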
eps.setTarget(-(0.5 * k0)**2)

eps.setDimensions(nev=1)

### ADDITIONAL PARAMETERS FOR EIGENSOLVER (force LU factorization with MUMPS)

K = st.getKSP()
K.setType(PETSc.KSP.Type.PREONLY)
PC = K.getPC()
PC.setType("lu")
PC.setFactorSolverType("mumps")
PC.setFromOptions()
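# The same configuration can also be set through the PETSc options database
# ("st_" is the prefix SLEPc uses for the spectral transform's inner KSP/PC).
# Kept commented out here; it should be equivalent to the explicit calls above:
# opts = PETSc.Options()
# opts["st_ksp_type"] = "preonly"
# opts["st_pc_type"] = "lu"
# opts["st_pc_factor_mat_solver_type"] = "mumps"
# eps.setFromOptions()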

###

eps.solve()
eps.view()
eps.errorView()


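# Convert the converged eigenvalues back to propagation constants (eigenvalue = -kz**2)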
vals = [(i, np.sqrt(-eps.getEigenvalue(i))) for i in range(eps.getConverged())]
vals.sort(key=lambda x: x[1].real)

eh = fem.Function(V)

kz_list = []

for i, kz in vals:
    # Save eigenvector in eh
    eps.getEigenpair(i, eh.vector)
    # Compute error for i-th eigenvalue
    error = eps.computeError(i, SLEPc.EPS.ErrorType.RELATIVE)
    # Verify, save and visualize solution
    if error < tol and np.isclose(kz.imag, 0, atol=tol):
        kz_list.append(kz)
        print(f"eigenvalue: {-kz**2}")
        print(f"kz: {kz}")
        print(f"kz/k0: {kz / k0}")

        

Could this be a problem with my installation of petsc/slepc, or is anyone able to reproduce it? Also, switching the preconditioner type to cholesky gives the same problem.
Switching the factorization solver to superlu_dist does seem to make the solver converge in parallel, but it takes an unexpectedly long time.
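For completeness, those two variants are just one-line changes to the PC setup of the MWE above (a sketch, reusing the PC object created there):

# Variant 1: Cholesky factorization with MUMPS -- same hang / 'double free or corruption' error
PC.setType("cholesky")
PC.setFactorSolverType("mumps")

# Variant 2: LU factorization with SuperLU_DIST -- converges in parallel, but unexpectedly slowly
PC.setType("lu")
PC.setFactorSolverType("superlu_dist")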

Hi everyone, sorry for the monologue, but I found a partial solution to my problem, so I figured it might be useful to other people. I also think this amounts to a bug report, but I might be wrong.

Since the beginning of 2024, the default version of the hdf5 package on conda-forge has been 1.14.3. After comparing packages with an older dolfinx installation of mine that ran complex eigensolvers fine, I tried installing dolfinx in a conda environment as follows:

conda install -c conda-forge python=3.10 mpich fenics-dolfinx hdf5=1.14.2 petsc=*=complex* slepc=*=complex*

to downgrade the hdf5 package. This solved my problem: I can now run the demo with a complex build in parallel with MUMPS.

This command does not install the latest stable release of dolfinx (0.7.3) but 0.7.1, and it also does not install the latest releases of petsc, slepc and mumps. I haven't looked much further for a way to get both the latest dolfinx and hdf5 1.14.2, as this is enough for me for now.
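For the record, a quick way to see from Python which versions actually ended up in the environment (a minimal sketch using the standard dolfinx/petsc4py/slepc4py version queries; the hdf5 version itself can be checked with conda list):

import dolfinx
from petsc4py import PETSc
from slepc4py import SLEPc

print(dolfinx.__version__)     # 0.7.1 in the pinned environment
print(PETSc.Sys.getVersion())  # (major, minor, subminor) of PETSc
print(SLEPc.Sys.getVersion())  # (major, minor, subminor) of SLEPc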

cc @minrk in case this is a report that he wants to look into