Segmentation fault when using adios4dolfinx

Hello,

I have recently been trying to use the adios4dolfinx library for checkpointing with FEniCSx.

To test the checkpointing functionality, I did a completely fresh install of FEniCSx, PyVista and adios4dolfinx@v0.1.0 on both a macOS laptop and a Linux computer. On the macOS laptop everything works flawlessly. However, on the Linux machine I have not been able to save any checkpoint, since the write_function function of adios4dolfinx@v0.1.0 keeps failing with the following segmentation fault:

[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

I have also tried different adios2 engines and adios4dolfinx versions, but I keep getting the same error. In contrast, the write_mesh function of adios4dolfinx works perfectly on the Linux machine. Any idea what may be causing this behavior of write_function?

Thank you,

Daniel

What version of adios2 is installed on the two systems?

I experienced this issue a while back when I tried to use ADIOS2 with the BP5 engine. It worked with the BP4 engine.

Hi @dokken,

The version of adios2 is 2.8.3 on both systems (installed using micromamba on both).

I have just noticed that, in my case, this issue seems to be related to the adios4dolfinx package itself rather than to the write_function function. In fact, if I import the package but do not use any of the functions it provides, I still get the same segmentation fault.

My belief is that this issue is related to the interaction between adios4dolfinx and some other packages. For instance, if I do not use PyVista in my code, the issue does not seem to occur. Here is an MWE taken from the Poisson demo:

import numpy as np

import ufl
from dolfinx import fem, io, mesh, plot
from ufl import ds, dx, grad, inner

from mpi4py import MPI
from petsc4py.PETSc import ScalarType

import adios4dolfinx

import_pyvista = True

msh = mesh.create_rectangle(comm=MPI.COMM_WORLD,
                            points=((0.0, 0.0), (2.0, 1.0)), n=(32, 16),
                            cell_type=mesh.CellType.triangle,)
V = fem.FunctionSpace(msh, ("Lagrange", 1))

facets = mesh.locate_entities_boundary(msh, dim=1,
                                       marker=lambda x: np.logical_or(np.isclose(x[0], 0.0),
                                                                      np.isclose(x[0], 2.0)))

dofs = fem.locate_dofs_topological(V=V, entity_dim=1, entities=facets)

bc = fem.dirichletbc(value=ScalarType(0), dofs=dofs, V=V)

u = ufl.TrialFunction(V)
v = ufl.TestFunction(V)
x = ufl.SpatialCoordinate(msh)
f = 10 * ufl.exp(-((x[0] - 0.5) ** 2 + (x[1] - 0.5) ** 2) / 0.02)
g = ufl.sin(5 * x[0])
a = inner(grad(u), grad(v)) * dx
L = inner(f, v) * dx + inner(g, v) * ds

problem = fem.petsc.LinearProblem(a, L, bcs=[bc], petsc_options={"ksp_type": "preonly", "pc_type": "lu"})
uh = problem.solve()

with io.XDMFFile(msh.comm, "out_poisson/poisson.xdmf", "w") as file:
    file.write_mesh(msh)
    file.write_function(uh)

if import_pyvista:
    import pyvista
    
    pyvista.OFF_SCREEN = True

    cells, types, x = plot.create_vtk_mesh(V)
    grid = pyvista.UnstructuredGrid(cells, types, x)
    grid.point_data["u"] = uh.x.array.real
    grid.set_active_scalars("u")
    plotter = pyvista.Plotter()
    plotter.add_mesh(grid, show_edges=True)
    warped = grid.warp_by_scalar()
    plotter.add_mesh(warped)
    if pyvista.OFF_SCREEN:
        pyvista.start_xvfb(wait=0.1)
        plotter.screenshot("uh_poisson.png")
    else:
        plotter.show()

If I set import_pyvista = True the issue appears, and when import_pyvista = False it does not occur. Note that I am not using the adios4dolfinx package, only importing it. Also, in case you are wondering which version of PyVista I have installed, it is 0.42.0 on both macOS and Linux.

Could you check that it is not just pyvista that triggers this error? That is, if you remove the adios4dolfinx import, do you still see the error message?

adios4dolfinx is a pure Python package, relying only on ADIOS2 (which is already an optional dependency of dolfinx).

If I remove adios4dolfinx the error disappears.

If I just import adios2 instead of adios4dolfinx, no error is shown, so adios2 is definitely not causing the problem. However, I checked the other adios4dolfinx dependencies and noticed that the error still occurs when I import numba (version 0.57.1) on its own. Might numba be breaking something when interacting with PyVista?
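
The import-by-import check described above can be automated so that each combination runs in a fresh interpreter and a crash in one run does not affect the next. This is only a minimal sketch: the helper name run_with_imports and the commented-out PyVista snippet are hypothetical and not part of adios4dolfinx or PyVista, and whether that snippet alone reproduces the crash is an assumption.

```python
import subprocess
import sys


def run_with_imports(modules, body="pass"):
    """Import `modules` in a fresh interpreter, run `body`, and return the exit code."""
    lines = [f"import {name}" for name in modules] + [body]
    result = subprocess.run(
        [sys.executable, "-c", "\n".join(lines)],
        capture_output=True,
        text=True,
    )
    # Zero means the child finished cleanly; any other value means it failed
    # (an exception, an abort, or termination by a signal).
    return result.returncode


# Hypothetical usage with the packages discussed in this thread:
# plotting = "import pyvista; pyvista.OFF_SCREEN = True; pyvista.Plotter().close()"
# for mods in (["adios2"], ["numba"], ["adios4dolfinx"]):
#     print(mods, "->", run_with_imports(mods, plotting))
```

A non-zero exit code for one set of imports and zero for the others points at the package worth looking into first.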

I'm not sure.
You could remove the numba JIT from the source code manually if it helps you (it is only there to speed things up; nothing depends strongly on it).

Manually removing the numba dependency from adios4dolfinx solves the problem. Now, the code works even with the BP5 ADIOS2 engine.
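
For anyone who wants to keep the JIT optional instead of deleting it, one common pattern is to fall back to a no-op decorator when numba is not wanted. The sketch below is not the adios4dolfinx source: the USE_NUMBA environment variable and the sum_of_squares function are made up for illustration. numba is only imported when explicitly requested, so the default path never loads it.

```python
import os

# Hypothetical opt-in switch: numba is imported only when USE_NUMBA=1 is set.
USE_NUMBA = os.environ.get("USE_NUMBA", "0") == "1"

if USE_NUMBA:
    try:
        from numba import njit
    except ImportError:
        USE_NUMBA = False

if not USE_NUMBA:
    def njit(*args, **kwargs):
        """No-op stand-in for numba.njit that returns the function unchanged."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]  # used as a bare decorator: @njit
        return lambda func: func  # used with arguments: @njit(...)


@njit
def sum_of_squares(n):
    """Sum i*i for i in range(n); runs as plain Python unless numba is enabled."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(4))
```

With the switch left unset the decorated function is ordinary Python, which is slower but avoids importing numba at all.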