Hi everyone,
I’m using a FEniCS SingularityCE (version 3.9.8) container built from quay.io/fenicsproject/stable:latest, and I have a weird issue with XDMFFile.
Consider the following script:
# test.py
import fenics
import sys

mesh = fenics.UnitSquareMesh(250, 250)
V = fenics.FunctionSpace(mesh, "CG", 1)
function = fenics.interpolate(fenics.Constant(1.), V)
xdmf_file = fenics.XDMFFile("saved_sim/test.xdmf")

for i in range(100):
    fenics.MPI.comm_world.Barrier()
    xdmf_file.write(function, i)
    if fenics.MPI.comm_world.Get_rank() == 0:
        print(f"Step {i} done")
        sys.stdout.flush()
If I execute this script in parallel with mpirun, from time to time I get this:
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 5:
#000: ../../../src/H5Dio.c line 268 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: ../../../src/H5Dio.c line 344 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: ../../../src/H5Dio.c line 788 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: ../../../src/H5Dmpio.c line 529 in H5D__contig_collective_write(): couldn't finish shared collective MPI-IO
major: Low-level I/O
minor: Write failed
#004: ../../../src/H5Dmpio.c line 1400 in H5D__inter_collective_io(): couldn't finish collective MPI-IO
major: Low-level I/O
minor: Can't get value
#005: ../../../src/H5Dmpio.c line 1444 in H5D__final_collective_io(): optimized write failed
major: Dataset
minor: Write failed
#006: ../../../src/H5Dmpio.c line 297 in H5D__mpio_select_write(): can't finish collective parallel write
major: Low-level I/O
minor: Write failed
#007: ../../../src/H5Fio.c line 196 in H5F_block_write(): write through metadata accumulator failed
major: Low-level I/O
minor: Write failed
#008: ../../../src/H5Faccum.c line 827 in H5F__accum_write(): file write failed
major: Low-level I/O
minor: Write failed
#009: ../../../src/H5FDint.c line 285 in H5FD_write(): driver write request failed
major: Virtual File Layer
minor: Write failed
#010: ../../../src/H5FDmpio.c line 1789 in H5FD_mpio_write(): MPI_File_write_at_all failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#011: ../../../src/H5FDmpio.c line 1789 in H5FD_mpio_write(): Other I/O error , error stack:
ADIOI_NFS_WRITECONTIG(71): Other I/O error Input/output error
major: Internal error (too specific to document in detail)
minor: MPI Error String
Running test.py multiple times, I estimate that the write call fails about 1% of the time.
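One way to reproduce an estimate like this is sketched below. It is only a sketch under stated assumptions: the helper names run_once and failed_run_fraction, the process count of 8, and the python3 launcher are hypothetical and do not come from the setup described above. It also measures the fraction of whole runs that print an HDF5 error stack, which is not the same as a per-call failure rate for write.

```python
import subprocess

# Header line that starts the HDF5 error stack shown above.
ERROR_MARKER = "HDF5-DIAG: Error detected"


def run_once(n_procs=8, script="test.py"):
    """Run the script once under mpirun and return stdout and stderr combined.

    n_procs and the "python3" launcher are assumptions; adjust to the real setup.
    """
    result = subprocess.run(
        ["mpirun", "-n", str(n_procs), "python3", script],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def failed_run_fraction(outputs):
    """Return the fraction of captured outputs that contain an HDF5 error stack."""
    if not outputs:
        raise ValueError("need at least one captured run")
    failed = sum(1 for text in outputs if ERROR_MARKER in text)
    return failed / len(outputs)
```

Collecting the outputs with something like `[run_once() for _ in range(50)]` and passing the list to `failed_run_fraction` gives the per-run fraction. Turning that into a per-call rate would also require counting how many steps each run completed.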
What makes this even weirder is that I have used FEniCS many times and never had this issue before.
Does anybody have a clue how to fix this?
Thank you in advance.