HPC Cluster installation - Dolfin 2019.1.0 - HDF5 error

Hello all,

I have been trying to install dolfin 2019.1.0 on an HPC cluster for a while now, and I am currently running into issues with HDF5.

When I run a test script directly (without mpirun) on the master node, I get the following error:

HDF5-DIAG: Error detected in HDF5 (1.8.12) thread 0:
  #000: ../../src/H5P.c line 275 in H5Pcreate(): not a property list class
    major: Invalid arguments to routine
    minor: Inappropriate type
Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.12.2, library is 1.8.12

General Information:
		   HDF5 Version: 1.8.12
		  Configured on: Wed May  8 18:40:19 UTC 2019
		  Configured by: mockbuild@buildvm-03.phx2.fedoraproject.org
		 Configure mode: production
		    Host system: x86_64-redhat-linux-gnu
	      Uname information: Linux buildvm-03.phx2.fedoraproject.org 5.0.6-200.fc29.x86_64 #1 SMP Wed Apr 3 15:09:51 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
		       Byte sex: little-endian
		      Libraries: static, shared
	     Installation point: /usr

When I run it through sbatch (with mpirun), I get the following error:

Traceback (most recent call last):
  File "/mnt/beegfs/workdir/USER_WORKDIR/TESTPY/ElasticNewtonRaphson.py", line 8, in <module>
    from .cpp import __version__
ImportError: libhdf5.so.8: cannot open shared object file: No such file or directory


HDF5 (1.12.2) is already installed and available as a module on the cluster.

These are the installation steps I followed:

module load gcc/10.2.0
module load anaconda3/2020.11
module load openmpi/4.1.4
module load cmake/3.19.7
module load openblas/0.3.15
module load boost/1.75.0
module load hdf5/1.12.2

conda create -n FEniCS  python=3.9.2
conda activate FEniCS

pip3 install numpy
pip3 install cython
pip3 install pkgconfig
pip3 install pybind11==2.2.4

cd /mnt/beegfs/workdir/USER_WORKDIR/CONDA/INSTALL

# Install h5py against the parallel HDF5 (1.12.2) build

export HDF5_LIBDIR=/mnt/beegfs/softs/opt/gcc_10.2.0/openmpi_4.1.4/hdf5/1.12.2/lib
export HDF5_INCLUDEDIR=/mnt/beegfs/softs/opt/gcc_10.2.0/openmpi_4.1.4/hdf5/1.12.2/include
export HDF5_DIR=/mnt/beegfs/softs/opt/gcc_10.2.0/openmpi_4.1.4/hdf5/1.12.2
export HDF5_VERSION=1.12.2
export HDF5_MPI="ON"

git clone https://github.com/h5py/h5py.git && cd h5py/

python3 setup.py config
python3 setup.py build
python3 setup.py install

cd ..

# Install PETSC

export PETSC_VERSION=3.15.5

wget -nc https://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-${PETSC_VERSION}.tar.gz \
     -O petsc-${PETSC_VERSION}.tar.gz --no-check-certificate

tar -xvf petsc-${PETSC_VERSION}.tar.gz

cd petsc-${PETSC_VERSION}

python3 ./configure \
       --with-mpi-dir=/mnt/beegfs/softs/opt/gcc_10.2.0/openmpi/4.1.4 \
       --with-openblas-dir=/mnt/beegfs/softs/opt/gcc_10.2.0/openblas/0.3.15 \
       --COPTFLAGS='-O3 -march=native' \
       --CXXOPTFLAGS='-O3 -march=native' \
       --FOPTFLAGS='-O3 -march=native' \
       --with-mpi=1 \
       --with-cxx-dialect=C++11 \
       --with-mkl_pardiso=0 \
       --with-debugging=0 \
       --download-ptscotch \
       --download-hypre \
       --download-scalapack \
       --download-mumps \
       --download-parmetis \
       --download-suitesparse \
       --download-superlu_dist \
       --download-metis \
       --download-blacs \
       --download-spai \
       --prefix=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/petsc-3.15.5 \
       --known-mpi-shared-libraries=1

make install
make check
export PETSC_DIR=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/petsc-3.15.5

export PETSC4PY_VERSION=3.15.1
pip3 install petsc4py==3.15.1

cd ..

# Install Eigen library for dolfin

export EIGEN_VERSION=3.3.9

wget -nc https://gitlab.com/libeigen/eigen/-/archive/${EIGEN_VERSION}/eigen-${EIGEN_VERSION}.tar.gz \
     -O eigen-${EIGEN_VERSION}.tar.gz
mkdir eigen-${EIGEN_VERSION}
tar -xf eigen-${EIGEN_VERSION}.tar.gz \
    -C eigen-${EIGEN_VERSION} \
    --strip-components 1
cd eigen-${EIGEN_VERSION}
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/eigen-3.3.9 \
      -DCMAKE_CXX_FLAGS='-march=native' ..
make install

cd ../..

# Install Dolfin 2019.1.0 (source previously downloaded from git, with local modifications)

pip3 install fenics-ffc --upgrade
export FENICS_VERSION=$(python3 -c"import ffc; print(ffc.__version__)")
cd dolfin
mkdir build && cd build

cmake -DCMAKE_INSTALL_PREFIX=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/dolfin-2019.1.0 \
-DEIGEN3_INCLUDE_DIR=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/eigen-3.3.9/include/eigen3  .. 

make -j8
make install

source /mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/dolfin-2019.1.0/share/dolfin/dolfin.conf

cd ../python

export pybind11_DIR=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/pybind11-2.2.4/share/cmake/pybind11

pip -v install .

The cluster has multiple HDF5 installations, but I am trying to use only the parallel one.

The problem is exactly what the error says. Your dolfin installation was built against libhdf5.so.8, meaning HDF5 version 1.8, but you haven’t made that version available at runtime.
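
One way to see which HDF5 libraries an environment can find through its modules is to scan LD_LIBRARY_PATH for libhdf5.so files, once on the master node and once inside the sbatch job. The following is only a sketch: list_hdf5_on_path is a hypothetical helper name, and it does not cover libraries that the loader finds through system default directories (such as the serial 1.8.12 build under /usr).

```shell
#!/usr/bin/env bash
# List every libhdf5.so* file found in a colon-separated search path.
# With no argument, the current LD_LIBRARY_PATH is scanned.
list_hdf5_on_path() {
    local IFS=':' dir f
    for dir in ${1:-$LD_LIBRARY_PATH}; do
        [ -d "$dir" ] || continue          # skip empty or missing entries
        for f in "$dir"/libhdf5.so*; do
            [ -e "$f" ] && printf '%s\n' "$f"
        done
    done
    return 0
}

# Run this on the master node and again from within the batch script,
# then compare the two listings.
list_hdf5_on_path
```

If the listing inside the job shows only the 1.12.2 module while dolfin asks for libhdf5.so.8, that would be consistent with the ImportError above.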

Could we specify which HDF5 to use in CMake when building dolfin?
I am new to CMake and to these installation procedures.

Well sure, if you want to rebuild dolfin, you can configure using -D HDF5_C_COMPILER_EXECUTABLE:FILEPATH=/usr/bin/h5pcc, which should pick up your hdf5 installation via h5pcc (you’ll almost certainly need a different filepath according to your loaded module).
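
A rebuild along those lines might look like the sketch below. It assumes the hdf5/1.12.2 module prefix is the HDF5_DIR exported earlier in the thread and that the module ships its compiler wrapper at bin/h5pcc (not confirmed here); dolfin_cmake_args is a hypothetical helper. Starting from an empty build directory matters because CMake keeps previously detected HDF5 paths in CMakeCache.txt.

```shell
#!/usr/bin/env bash
# Assumed locations: HDF5 prefix and install prefix as used earlier in the thread.
HDF5_PREFIX="${HDF5_PREFIX:-/mnt/beegfs/softs/opt/gcc_10.2.0/openmpi_4.1.4/hdf5/1.12.2}"
SOFT=/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT

# Emit the configure arguments one per line so they can be reviewed first.
dolfin_cmake_args() {
    printf '%s\n' \
        "-DCMAKE_INSTALL_PREFIX=${SOFT}/dolfin-2019.1.0" \
        "-DEIGEN3_INCLUDE_DIR=${SOFT}/eigen-3.3.9/include/eigen3" \
        "-DHDF5_C_COMPILER_EXECUTABLE:FILEPATH=${HDF5_PREFIX}/bin/h5pcc"
}

dolfin_cmake_args

# On the cluster, from the dolfin source directory (uncomment to run):
# export HDF5_ROOT="${HDF5_PREFIX}"   # search hint read by CMake's FindHDF5
# rm -rf build && mkdir build && cd build
# mapfile -t args < <(dolfin_cmake_args)
# cmake "${args[@]}" ..
# grep -i hdf5 CMakeCache.txt         # confirm which HDF5 was picked up
```

The Python layer (pip -v install . in dolfin/python) would need to be rebuilt afterwards as well, since it links against the freshly installed library.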

But wouldn’t it be simpler to just load the hdf5 1.8 module?


@dparsons I’ll try rebuilding fenics with this option.
HDF5 1.8 is actually installed system-wide (under /usr, per the error output), but it is a serial build, not a parallel one. HDF5 1.12.2 is the parallel build and lives in a different directory.

I initially thought this was an issue with h5py and verified that it is linked against HDF5 1.12.2, but it seems dolfin is picking up the wrong version from the path. I’ll get back to you on this soon.
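
To check which libhdf5 the built dolfin library resolves to at runtime, ldd can be run on it and filtered for HDF5 entries. This is a sketch under assumptions: hdf5_lines is a hypothetical helper, and the libdolfin.so location is guessed from the install prefix used in this thread (it may be under lib64/ instead).

```shell
#!/usr/bin/env bash
# Keep only the HDF5 entries of an `ldd` listing read from stdin.
hdf5_lines() {
    grep -E 'libhdf5[^ ]*\.so' || echo "no libhdf5 dependency found"
}

# Assumed path; override by setting DOLFIN_LIB before running.
DOLFIN_LIB="${DOLFIN_LIB:-/mnt/beegfs/workdir/USER_WORKDIR/CONDA/SOFT/dolfin-2019.1.0/lib/libdolfin.so}"

if [ -e "$DOLFIN_LIB" ]; then
    ldd "$DOLFIN_LIB" | hdf5_lines
else
    echo "not found: $DOLFIN_LIB"
fi
```

A line mentioning libhdf5.so.8 would indicate that the build linked against the serial 1.8 library, whereas the 1.12.2 module would show up with a different soname and a path under the module prefix.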
