Building from source: mpi4py, petsc4py conflict

I’m trying to build FEniCS from source on our HPC facility, SHARCNET. (This is not my first rodeo: I have successfully installed from source on other Linux systems.) Earlier today I downloaded the repositories for fiat, ffc, ufl, dijitso, dolfin, and mshr (with git clone …). Everything works until I try to install the dolfin Python package. I’m installing into a virtualenv located at $FENICS, and that virtualenv is active, so pip3 is the virtualenv’s pip3. So I do this:

cd dolfin.19_03_23
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=$FENICS ..
make VERBOSE=1
make VERBOSE=1 install

That seems to work fine. Now for python:

cd python
setenv PYBIND11_DIR $FENICS # (Yes, I’m using tcsh. So sue me.)
pip3 -vv install .

This produces many horrible error messages. Here’s a link to the full output: http://www.math.uwaterloo.ca/~lavery/fenics/install.out . I’ll just copy one of the messages here:

In file included from /home/lavery/projects/def-bingalls/lavery/FENICS/lib/python3.7/site-packages/mpi4py/include/mpi4py/mpi4py.h:14:0,
from /tmp/pip-req-build-5tjlo4ma/src/mpi_casters.h:28,
from /tmp/pip-req-build-5tjlo4ma/src/casters.h:26,
from /tmp/pip-req-build-5tjlo4ma/src/la.cpp:29:
/home/lavery/projects/def-bingalls/lavery/FENICS/lib/python3.7/site-packages/mpi4py/include/mpi4py/mpi4py.MPI_api.h: In function ‘int import_mpi4py__MPI()’:
/home/lavery/projects/def-bingalls/lavery/FENICS/lib/python3.7/site-packages/mpi4py/include/mpi4py/mpi4py.MPI_api.h:261:113: error: cannot convert ‘const char*’ to ‘PyObject* {aka _object*}’ for argument ‘1’ to ‘PyTypeObject* __Pyx_ImportType(PyObject*, const char*, const char*, size_t, __Pyx_ImportType_CheckSize)’
__pyx_ptype_6mpi4py_3MPI_Status = __Pyx_ImportType("mpi4py.MPI", "Status", sizeof(struct PyMPIStatusObject), 1); if (!__pyx_ptype_6mpi4py_3MPI_Status) goto bad;

I get that message about two dozen times. In addition, I get a similar warning that seems to come from petsc4py, just once:

In file included from /home/lavery/projects/def-bingalls/lavery/FENICS/lib/python3.7/site-packages/petsc4py/include/petsc4py/petsc4py.h:9:0,
from /tmp/pip-req-build-5tjlo4ma/src/petsc_casters.h:33,
from /tmp/pip-req-build-5tjlo4ma/src/casters.h:27,
from /tmp/pip-req-build-5tjlo4ma/src/nls.cpp:37:
/home/lavery/projects/def-bingalls/lavery/FENICS/lib/python3.7/site-packages/petsc4py/include/petsc4py/petsc4py.PETSc_api.h:208:22: warning: ‘PyTypeObject* __Pyx_ImportType(PyObject*, const char*, const char*, size_t, __Pyx_ImportType_CheckSize)’ used but never defined
static PyTypeObject *__Pyx_ImportType(PyObject *module, const char *module_name, const char *class_name, size_t size, enum __Pyx_ImportType_CheckSize check_size);
^~~~~~~~~~~~~~~~

I have mpi4py 3.0.0 and petsc4py 3.10.1. I haven’t really checked that they work yet, but I can import them successfully in ipython, and one of the demos that comes with petsc4py runs correctly.
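
For anyone wanting to confirm which versions a virtualenv actually picks up, here is a minimal sketch. The helper names `version_tuple` and `installed_version` are made up for illustration, and it assumes both packages expose a `__version__` attribute. The 3.0.1 threshold comes from the follow-up post further down this thread, which reports that mpi4py 3.0.1 resolved the compile error.

```python
import importlib


def version_tuple(text):
    """Turn '3.0.1' into (3, 0, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


# Report what the active interpreter sees for the two packages named above.
for name in ("mpi4py", "petsc4py"):
    print(name, installed_version(name))

# Assumption taken from later in this thread: mpi4py 3.0.1 fixed the error.
found = installed_version("mpi4py")
if found is not None and version_tuple(found) < version_tuple("3.0.1"):
    print("mpi4py is older than 3.0.1")
```

Importing the top-level `mpi4py` package (rather than `mpi4py.MPI`) should not initialize MPI, so this ought to be safe to run on a login node.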

In case it is relevant, here’s some info about compilers:

$ mpicc --version
icc (ICC) 18.0.3 20180410
Copyright © 1985-2018 Intel Corporation. All rights reserved.

$ mpic++ --version
icpc (ICC) 18.0.3 20180410
Copyright © 1985-2018 Intel Corporation. All rights reserved.

$ cc --version
gcc (GCC) 7.3.0
Copyright © 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ c++ --version
c++ (GCC) 7.3.0
Copyright © 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

and the OS:
$ uname -a
Linux gra-login2 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 GNU/Linux

Thrashing around: I tried uninstalling mpi4py, then installing dolfin. That seems to work, in the sense that pip3 produces no error messages. But if I then try to import fenics in ipython, it throws an exception:

/project/6002183/lavery/fenics/dolfin.19_03_23/python/dolfin/__init__.py in <module>()
32
33 # Import cpp modules
---> 34 from .cpp import __version__
35
36 from .cpp.common import (Variable, has_debug, has_hdf5, has_scotch,

ModuleNotFoundError: No module named 'dolfin.cpp'
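
One way to narrow down an error like this is to check which `dolfin` directory Python resolves and whether a compiled `cpp` extension file sits in it. The sketch below is only that, a sketch: `compiled_module_files` is a hypothetical helper, and the names `dolfin` and `cpp` are taken from the traceback above. It deliberately calls `find_spec` on the top-level package only, since that does not import the package and so does not trigger the same exception.

```python
import importlib.util
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def compiled_module_files(package, stem):
    """List extension-module files named `stem` inside `package`'s directories.

    Returns an empty list if the package cannot be located or no compiled
    file (for example cpp.cpython-37m-x86_64-linux-gnu.so) is present.
    """
    spec = importlib.util.find_spec(package)  # top-level lookup, no import
    if spec is None or not spec.submodule_search_locations:
        return []
    found = []
    for location in spec.submodule_search_locations:
        for suffix in EXTENSION_SUFFIXES:
            candidate = Path(location) / (stem + suffix)
            if candidate.is_file():
                found.append(candidate)
    return found


# Show where 'dolfin' would be loaded from and whether 'cpp' was built there.
spec = importlib.util.find_spec("dolfin")
print("dolfin found at:", None if spec is None else list(spec.submodule_search_locations or []))
print("compiled cpp files:", compiled_module_files("dolfin", "cpp"))
```

Note that the traceback path points into the source checkout (dolfin.19_03_23/python/dolfin) rather than the virtualenv's site-packages, so a check like this may show that Python is picking up the unbuilt source tree, for instance when the interpreter is started from inside the python directory.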

Thanks for any help.

Another thing to look into might be to use a Singularity image, to avoid having to build from source on the cluster. A quick Google search shows that at least some SHARCNET resources support Singularity. You can build Singularity images directly from Docker containers on your personal machine. I found this to be the most convenient way to run FEniCS on TACC supercomputers.

OK, I got it installed, though with some pain. First, after some more searching, I found that the mpi4py incompatibility had previously been reported here: https://bitbucket.org/mpi4py/mpi4py/issues/112/new-release-needed . A new release of mpi4py, 3.0.1, fixes it. I tried to install that with pip, but pip refused to acknowledge that there was a version more recent than 3.0.0, so I downloaded the mpi4py-3.0.1 tarball and installed from source. That went fine. And, miracle of miracles, the pip install of dolfin also worked without any reported errors.

But there was still trouble in paradise. When I tried to import fenics, I got errors about conflicts in the shared object cpp.cpython-37m-x86_64-linux-gnu.so . In particular, it was loading several libraries from /lib64. On this system, those libraries are largely nonfunctional. Instead, we need to use shared libraries found in Lua environment modules whose directories are listed in LIBRARY_PATH. The question then was why the cpp shared library insisted on looking in /lib64.

pip does its build in temporary directories, which it then deletes. So, in order to do the build somewhere that I could get back to, I built by direct invocation of setup.py:

setenv PYBIND11_DIR $FENICS
python setup.py -v build
python setup.py -v install

To find out why /lib64 objects were being invoked, I edited CMakeLists.txt to contain the following line:

set(CMAKE_VERBOSE_MAKEFILE ON)

I then repeated the build and install, this time getting full output from make. The command to create the dolfin.cpp library was roughly as follows (I have shortened it):

c++ -fPIC -DVERSION_INFO="2018.2.0.dev0" -O3 -DNDEBUG -shared -Wl,-soname,cpp.cpython-37m-x86_64-linux-gnu.so -o …/lib.linux-x86_64-3.7/dolfin/cpp.cpython-37m-x86_64-linux-gnu.so CMakeFiles/cpp.dir/src/dolfin.cpp.o <many more .o files> -L/lib64 -Wl,-rpath,/lib64 -Flto /home/lavery/projects/def-bingalls/lavery/FENICS/lib64/libdolfin.so.2018.2.0.dev0

It is the -L/lib64 and -Wl,-rpath,/lib64 options that cause the trouble, of course. I thus manually re-ran the link command with these two options deleted. That gave me a shared object that, according to ldd, references nothing in /lib64. I copied it into the appropriate places in my virtualenv. Now I can import fenics and mpi4py. What’s more:

In [2]: import fenics as fe

In [3]: fe.MPI.comm_world
Out[3]: <mpi4py.MPI.Intracomm at 0x7fdb3412ef30>

So the fenics mpi4py integration is indeed working as it ought to. At the moment, it looks like a happy ending.
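
The two manual steps described above (dropping the /lib64 options from the link command, then checking the result with ldd) could be scripted roughly as follows. This is a sketch under assumptions: `strip_link_flags` and `libraries_under` are hypothetical helper names, the sample link command is shortened, and the sample ldd text is made up to show the usual `name => path (address)` layout rather than copied from a real run.

```python
import shlex


def strip_link_flags(command, unwanted):
    """Return `command` with every argument listed in `unwanted` removed."""
    kept = [arg for arg in shlex.split(command) if arg not in unwanted]
    return " ".join(shlex.quote(arg) for arg in kept)


def libraries_under(ldd_output, prefix):
    """Return (name, path) pairs from ldd-style text whose path lies under `prefix`."""
    root = prefix.rstrip("/") + "/"
    hits = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. the vdso line or the dynamic loader itself
        name, _, target = line.partition("=>")
        fields = target.split()
        if fields and fields[0].startswith(root):
            hits.append((name.strip(), fields[0]))
    return hits


# Shortened, illustrative link command using the two flags named in the post.
LINK = "c++ -shared -o cpp.so dolfin.cpp.o -L/lib64 -Wl,-rpath,/lib64"
print(strip_link_flags(LINK, {"-L/lib64", "-Wl,-rpath,/lib64"}))

# Made-up ldd output; library names and addresses are placeholders.
SAMPLE_LDD = """\
\tlinux-vdso.so.1 (0x00007ffd0b5f2000)
\tlibexample.so.1 => /lib64/libexample.so.1 (0x00007f0000000000)
\tlibother.so.2 => /opt/hypothetical/lib/libother.so.2 (0x00007f0000200000)
"""
print(libraries_under(SAMPLE_LDD, "/lib64"))
```

To use it on a real build, the text passed to `libraries_under` would be the captured output of running ldd on the rebuilt shared object; an empty result corresponds to the "references nothing in /lib64" outcome reported above.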

We do indeed have Singularity available. I will look at this for the future.

Thanks.
