Problem using and installing other Python modules on Docker image dolfinx/dolfinx:stable

Hello,

I am trying to use the stable Docker image of dolfinx to run code. I am new to Docker and might be making silly mistakes.

I create a new Docker container using the following command:

docker run -ti -v $(pwd):/root/shared -w /root/shared dolfinx/dolfinx:stable

I now have two problems:

First, a hello-world Python script works fine, but as soon as I import dolfinx in a Python script I get the following error:

root@04eac6a484bc:~/shared# python3 ks.py 
Abort(740937615): Fatal error in internal_Init_thread: Other MPI error, error stack:
internal_Init_thread(60).: MPI_Init_thread(argc=(nil), argv=(nil), required=3, provided=0x7ffd6abc56a0) failed
MPII_Init_thread(209)....: 
MPID_Init(443)...........: 
MPIDI_OFI_init_local(638): 
create_vni_context(1020).: 
create_vni_domain(1194)..: OFI fi_open domain failed (ofi_init.c:1194:create_vni_domain:Invalid argument)

Is this error linked to the "container settings" or to my host machine?

Second:

Trying to install additional Python packages doesn't work and leads to this error:

root@ead8ef7f6bb9:~# pip install torch
Collecting torch
  Downloading torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl (887.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/887.5 MB ? eta -:--:--ERROR: Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/base_command.py", line 165, in exc_logging_wrapper
    status = run_func(*args)
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "/usr/lib/python3/dist-packages/pip/_internal/commands/install.py", line 339, in run
    requirement_set = resolver.resolve(
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/resolver.py", line 94, in resolve
    result = self._result = resolver.resolve(
  File "/usr/lib/python3/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/lib/python3/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/usr/lib/python3/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
    if not criterion.candidates:
  File "/usr/lib/python3/dist-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
    return bool(self._sequence)
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
    return any(self)
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
    return (c for c in iterator if id(c) not in self._incompatible_ids)
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 47, in _iter_built
    candidate = func()
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/factory.py", line 215, in _make_candidate_from_link
    self._link_candidate_cache[link] = LinkCandidate(
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 288, in __init__
    super().__init__(
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 158, in __init__
    self.dist = self._prepare()
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 227, in _prepare
    dist = self._prepare_distribution()
  File "/usr/lib/python3/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 299, in _prepare_distribution
    return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
  File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 487, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
  File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 532, in _prepare_linked_requirement
    local_file = unpack_url(
  File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 214, in unpack_url
    file = get_http_url(
  File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 94, in get_http_url
    from_path, content_type = download(link, temp_dir.path)
  File "/usr/lib/python3/dist-packages/pip/_internal/network/download.py", line 146, in __call__
    for chunk in chunks:
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/progress_bars.py", line 303, in _rich_progress_bar
    with progress:
  File "/usr/lib/python3/dist-packages/pip/_vendor/rich/progress.py", line 652, in __enter__
    self.start()
  File "/usr/lib/python3/dist-packages/pip/_vendor/rich/progress.py", line 643, in start
    self.live.start(refresh=True)
  File "/usr/lib/python3/dist-packages/pip/_vendor/rich/live.py", line 124, in start
    self._refresh_thread.start()
  File "/usr/lib/python3.10/threading.py", line 928, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

Both errors seem to be related to starting new threads.

Has anyone been faced with similar problems?

Thanks in advance for your help.

Nicolas

Without code illustrating how you get this error, it is hard to give you any guidance.

I cannot reproduce the other error you get:

dokken@laptop:~/Documents/src/debug$ docker run -ti -v $(pwd):/root/shared -w /root/shared dolfinx/dolfinx:stable
Unable to find image 'dolfinx/dolfinx:stable' locally
stable: Pulling from dolfinx/dolfinx
2b55860d4c66: Already exists 
4f4fb700ef54: Pull complete 
41286f91133a: Pull complete 
6a0b7cee828e: Pull complete 
549daafd9406: Pull complete 
7d1f15af39a1: Pull complete 
b7636ccd5b3f: Pull complete 
855ae7b5c117: Pull complete 
2c0f77f7b568: Pull complete 
0ba373695698: Pull complete 
e81a4600a3ad: Pull complete 
1abdc4b76ddc: Pull complete 
cb45eff63fd8: Pull complete 
7ccab1006c18: Pull complete 
Digest: sha256:c1904b2890340d62866b71df84a6303578ae38c7b78fb2a14b51818bba4b0b1b
Status: Downloaded newer image for dolfinx/dolfinx:stable
root@97afab0de258:~# python3 -m pip install torch
Collecting torch
  Downloading torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl (887.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 887.5/887.5 MB 4.8 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch) (4.3.0)
Collecting nvidia-cublas-cu11==11.10.3.66
  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 7.5 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu11==11.7.99
  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 849.3/849.3 KB 10.8 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
  Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.0/21.0 MB 16.5 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu11==8.5.0.96
  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 557.1/557.1 MB 7.7 MB/s eta 0:00:00
Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch) (0.37.1)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from nvidia-cublas-cu11==11.10.3.66->torch) (59.6.0)
Installing collected packages: nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cublas-cu11, nvidia-cudnn-cu11, torch
Successfully installed nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 torch-1.13.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

I’m using Docker version 20.10.22, build 3a2c30b

What version of Docker are you using? It could be related to the issue described in "ubuntu:21.10 and fedora:35 do not work on the latest Docker (20.10.9)" by Akihiro Suda (nttlabs, Medium).
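To check whether the container can start threads at all, independently of pip or dolfinx, something like the following minimal sketch could be run inside the container. The function name `can_start_thread` is just an illustrative choice; it relies only on the standard `threading` module and catches the same `RuntimeError` that appears at the bottom of the pip traceback.

```python
import threading


def can_start_thread() -> bool:
    """Return True if this process is able to start a new thread."""
    started = threading.Event()
    worker = threading.Thread(target=started.set)
    try:
        # On an affected setup this is where
        # "RuntimeError: can't start new thread" would be raised.
        worker.start()
    except RuntimeError:
        return False
    worker.join()
    return started.is_set()


if __name__ == "__main__":
    print("threads OK" if can_start_thread() else "cannot start new threads")
```

If this already fails in a fresh container, the problem is probably in the Docker/host setup rather than in pip or dolfinx themselves.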

Thanks for the quick response.

For the first error, it occurs as soon as I import dolfinx.

import dolfinx 

print("dolfinx imported")

For the second problem, I can give you the following information:

docker version

Client:
 Version:           19.03.6
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        369ce74a3c
 Built:             Fri Dec 18 12:21:44 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.6
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       369ce74a3c
  Built:            Thu Dec 10 13:23:49 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.3-0ubuntu1~18.04.4
  GitCommit:        
 runc:
  Version:          spec: 1.0.1-dev
  GitCommit:        
 docker-init:
  Version:          0.18.0
  GitCommit:    

And my host machine uses:

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

One last, possibly useful, piece of information is the output of docker inspect for the running dolfinx/dolfinx:stable container:

[
    {
        "Id": "04eac6a484bc1b66ca79365a4e305b4e7611338c9ff3d6b103481e0bc4c1a452",
        "Created": "2023-01-10T14:27:35.10958232Z",
        "Path": "bash",
        "Args": [],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 59213,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2023-01-10T16:17:57.638675211Z",
            "FinishedAt": "2023-01-10T14:30:10.843442785Z"
        },
        "Image": "sha256:d4f62c13efae539d99faf886e8479689c01f9ced9a52ac68107247897c2ddeb5",
        "ResolvConfPath": "/var/lib/docker/containers/04eac6a484bc1b66ca79365a4e305b4e7611338c9ff3d6b103481e0bc4c1a452/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/04eac6a484bc1b66ca79365a4e305b4e7611338c9ff3d6b103481e0bc4c1a452/hostname",
        "HostsPath": "/var/lib/docker/containers/04eac6a484bc1b66ca79365a4e305b4e7611338c9ff3d6b103481e0bc4c1a452/hosts",
        "LogPath": "/var/lib/docker/containers/04eac6a484bc1b66ca79365a4e305b4e7611338c9ff3d6b103481e0bc4c1a452/04eac6a484bc1b66ca79365a4e305b4e7611338c9ff3d6b103481e0bc4c1a452-json.log",
        "Name": "/happy_goldberg",
        "RestartCount": 0,
        "Driver": "aufs",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "docker-default",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/home/lepage/lepage_docs:/root/shared"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Capabilities": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": [],
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": false,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": null,
            "Name": "aufs"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/home/lepage/lepage_docs",
                "Destination": "/root/shared",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "04eac6a484bc",
            "Domainname": "",
            "User": "",
            "AttachStdin": true,
            "AttachStdout": true,
            "AttachStderr": true,
            "Tty": true,
            "OpenStdin": true,
            "StdinOnce": true,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "OPENBLAS_NUM_THREADS=1",
                "OPENBLAS_VERBOSE=0",
                "PYTHONPATH=/usr/local/dolfinx-real/lib/python3.10/dist-packages:/usr/local/lib:",
                "PETSC_DIR=/usr/local/petsc",
                "SLEPC_DIR=/usr/local/slepc",
                "PKG_CONFIG_PATH=/usr/local/dolfinx-real/lib/pkgconfig:",
                "PETSC_ARCH=linux-gnu-real-32",
                "LD_LIBRARY_PATH=/usr/local/dolfinx-real/lib:"
            ],
            "Cmd": [
                "bash"
            ],
            "Image": "dolfinx/dolfinx:stable",
            "Volumes": null,
            "WorkingDir": "/root/shared",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "description": "DOLFINx in 32-bit real and complex modes",
                "maintainer": "fenics-project <fenics-support@googlegroups.org>"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "d1f542350c1e31a51d4c3cbaed58c5ec51b4f0cac2df8aafd4b2fdd57b53474d",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/d1f542350c1e",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "90b86a9bbcd460504d6bb2a355d6b6f81bd5e2bf4e23a51cb9861a6520e167f3",
            "Gateway": "172.17.0.1",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "172.17.0.2",
            "IPPrefixLen": 16,
            "IPv6Gateway": "",
            "MacAddress": "02:42:ac:11:00:02",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "b1559ef9c865cb4f8b690bd0e3bc63d23b43d4c6eeb97e5c8eb578a1e18b2d48",
                    "EndpointID": "90b86a9bbcd460504d6bb2a355d6b6f81bd5e2bf4e23a51cb9861a6520e167f3",
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.2",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:02",
                    "DriverOpts": null
                }
            }
        }
    }
]

As I said, it is most likely due to the version of Docker (19.03.6, which was released in February 2020).
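As a rough illustration, the version reported by `docker version` can be compared against a threshold with a few lines of Python. The threshold 20.10.10 below is an assumption on my part (the release that, as far as I know, addressed the problem described in the linked article), so please verify it against the Docker release notes. The helper names are hypothetical, and the parser only handles plain dotted numeric versions such as "19.03.6".

```python
from typing import Tuple

# Assumption: first Docker release believed to contain the fix.
ASSUMED_FIXED_IN = (20, 10, 10)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string like '19.03.6' into (19, 3, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older_than_fix(version: str) -> bool:
    """Return True if the version sorts before the assumed fixed release."""
    return parse_version(version) < ASSUMED_FIXED_IN


if __name__ == "__main__":
    for version in ("19.03.6", "20.10.22"):
        status = "older than" if is_older_than_fix(version) else "at least"
        print(f"Docker {version} is {status} the assumed fixed release")
```

Comparing tuples of integers avoids the pitfalls of comparing the strings directly, where "19.03.6" versus "20.10.10" would be compared character by character.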

Thanks a lot :) I will try to change this.