Dear all,
I don’t know if you have encountered the problem of not being able to import grids when using dolfinx for large-scale operations.I tested the 1024 process and 2048 process, and found that the 1024 process can import easily, but there will be no output in the 2048 process. Through my multiple output debugging attempts, I am confident that the code did not run smoothly when importing the grid.
Secondly, try first just reading the topology and geometry data, not actually creating the mesh (which calls a mesh partitioner), as illustrated in the referenced code above.
That should not be an issue. DOLFINx has been run with over 100 000 procs.
srun: ROUTE: split_hostlist: hl=a3105n04,b2101r5n[1-8],b3201r3n[6-8],b3201r4n1,b3205r3n7,b3208r6n[3-4],b3302r3n[7-8],b3303r3n[3-4],b3303r4n[1-4],b3309r4n8,b3309r5n[1-2],b3309r7n[2-3],e2306r2n[1-3] tree_width 0
[2025-09-13 18:11:36.581] [info] Opened HDF5 file with id "72057594037927936"
[2025-09-13 18:11:36.581] [info] Opened HDF5 file with id "72057594037927936"
......
[2025-09-13 18:11:49.867] [info] HDF5 Read data rate: 43.15997436780609 MB/s
[2025-09-13 18:11:49.868] [info] HDF5 Read data rate: 44.45196286907281 MB/s
[2025-09-13 18:11:49.868] [info] HDF5 Read data rate: 43.764222893425725 MB/s
[2025-09-13 18:11:49.870] [info] HDF5 Read data rate: 44.18377704887125 MB/s
[2025-09-13 18:11:49.870] [info] HDF5 Read data rate: 44.25279702747072 MB/s
[2025-09-13 18:11:49.870] [info] HDF5 Read data rate: 44.23919898165143 MB/s
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** JOB 22222733 ON a3105n04 CANCELLED AT 2025-09-13T19:11:13 DUE TO TIME LIMIT ***
slurmstepd: error: *** STEP 22222733.0 ON a3105n04 CANCELLED AT 2025-09-13T19:11:13 DUE TO TIME LIMIT ***
why are you trying to use more than 2000 procs to read this? There will likely be more overhead in parallel communication than the reduced size per process will give you in speedup.
I do unfortunately not have a system with 2000+ procs available, so I won’t be able to reproduce the issue.
How did you install dolfinx on your cluster? Is there a possibility to run in debug mode (compile c++ part of DOLFINx so that assertions are not skipped).
Although the grid is not large, with only 3 * 10 ^ 5 cells, there are also 2 * 10 ^ 6 dofs defined for the second-order scalar field above. I need to solve the mixed finite element equations of (u, p, B, T) four physical fields in this grid The total number of degrees of freedom is 20 million .I want to conduct strong scalability testing.
I installed dolfinx through conda on the cluster.version is 0.9
2000 processes for 20 million cells seems overboard, as you are down to 10 000 cells per process.
Do you have access to an HPC system with 2000 processes that uses conda?
I would strongly suggest using Spack on HPC systems, as conda is pre-built binaries not optmized for that particular system.
Dear dokken,
Thank you for your suggestion.
I only want to conduct strong scalability testing to analyze the performance of the solver, and I won’t actually use 2000+processes to run the code.
Is this problem probably not caused by Conda?
Second to what @dokken states, using Conda for an HPC build is typically a really bad idea. To get performance from HPC clusters you really have to tailor your build to the system. This is not trivial.
Have you tried solving your problem on a simple workstation? 20M DoFs is really not enough to warrant use of 2k processes.