Can't run Navier-Stokes solution in parallel

Thanks @volkerk.
I implemented the modification you suggested, but it did not fix the problem. I kept it anyway, since yours is the right way to perform that operation, so thanks for that.

After much testing, it turns out that the problem arises before that part of the code is ever reached: it happens before the loop even starts, in the following lines:

# Calculate dt as per CFL (with CFLmax = 0.5)
deltaX = mesh.hmin()  # minimum mesh size
dt = 0.5*deltaX**2/mu
dt = 1E-2*dt  # just for testing
# number of time steps to reach the target final time
num_steps = int(T/dt) + 1

The problem is that mesh.hmin() may return a different value in each process, depending on how the mesh is partitioned among them: each process reports the minimum cell size of its own portion of the mesh. I verified that the values differ by adding a simple print statement after that calculation.
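To see why this matters downstream, here is a plain-Python illustration (not FEniCS code; the per-rank hmin values are hypothetical) of how slightly different local hmin values can push int(T/dt) + 1 to different integers on different ranks:

```python
# Hypothetical per-rank values of mesh.hmin() after partitioning.
hmins = {0: 0.100, 1: 0.095}
mu = 1.0
T = 0.08

# Reproduce the dt / num_steps computation from the snippet above,
# independently on each "rank".
num_steps = {}
for rank, h in hmins.items():
    dt = 0.5 * h**2 / mu
    dt = 1E-2 * dt                 # same testing scale factor as above
    num_steps[rank] = int(T / dt) + 1

# The two ranks disagree on how many steps the time loop will run.
print(num_steps)
```

Because each rank derives dt from its own local hmin, the truncation in int(T/dt) can land on different integers, which is exactly the num_steps mismatch described below.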

As a result, the value of num_steps can differ between processes, depending on the target final time T and on how it and the local hmin fall with respect to the truncating operation int(T/dt) + 1. I found myself in a situation where 3 of the 4 processes had num_steps equal to 8, while one process had 9. I believe this caused a hang/deadlock in interprocess communication: the process running the extra step (process 2 in my case) was waiting for a send/receive that the other processes, having already exited the loop, would never match.

At least that is my interpretation of what was happening. If anyone else has a different or better explanation, I would like to learn it. Otherwise, this should be marked as solved.

Thanks to anyone who read these posts and gave it a thought. Apologies for wasting your time on what should have been an easy-to-spot bug in the code.
