Problems about parallel computing

I am trying to solve an optimization problem with Ray. I wrapped the Flow past cylinder tutorial in a function whose input is the coordinates of the cylinder, and I run the same code with different inputs on 35 CPUs. But it didn’t save time as I expected. Here is the tqdm output:

Solving PDE:   0%|          | 8/11200 [02:15<59:00:36, 18.98s/it]
Solving PDE:   0%|          | 4/11200 [01:21<65:39:26, 21.11s/it]
Solving PDE:   0%|          | 4/11200 [01:03<55:12:52, 17.75s/it]
Solving PDE:   0%|          | 12/11200 [02:38<64:19:44, 20.70s/it]
Solving PDE:   0%|          | 7/11200 [01:56<60:13:30, 19.37s/it]
Solving PDE:   0%|          | 10/11200 [02:26<46:57:27, 15.11s/it]
Solving PDE:   0%|          | 10/11200 [02:23<49:09:12, 15.81s/it]
Solving PDE:   0%|          | 7/11200 [01:56<63:11:54, 20.33s/it]

For comparison, here is the tqdm output from running a single script:

Solving PDE:   2%|▏         | 260/11200 [00:38<27:06,  6.73it/s]
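To put the two outputs on the same scale, here is a small arithmetic sketch. The numbers are copied from the tqdm lines above; the variable names are only illustrative, and the aggregate estimate assumes that the eight lines shown are representative of all 35 workers, which the output does not confirm.

```python
from statistics import mean

# Seconds per iteration, copied from the eight tqdm lines of the parallel run.
parallel_s_per_it = [18.98, 21.11, 17.75, 20.70, 19.37, 15.11, 15.81, 20.33]
# Iterations per second, copied from the single-script run.
single_it_per_s = 6.73
# Number of CPUs/workers stated in the post.
n_workers = 35

mean_s_per_it = mean(parallel_s_per_it)

# How many times longer one iteration takes in a worker than in the single script.
slowdown = mean_s_per_it * single_it_per_s

# Combined rate if all 35 workers ran at the mean rate.
# Assumption: only 8 of the 35 progress bars are shown, so this is an estimate.
aggregate_it_per_s = n_workers / mean_s_per_it

print(f"mean parallel time per iteration: {mean_s_per_it:.2f} s")
print(f"per-iteration slowdown vs single script: {slowdown:.0f}x")
print(f"estimated aggregate rate of {n_workers} workers: {aggregate_it_per_s:.2f} it/s")
```

Under that assumption, the 35 workers together would complete fewer iterations per second than the single script does on its own.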

Could you tell me why it’s slow, or what I should do to speed it up?

@jiangzhangze

Even if you distribute the problems, if each problem itself is not parallelized with MPI, I do not think you can expect a speedup, as you still use a single CPU for a single problem.
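As a rough illustration of this point, here is a minimal sketch of the best case for independent serial problems. The function `ideal_times` is hypothetical, not part of Ray or the tutorial, and the model assumes the workers do not slow each other down at all, so it is an upper bound on what distribution could give rather than a prediction.

```python
import math

def ideal_times(n_cases, n_workers, t_case):
    """Return (serial_batch, parallel_batch, one_case) wall times in seconds.

    Hypothetical model: every case is serial, takes t_case seconds on one CPU,
    and workers do not compete for memory, cache, or threads.
    """
    serial_batch = n_cases * t_case
    parallel_batch = math.ceil(n_cases / n_workers) * t_case
    one_case = t_case  # a serial case cannot make use of the other CPUs
    return serial_batch, parallel_batch, one_case

# 11200 iterations at 6.73 it/s, taken from the single-script tqdm line.
t_case = 11200 / 6.73

serial_batch, parallel_batch, one_case = ideal_times(35, 35, t_case)
print(f"one case:               {one_case / 60:.1f} min")
print(f"35 cases, one by one:   {serial_batch / 60:.1f} min")
print(f"35 cases on 35 workers: {parallel_batch / 60:.1f} min (best case)")
```

In this model the whole batch finishes sooner, but any single case takes exactly as long as before, whatever the number of workers.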
