Different result in serial and parallel run

I am currently solving a fluid mechanics problem. I store the velocities as individual vectors using XDMFFile's write_checkpoint, and then read them back with read_checkpoint in a separate post-processing script to calculate the gradients. I found that the order of the cell dofs and cells is quite different between serial and parallel execution, which results in a different gradient map. Should the cells be reordered in the post-processing script, or is it a different issue? I have attached some images to highlight the problem. Kindly suggest the necessary correction.

You need to supply a minimal code example, as this is implementation specific. If you have implemented things correctly, serial and parallel runs should yield the same result. The cells and degrees of freedom will naturally be in a different order, since the data is distributed over multiple processes, each of which only owns a subset of the cells (and dofs).
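
For illustration, here is a minimal sketch (legacy DOLFIN assumed; the mesh size and space are only illustrative) that prints what each process owns. Running it with mpirun -n 2 shows each rank holding a different subset of cells and a different dof range, which is why the local ordering differs from a serial run:

from dolfin import *

# Each process owns only part of the mesh, so the local cell count and the
# globally owned dof range differ from rank to rank.
mesh = UnitCubeMesh(4, 4, 4)
V = FunctionSpace(mesh, "CG", 1)
rank = MPI.rank(MPI.comm_world)
print("rank", rank,
      "local cells:", mesh.num_cells(),
      "owned dof range:", V.dofmap().ownership_range())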

Hi Dokken,

Here’s the minimal working code.

Problem code (named file1.py):

from __future__ import print_function
from dolfin import *

# Build the mesh and interpolate a linear expression into a P1 space
mesh = UnitCubeMesh(10, 10, 10)
U = FunctionSpace(mesh, "CG", 1)
F0 = Function(U)
E = Expression("4*x[0] + 5*x[1] + 6*x[2]", degree=1)
F0.interpolate(E)

# Write the mesh so the post-processing script can read the same mesh
meshpath = "mesh2.xdmf"
with XDMFFile(MPI.comm_world, meshpath) as mesh_file:
    mesh_file.write(mesh)

# Checkpoint the function so it can be read back into a Function later
filename1 = "test.xdmf"
hdf5_file = XDMFFile(MPI.comm_world, filename1)
hdf5_file.write_checkpoint(F0, "F", 0.0, XDMFFile.Encoding.HDF5)
hdf5_file.close()

Post-processing code (named file2.py):

from __future__ import print_function
from dolfin import *

# Read the mesh written by file1.py
meshpath = "mesh2.xdmf"
mesh = Mesh()
with XDMFFile(MPI.comm_world, meshpath) as mesh_file:
    mesh_file.read(mesh)

# Read the checkpointed function back into a matching function space
Q = FunctionSpace(mesh, "CG", 1)
F1 = Function(Q)
F2 = Function(Q)
mu = 0.5
hdf5_file = XDMFFile(MPI.comm_world, "test.xdmf")
hdf5_file.read_checkpoint(F1, "F", 0)
hdf5_file.close()

# Scale the field and write it out for visualisation
F2.vector()[:] = F1.vector()[:] * 1000 * mu
file = File("F2.pvd")
file << F2

In a FEniCS environment I executed the scripts with the following commands:

  1. Serial:

python file1.py

  2. Parallel:

mpirun -n 2 python file1.py

For the post-processing code it was simply:

python file2.py

The output of file2.py for both the serial and parallel runs of file1.py is shown below:

I also tried parameters["reorder_dofs_serial"] = False/True in combination with parameters["reorder_dofs_library"] = "random"/"Boost"/"SCOTCH" in file2.py, but I couldn't match the serial result. What am I missing?


Could be a “feature” of the DoF ordering when generating the mesh. DOLFIN’s mesh generators run on a single process and then distribute the mesh to all processes. This is inconsistent with HDF5 file reads/writes, which are parallel.

Generate your mesh and save it to file (via XDMF/HDF5) in a step before your “Problem code”. Then read that mesh in instead of generating it in your “Problem code”, and continue as normal. This should ensure the ordering of the DoFs is consistent with the parallel read_checkpoint.
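
As a rough sketch (legacy DOLFIN assumed; the script name make_mesh.py is hypothetical, and the file names simply match the example above), the separate mesh-generation step could look like this:

# make_mesh.py: generate the mesh once and write it to XDMF, so every later
# script works with the same file-based mesh
from dolfin import *

mesh = UnitCubeMesh(10, 10, 10)
with XDMFFile(MPI.comm_world, "mesh2.xdmf") as mesh_file:
    mesh_file.write(mesh)

Then, at the top of file1.py, read that mesh instead of calling UnitCubeMesh; the rest of the script (interpolation and write_checkpoint) stays the same:

# In file1.py, replace the UnitCubeMesh call with a read of the saved mesh
mesh = Mesh()
with XDMFFile(MPI.comm_world, "mesh2.xdmf") as mesh_file:
    mesh_file.read(mesh)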


Thanks Nate, it worked.

I used the mesh written by the serial run when post-processing the parallel results.
