Inconsistent Mesh Conversion Results on Laptop vs HPC Using Legacy DOLFIN

Hi everyone,

I’m encountering an inconsistency when running the same legacy DOLFIN (FEniCS) code on my laptop and on an HPC cluster. The code reads a Gmsh-generated mesh and then proceeds with a finite element computation. However, the converted mesh and final matrix sizes differ across platforms.

On my laptop:

    Expecting 2924 vertices
    Found all vertices
    Expecting 17339 cells
    Found all cells
    Conversion done

On the HPC:

    Expecting 2938 vertices
    Found all vertices
    Expecting 17368 cells
    Found all cells
    Conversion done

The discrepancy arises during the conversion step from Gmsh .msh to DOLFIN .xml format even though the same .msh file is used.
What could cause dolfin-convert to produce different results when converting the same Gmsh .msh file on the different operating systems?

Thanks in advance!
@dokken

Dear @Noah_William,

Please don’t tag me in general questions.

Secondly, you have not provided your GMSH script, msh-file or anything that could potentially reproduce the issue.

You have not provided how dolfin was installed on the different systems, or what versions are being used.

Finally, please note that dolfin-convert has been deprecated for several years. Please use meshio for converting the mesh to a compatible (preferably xdmf) format.

1 Like

Hi @dokken ,

Thanks for responding to my question, and sorry for tagging you in a general question.

GMSH script:

tmpDir=sys.argv[1]
vals=list(map(eval,sys.argv[2:]))
rmax,S,na,nref=vals[:4]
v=array(vals[4:])
v.shape=na,3
xi=v[:,0]
yi=v[:,1]
zi=v[:,2]
#####
geoFile="%s/tmp.geo" % (tmpDir,)
gf=open(geoFile,"w")
msString="""Function Mysphere;

  p1 = newp; Point(p1) = {x,  y,  z} ;
  p2 = newp; Point(p2) = {x+r,y,  z} ;
  p3 = newp; Point(p3) = {x,  y+r,z} ;
  p4 = newp; Point(p4) = {x,  y,  z+r} ;
  p5 = newp; Point(p5) = {x-r,y,  z  } ;
  p6 = newp; Point(p6) = {x,  y-r,z  } ;
  p7 = newp; Point(p7) = {x,  y,  z-r} ;

  c1 = newreg; Circle(c1) = {p2,p1,p7};
  c2 = newreg; Circle(c2) = {p7,p1,p5};
  c3 = newreg; Circle(c3) = {p5,p1,p4};
  c4 = newreg; Circle(c4) = {p4,p1,p2};
  c5 = newreg; Circle(c5) = {p2,p1,p3};
  c6 = newreg; Circle(c6) = {p3,p1,p5};
  c7 = newreg; Circle(c7) = {p5,p1,p6};
  c8 = newreg; Circle(c8) = {p6,p1,p2};
  c9 = newreg; Circle(c9) = {p7,p1,p3};
  c10 = newreg; Circle(c10) = {p3,p1,p4};
  c11 = newreg; Circle(c11) = {p4,p1,p6};
  c12 = newreg; Circle(c12) = {p6,p1,p7};

  l1 = newreg; Line Loop(l1) = {c5,c10,c4};   Ruled Surface(newreg) = {l1};
  l2 = newreg; Line Loop(l2) = {c9,-c5,c1};   Ruled Surface(newreg) = {l2};
  l3 = newreg; Line Loop(l3) = {c12,-c8,-c1}; Ruled Surface(newreg) = {l3};
  l4 = newreg; Line Loop(l4) = {c8,-c4,c11};  Ruled Surface(newreg) = {l4};
  l5 = newreg; Line Loop(l5) = {-c10,c6,c3};  Ruled Surface(newreg) = {l5};
  l6 = newreg; Line Loop(l6) = {-c11,-c3,c7}; Ruled Surface(newreg) = {l6};
  l7 = newreg; Line Loop(l7) = {-c2,-c7,-c12};Ruled Surface(newreg) = {l7};
  l8 = newreg; Line Loop(l8) = {-c6,-c9,c2};  Ruled Surface(newreg) = {l8};
  theloops[t] = newreg;
  Surface Loop(theloops[t]) = {l8+1,l5+1,l1+1,l2+1,l3+1,l7+1,l6+1,l4+1};
  
Return
Field[1]=ExternalProcess;
Field[1].CommandLine = "../../size.py"; 
Background Field = 1;"""
print ( msString,file=gf)
print ( "x=%f ; y=%f ; z=%f; r=%f ; t= %d ;" % (0,0,0,rmax,0),file=gf)
print (  "Call Mysphere;",file=gf)
print ( """
Volume (186)={theloops[]};
Physical Volume(999)= 186;""",file=gf)
print ("Mesh.MshFileVersion = 2;",file=gf)
for i,ri in enumerate(zip(xi,yi,zi)):
    x,y,z=ri
    print ("p11=newp ;Point(p11)={%f,%f,%f};" % (x,y,z),file=gf)
    print ("Point{p11} In Volume{186};",file=gf)
gf.close()
# replace invalid double minus "--" by +
gf=open(geoFile,"r")
tmp=gf.read()
newfile=tmp.replace("--","+")
gf.close()
gf=open(geoFile,"w")
gf.write(newfile)
gf.close()
os.system("gmsh -3 %s/tmp.geo" % (tmpDir,))
for i in range(nref):
    os.system("gmsh -format msh2 -refine %s/tmp.msh" % (tmpDir,))

size.py

while(True):
    xyz = struct.unpack("ddd", os.read(0,24))
    x,y,z=xyz
    if math.isnan(x):
        break
    f = sizefield(x,y,z)
    os.write(1,struct.pack("d",f))

sizefield.py


hm=0.1
S=load("S.npy")[0]
ni=loadtxt("nucCoords.dat")
def sizefield(x,y,z):
    p=array([x,y,z])
    di=[linalg.norm(p-v) for v in ni]
    dm=min(di)
    size=S*sqrt(dm**2+hm**2) 
    return size

Please know that I have tried it using both FEnics and FEnicsx, and I am getting the same results.
Input in the main code:

def convert_msh_to_xdmf(msh_file, xdmf_file):
    mesh = meshio.read(msh_file)
    print(f"Read mesh with {len(mesh.points)} vertices")
    print(f"Found {sum(len(cells.data) for cells in mesh.cells)} cells")

    meshio.write(xdmf_file, mesh)
    print(f"Converted {msh_file} to {xdmf_file}")

convert_msh_to_xdmf("./tmp.msh", "mesh.xdmf")

with io.XDMFFile(comm, "mesh.xdmf", "r") as xdmf:
    domain = xdmf.read_mesh(name="Grid")

Please assist!

Your GMSH script is not complete, as you haven’t provided:

  • The relevant imports
  • The input parameters passed through sys argv.

Do you get different results running on the same number of processors on your local system and the HPC system (i.e. serial/one process)?

Inputs:

['../../Gmsh_Sript.py', '.', '12.000000', '0.400000', '5', '0', '0', '0', '0', '-1.18599211577703', '-1.18599211577703', '-1.18599211577703', '-1.18599211577703', '1.18599211577703', '1.18599211577703', '1.18599211577703', '-1.18599211577703', '1.18599211577703', '1.18599211577703', '1.18599211577703', '-1.18599211577703']
  1. sys.argv[0] = '../../Gmsh_Sript.py' - The path to the script
  2. sys.argv[1] = '.' - The temporary directory (tmpDir) is the current directory
  3. sys.argv[2] = '12' - The radius of the sphere (rmax)
  4. sys.argv[3] = '0.4' - The parameter (S)
  5. sys.argv[4] = '5' - The number of points (na)
  6. sys.argv[5] = '0' - The number of refinement steps (nref)

The remaining arguments are the coordinates of the 5 points:


1. Point 1: (0, 0, 0)
2. Point 2: (-1.18599211577703, -1.18599211577703, -1.18599211577703)
3. Point 3: (-1.18599211577703, 1.18599211577703, 1.18599211577703)
4. Point 4: (1.18599211577703, -1.18599211577703, 1.18599211577703)
5. Point 5: (1.18599211577703, 1.18599211577703, -1.18599211577703)

When using both FEnics and FEnicsx on the local system, the two produce the same results

    Expecting 2924 vertices
    Found all vertices
    Expecting 17339 cells
    Found all cells
    Conversion done

On the HPC, the results for both FEnics and FEnicsx are the same but different from those on the laptop:

    Expecting 2938 vertices
    Found all vertices
    Expecting 17368 cells
    Found all cells
    Conversion done

The latter is surprising, as this would usually mean that the msh files differs on the two systems.
Have you tried using the msh file generated on one system on the other, say the msh file generated on your laptop on the HPC system, or vice versa?

Or are you generating the mesh on each system separately?

I introduced that when I started using FEniCSx and it was giving the same results, and it is giving the same results in both FEniCS & FEniCSx.

Initially, in FEnics, I was using this form below, and the results are the same even after the change.

meshCommandLine= ("%s %f %f %d %d") % (".",xmax,S,NA,nref) + ((3*NA)*("%20.15g ")) % tuple(list(tmpr))
os.system ( " ../../Gmsh_Script.py " + meshCommandLine +  "> gmsh_out" )
os.system ( "dolfin-convert" + " %s/tmp.msh" % (".",) +" mesh.xml")
mesh=Mesh("mesh.xml")

But could you share the msh file generated on each system? As I don’t have your system. I would need the two files, and see if they behave the same or differently to see if the issue is different versions of gmsh.

HPC.msh

$MeshFormat
2.2 0 8
$EndMeshFormat
$Nodes
2946
1 12 0 0
2 0 12 0
3 0 0 12
4 -12 0 0
5 0 -12 0
6 0 0 -12
7 0 0 0
8 -1.185992 -1.185992 -1.185992
9 -1.185992 1.185992 1.185992
10 1.185992 -1.185992 1.185992
11 1.185992 1.185992 -1.185992
12 11.41267435759174 0 -3.708215744474424
13 9.708189331880998 0 -7.053443123493089
14 7.053443088175355 0 -9.708189357540949
15 3.708215723791519 0 -11.41267436431204
16 -3.708215744474424 0 -11.41267435759174
.
.
.
17337 4 2 999 186 2149 2860 2114 286
17338 4 2 999 186 2114 2860 2149 1273
17339 4 2 999 186 1728 2422 675 925
17340 4 2 999 186 675 2422 1728 1304
17341 4 2 999 186 2312 2484 1563 570
17342 4 2 999 186 1563 2484 2312 868
.
.
.
17480 4 2 999 186 938 1497 1280 2327
17481 4 2 999 186 938 1280 1497 1573
17482 4 2 999 186 1942 2719 1972 650
17483 4 2 999 186 1972 2719 1942 642
17484 4 2 999 186 2541 1120 497 1259
17485 4 2 999 186 2541 497 1120 1889
$EndElements

Laptop.msh

$MeshFormat
2.2 0 8
$EndMeshFormat
$Nodes
2924
1 12 0 0
2 0 12 0
3 0 0 12
4 -12 0 0
5 0 -12 0
6 0 0 -12
7 0 0 0
8 -1.185992 -1.185992 -1.185992
9 -1.185992 1.185992 1.185992
10 1.185992 -1.185992 1.185992
11 1.185992 1.185992 -1.185992
12 11.41267435759174 0 -3.708215744474424
13 9.708189331880998 0 -7.053443123493089
14 7.053443088175355 0 -9.708189357540949
15 3.708215723791519 0 -11.41267436431204
16 -3.708215744474424 0 -11.41267435759174
.
.
.
17334 4 2 999 186 2700 2400 2699 1365
17335 4 2 999 186 2699 2400 2700 850
17336 4 2 999 186 1655 2697 2025 391
17337 4 2 999 186 2025 2697 1655 700
17338 4 2 999 186 2053 1760 1251 2385
17339 4 2 999 186 2053 1251 1760 666
$EndElements

gmsh --version (Both systems)

4.13.1

If you compare the end of the two files

you can observe that they are not identical. I would suggest you try to call your conversion script on the HPC system with the file Laptop.msh and compare the output of those two. Clearly something is not consistent in the mesh generation stage here