Strange dijitso error during FFC compilation on MPI cluster - could be cache problem

I decided to take the plunge and build a small cluster using some spare old PCs in the office. Specs are:

  1. 3 x Dell Optiplex 9020, each with 8 Intel Core i7 cores, a 500GB HDD and 32GB RAM, all connected through a gigabit Ethernet switch.
  2. Installed Ubuntu 18.04 with all current updates. The master node has the GUI as well as the server packages; the slave nodes are set up as servers only, with no graphical interface.
  3. Set up sshd for passwordless login from the head node. Works.
  4. Set up nfsd on the master and NFS clients on the slave nodes with a shared “Cluster” directory. Works.
  5. Installed FEniCS 2019.1.0 on the master and on all slave nodes. OpenMPI was installed by default (mpirun version 2.2.1).
  6. Installed mshr and h5py with pip3

After configuring everything, I tested OpenMPI on all 24 cores (3 Optiplex 9020s) by compiling the following code with mpicc -o MPIHelloWorld MPIHelloWorld.c

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}

The result was encouraging. Using

mpirun --mca orte_base_help_aggregate 0 --mca btl tcp,self  -n 24 --host Oliver:8,node0:8,node1:8 MPIHelloWorld

I got:

Hello world from processor Oliver, rank 0 out of 24 processors
Hello world from processor Oliver, rank 1 out of 24 processors
Hello world from processor Oliver, rank 3 out of 24 processors
Hello world from processor Oliver, rank 4 out of 24 processors
Hello world from processor Oliver, rank 7 out of 24 processors
Hello world from processor Oliver, rank 2 out of 24 processors
Hello world from processor Oliver, rank 5 out of 24 processors
Hello world from processor Oliver, rank 6 out of 24 processors
Hello world from processor node1, rank 16 out of 24 processors
Hello world from processor node1, rank 19 out of 24 processors
Hello world from processor node1, rank 20 out of 24 processors
Hello world from processor node1, rank 21 out of 24 processors
Hello world from processor node1, rank 22 out of 24 processors
Hello world from processor node1, rank 17 out of 24 processors
Hello world from processor node0, rank 8 out of 24 processors
Hello world from processor node1, rank 18 out of 24 processors
Hello world from processor node0, rank 10 out of 24 processors
Hello world from processor node1, rank 23 out of 24 processors
Hello world from processor node0, rank 11 out of 24 processors
Hello world from processor node0, rank 12 out of 24 processors
Hello world from processor node0, rank 13 out of 24 processors
Hello world from processor node0, rank 14 out of 24 processors
Hello world from processor node0, rank 15 out of 24 processors
Hello world from processor node0, rank 9 out of 24 processors

Success! I then tried to run my Monopole model (from here: Transitioning from mesh.xml to mesh.xdmf, from dolfin-convert to meshio), but it died during the meshio operation. Apparently, meshio does not work with MPI. Once I separated the meshio code into another file and ran it as a single-processor serial code (no MPI), it worked.

The meshio code:

import meshio

# Read the Gmsh mesh and keep only the tetrahedral cells for the XDMF file.
msh = meshio.read("Monopole2.msh")
meshio.write("mesh.xdmf",
             meshio.Mesh(points=msh.points,
                         cells={"tetra": msh.cells_dict["tetra"]}))

The full mesh file is retrievable here: GMSH mesh file link. It is a relatively large file.
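Since the conversion has to run without MPI, a small guard at the top of the conversion script can catch an accidental mpirun launch. This is only a sketch: the helper names `mpi_world_size` and `ensure_serial` are made up for illustration, and it assumes Open MPI, which exports `OMPI_COMM_WORLD_SIZE` to the processes it launches.

```python
import os


def mpi_world_size(env=None):
    """Return the rank count Open MPI reports, or 1 when not started by mpirun."""
    env = os.environ if env is None else env
    # Assumption: Open MPI is the launcher; other MPI stacks use other variables.
    return int(env.get("OMPI_COMM_WORLD_SIZE", "1"))


def ensure_serial(env=None):
    """Raise if the script was started with more than one MPI rank."""
    size = mpi_world_size(env)
    if size > 1:
        raise RuntimeError(
            "mesh conversion must run in serial, but {} ranks were started".format(size)
        )


if __name__ == "__main__":
    # The meshio.read / meshio.write calls would follow a call to ensure_serial().
    print("MPI ranks detected:", mpi_world_size())
```

Calling `ensure_serial()` before `meshio.read(...)` makes the script stop with a clear message instead of dying partway through the conversion.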

I then ran the model on all 24 processor cores (3 PCs) using

mpirun --mca orte_base_help_aggregate 0 --mca btl tcp,self  -n 24 --host Oliver:8,node0:8,node1:8 python3 Monopole2.py

The model code is:

# Put in file named Monopole2.py
from dolfin import *
import numpy as np
import cmath as cm
import os, sys, traceback

h = 0.15
semicirc_r = 2.0
a = 0.8
b = 0.8
ts = 0.1  
lc = 1.0
rc = 0.11
cc = 0.0325
poffset = 0.0
qoffset = 0.0
pi = 3.1415926536
tol = 1.0e-12
eta = 377.0
k0 = 2.12 # lambda / 4
eps0 = 1.0  # WG air filled
eps_c = 2.2 # Teflon coax dielectric
eps_s = 1.00 # PCB substrate

class PEC(SubDomain):
    def inside(self, x, on_boundary):
        return on_boundary

class InputBC(SubDomain):
    def inside(self, x, on_boundary):
        return on_boundary and near(x[2], -h-lc, tol)

class OutputBC(SubDomain):
    def inside(self, x, on_boundary):
        r_b = sqrt(x[0] * x[0] + x[1] * x[1] + x[2] * x[2])
        r_c = sqrt(x[0] * x[0] + x[1] * x[1])
        return on_boundary and ((near(r_b, 2.0, 5e-2) and x[2] > 0.0) or (near(r_c, 2.0, 5e-2) and between(x[2], (-h, 0))))

class PMC(SubDomain):
    def inside(self, x, on_boundary):
        return on_boundary and (near(x[1], 0.0, tol) or near(x[0], 0.0, tol))

mesh = Mesh()
with XDMFFile("mesh.xdmf") as infile:
    infile.read(mesh)
mvc = MeshValueCollection("size_t", mesh, 3)

info(mesh)

# Mark boundaries
sub_domains = MeshFunction("size_t", mesh, mesh.topology().dim() - 1)
sub_domains.set_all(4)
pec = PEC()
pec.mark(sub_domains, 0)
in_port = InputBC()
in_port.mark(sub_domains, 1)
out_port = OutputBC()
out_port.mark(sub_domains, 2)
pmc = PMC()
pmc.mark(sub_domains, 3)
File("BoxSubDomains.pvd").write(sub_domains)

Dk = Expression('x[2] <= 0.0 + tol ? (x[2] <= -h? eps_c : eps_s) : eps0', degree = 0, tol = 1.0e-12, h = h, eps_s = eps_s, eps_c = eps_c, eps0 = eps0)  # Dielectric subdomains

# Set up function spaces
# For low order problem
cell = tetrahedron
ele_type = FiniteElement('N1curl', cell, 2) # H(curl) element for EM
V2 = FunctionSpace(mesh, MixedElement([ele_type, ele_type]))
V = FunctionSpace(mesh, ele_type)
u_r, u_i = TrialFunctions(V2)
v_r, v_i = TestFunctions(V2)

#surface integral definitions from boundaries
ds = Measure('ds', domain = mesh, subdomain_data = sub_domains)
# with source and sink terms
u0 = Constant((0.0, 0.0, 0.0)) #PEC definition
h_src = Expression(('-(x[1] - poffset) / (2.0 * pi * (pow((x[0]-qoffset), 2.0) + pow(x[1] - poffset,2.0)))', '(x[0]-qoffset) / (2.0 * pi *(pow((x[0]-qoffset),2.0) + pow(x[1] - poffset,2.0)))', 0.0), degree = 2, poffset = poffset, qoffset = qoffset)
e_src = Expression(('(x[0] - qoffset) / (2.0 * pi * (pow((x[0]-qoffset), 2.0) + pow(x[1] - poffset,2.0)))', '(x[1]-poffset) / (2.0 * pi *(pow((x[0]-qoffset),2.0) + pow(x[1] - poffset,2.0)))', 0.0), degree = 2, poffset = poffset, qoffset = qoffset)

#Boundary condition dictionary
boundary_conditions = {0: {'PEC' : u0},
                       1: {'InputBC': (h_src, eps_c)},
                       2: {'OutputBC': 1.0},
                       3: {'PMC': 0.0}}

n = FacetNormal(mesh)

#Build PEC boundary conditions for real and imaginary parts
bcs = []
for i in boundary_conditions:
    if 'PEC' in boundary_conditions[i]:
        bc = DirichletBC(V2.sub(0), boundary_conditions[i]['PEC'], sub_domains, i)
        bcs.append(bc)
        bc = DirichletBC(V2.sub(1), boundary_conditions[i]['PEC'], sub_domains, i)
        bcs.append(bc)

# Build input BC source term and loading term
integral_source = []
integrals_load =[]
for i in boundary_conditions:
    if 'InputBC' in boundary_conditions[i]:
        r, s = boundary_conditions[i]['InputBC']
        bb1 = 2.0 * (k0 * eta) * inner(v_i, cross(n, r)) * ds(i) #Factor of two from field equivalence principle
        integral_source.append(bb1)
        bb2 = inner(cross(n, v_i), cross(n, u_r)) * k0 * sqrt(Dk) * ds(i)
        integrals_load.append(bb2)
        bb2 = inner(-cross(n, v_r), cross(n, u_i)) * k0 * sqrt(Dk) * ds(i)
        integrals_load.append(bb2)

for i in boundary_conditions:
    if 'OutputBC' in boundary_conditions[i]:
        r = boundary_conditions[i]['OutputBC']
        bb2 = inner(cross(n, v_i), cross(n, u_r)) * k0 * ds(i)
        integrals_load.append(bb2)
        bb2 = inner(-cross(n, v_r), cross(n, u_i)) * k0 * ds(i)
        integrals_load.append(bb2)
# for PMC, do nothing. Natural BC.

a = (inner(curl(v_r), curl(u_r)) + inner(curl(v_i), curl(u_i)) - Dk * k0 * k0 * (inner(v_r, u_r) + inner(v_i, u_i))) * dx + sum(integrals_load)
L = sum(integral_source)

u1 = Function(V2)
vdim = u1.vector().size()
print ('Solution vector size = ',vdim)

solve(a == L, u1, bcs, solver_parameters = {'linear_solver' : 'mumps'}) 

u1_r, u1_i = u1.split(True)

fp = File("EField_r.pvd")
fp << u1_r
fp = File("EField_i.pvd")
fp << u1_i
fp = File('WaveFile.pvd')


H = interpolate(h_src, V) # Get input field
P =  assemble((-dot(u1_r,cross(curl(u1_i),n))+dot(u1_i,cross(curl(u1_r),n))) * ds(2))
P_refl = assemble((-dot(u1_i,cross(curl(u1_r), n)) + dot(u1_r, cross(curl(u1_i), n))) * ds(1))
P_inc = assemble((dot(H, H) * eta / (2.0 * sqrt(eps_c))) * ds(1))
print("Integrated power on rad boundary:", P/(2.0 * k0 * eta))
print("Incident power at port 1:", P_inc)
print("Integrated reflected power on port 1:", P_inc - P_refl / (2.0 * k0 * eta))
E = interpolate(e_src, V) # Incident E field
ccr = assemble(-dot(u1_r - E * (eta / sqrt(eps_c)), E * (eta / sqrt(eps_c))) * ds(1))
cci = assemble(dot(u1_i, E) * ds(1)) * eta / sqrt(eps_c)
cc = assemble(dot(E, E) * ds(1)) * eta * eta / eps_c
Zo = 50.0
rho = complex(ccr / cc, cci / cc)
print("Input port reflection coefficient: {0:<f}+j{1:<f}".format(rho.real, rho.imag))
Zin = Zo * (1.0 + rho) / (1.0 - rho)
print("Input port impedance: {0:<f} + j{1:<f}".format(Zin.real, Zin.imag))
Zl = Zo * (Zin - (1j) * Zo * tan(k0 * sqrt(eps_c) * lc)) / (Zo - (1j) * Zin * tan(k0 * sqrt(eps_c) * lc))
print("Antenna feedpoint impedance: {0:<f} + j{1:<f}".format(Zl.real, Zl.imag))

# Generate radiation pattern!
print("Generate radiation pattern.")
metadata = {"quadrature_degree": 6, "quadrature_scheme": "default"}
dsm = ds(metadata=metadata)
NumTheta = 25
NumPhi = 100
TwoPi = 6.2831853071
PiOverTwo = 1.5707963268
fp = open("Pattern1.txt", "w")
print("#Elevation    Azimuth     Pvert      Phoriz", file = fp)
# Reflection transformations
# PMC
JCx = Constant(((-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))) # PMC Reflection thru x = 0 plane
JCy = Constant(((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0))) # PMC Reflection thru y = 0 plane
MCx = Constant(((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0)))
MCy = Constant(((-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0)))
# PEC ground plane
JCz = Constant(((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0))) #PEC reflection thru z = 0 plane
MCz = Constant(((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0)))

# Surface currents on external boundary
M_r = -cross(n, u1_r)
M_i = -cross(n, u1_i)
J_r = -cross(n, curl(u1_i)) / (k0 * eta)
J_i = cross(n, curl(u1_r)) / (k0 * eta)

for m in range(NumTheta+1):
    theta = m * PiOverTwo / NumTheta
    print(" ", file = fp) # for Gnuplot
    for nn in range(NumPhi+1):
        L_r = [] #List objects for integrals
        L_i = []
        N_r = []
        N_i = []
# Do NFF transformation
        phi = nn * TwoPi / NumPhi
        rr = Expression(('sin(theta)*cos(phi)', 'sin(theta)*sin(phi)', 'cos(theta)'), degree = 3, phi = phi, theta = theta)
        rtheta = Expression(('cos(theta)*cos(phi)', 'cos(theta)*sin(phi)', '-sin(theta)'), degree = 3, phi = phi, theta = theta)
        rphi = Expression(('-sin(phi)', 'cos(phi)', '0.0'), degree = 3, phi = phi)
# Sum up all the image sources taking into account the proper symmetries
# First octant
        rp1 = Expression(('x[0]', 'x[1]', 'x[2]+0.15'), degree = 1)
        sr = J_r * cos(k0 * dot(rr, rp1)) - J_i * sin(k0 * dot(rr, rp1))
        si = J_i * cos(k0 * dot(rr, rp1)) + J_r * sin(k0 * dot(rr, rp1))
        N_r.append(sr)
        N_i.append(si)
        qr = M_r * cos(k0 * dot(rr, rp1)) - M_i * sin(k0 * dot(rr, rp1))
        qi = M_i * cos(k0 * dot(rr, rp1)) + M_r * sin(k0 * dot(rr, rp1))
        L_r.append(qr)
        L_i.append(qi)
# Second octant x < 0, y > 0, z > 0
        rp2 = Expression(('-x[0]', 'x[1]', 'x[2]+0.15'), degree = 1)
        sr = JCx * (J_r * cos(k0 * dot(rr, rp2)) - J_i * sin(k0 * dot(rr, rp2)))
        si = JCx * (J_i * cos(k0 * dot(rr, rp2)) + J_r * sin(k0 * dot(rr, rp2)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCx * (M_r * cos(k0 * dot(rr, rp2)) - M_i * sin(k0 * dot(rr, rp2)))
        qi = MCx * (M_i * cos(k0 * dot(rr, rp2)) + M_r * sin(k0 * dot(rr, rp2)))
        L_r.append(qr)
        L_i.append(qi)
# third octant x < 0, y < 0, z > 0
        rp3 = Expression(('-x[0]', '-x[1]', 'x[2]+0.15'), degree = 1)
        sr = JCy * JCx * (J_r * cos(k0 * dot(rr, rp3)) - J_i * sin(k0 * dot(rr, rp3)))
        si = JCy * JCx * (J_i * cos(k0 * dot(rr, rp3)) + J_r * sin(k0 * dot(rr, rp3)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCy * MCx * (M_r * cos(k0 * dot(rr, rp3)) - M_i * sin(k0 * dot(rr, rp3)))
        qi = MCy * MCx * (M_i * cos(k0 * dot(rr, rp3)) + M_r * sin(k0 * dot(rr, rp3)))
        L_r.append(qr)
        L_i.append(qi)
# fourth octant x > 0, y < 0, z > 0
        rp4 = Expression(('x[0]', '-x[1]', 'x[2]+0.15'), degree = 1)
        sr = JCy * (J_r * cos(k0 * dot(rr, rp4)) - J_i * sin(k0 * dot(rr, rp4)))
        si = JCy * (J_i * cos(k0 * dot(rr, rp4)) + J_r * sin(k0 * dot(rr, rp4)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCy * (M_r * cos(k0 * dot(rr, rp4)) - M_i * sin(k0 * dot(rr, rp4)))
        qi = MCy * (M_i * cos(k0 * dot(rr, rp4)) + M_r * sin(k0 * dot(rr, rp4)))
        L_r.append(qr)
        L_i.append(qi)
# Fifth octant x > 0, y > 0, z < 0
        rp5 = Expression(('x[0]', 'x[1]', '-x[2]-0.15'), degree = 1)
        sr = JCz * (J_r * cos(k0 * dot(rr, rp5)) - J_i * sin(k0 * dot(rr, rp5)))
        si = JCz * (J_i * cos(k0 * dot(rr, rp5)) + J_r * sin(k0 * dot(rr, rp5)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCz * (M_r * cos(k0 * dot(rr, rp5)) - M_i * sin(k0 * dot(rr, rp5)))
        qi = MCz * (M_i * cos(k0 * dot(rr, rp5)) + M_r * sin(k0 * dot(rr, rp5)))
        L_r.append(qr)
        L_i.append(qi)
# Sixth octant x < 0, y > 0, z < 0
        rp6 = Expression(('-x[0]', 'x[1]', '-x[2]-0.15'), degree = 1)
        sr = JCx * JCz * (J_r * cos(k0 * dot(rr, rp6)) - J_i * sin(k0 * dot(rr, rp6)))
        si = JCx * JCz * (J_i * cos(k0 * dot(rr, rp6)) + J_r * sin(k0 * dot(rr, rp6)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCx * MCz * (M_r * cos(k0 * dot(rr, rp6)) - M_i * sin(k0 * dot(rr, rp6)))
        qi = MCx * MCz * (M_i * cos(k0 * dot(rr, rp6)) + M_r * sin(k0 * dot(rr, rp6)))
        L_r.append(qr)
        L_i.append(qi)
# seventh octant x < 0, y < 0, z < 0
        rp7 = Expression(('-x[0]', '-x[1]', '-x[2]-0.15'), degree = 1)
        sr = JCy * JCx * JCz * (J_r * cos(k0 * dot(rr, rp7)) - J_i * sin(k0 * dot(rr, rp7)))
        si = JCy * JCx * JCz * (J_i * cos(k0 * dot(rr, rp7)) + J_r * sin(k0 * dot(rr, rp7)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCy * MCx * MCz * (M_r * cos(k0 * dot(rr, rp7)) - M_i * sin(k0 * dot(rr, rp7)))
        qi = MCy * MCx * MCz * (M_i * cos(k0 * dot(rr, rp7)) + M_r * sin(k0 * dot(rr, rp7)))
        L_r.append(qr)
        L_i.append(qi)
# Eighth octant x > 0, y < 0, z < 0
        rp8 = Expression(('x[0]', '-x[1]', '-x[2]-0.15'), degree = 1)
        sr = JCy * JCz * (J_r * cos(k0 * dot(rr, rp8)) - J_i * sin(k0 * dot(rr, rp8)))
        si = JCy * JCz * (J_i * cos(k0 * dot(rr, rp8)) + J_r * sin(k0 * dot(rr, rp8)))
        N_r.append(sr)
        N_i.append(si)
        qr = MCy * MCz * (M_r * cos(k0 * dot(rr, rp8)) - M_i * sin(k0 * dot(rr, rp8)))
        qi = MCy * MCz * (M_i * cos(k0 * dot(rr, rp8)) + M_r * sin(k0 * dot(rr, rp8)))
        L_r.append(qr)
        L_i.append(qi)
        
# Compute E_ff
        Et_i = -k0 * assemble((dot(sum(L_r), rphi) + eta * dot(sum(N_r), rtheta)) * dsm(2))
        Et_r = k0 * assemble((dot(sum(L_i), rphi) + eta * dot(sum(N_i), rtheta)) * dsm(2))
        Ep_i = k0 * assemble((dot(sum(L_r), rtheta) - eta * dot(sum(N_r), rphi)) * dsm(2))
        Ep_r = -k0 * assemble((dot(sum(L_i), rtheta) - eta * dot(sum(N_i), rphi)) * dsm(2))

# Compute magnitudes
        Gvert = (Et_r * Et_r + Et_i * Et_i) / (2.0 * TwoPi * eta * (P * 8.0 / (2.0 * k0 * eta)))
        Ghoriz = (Ep_r * Ep_r + Ep_i * Ep_i) / (2.0 * TwoPi * eta * (P * 8.0 / (2.0 * k0 * eta)))
        
        print(" {0:f} {1:f} {2:f} {3:f}".format(theta, phi, Gvert, Ghoriz))
        print(" {0:f} {1:f} {2:f} {3:f}".format(theta, phi, Gvert, Ghoriz), file = fp)

fp.close()

sys.exit(0)

The following error output was generated:


Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.9d845ebb78414c21b44c5948d4223c69
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.1ea3da20491f4939b7e18cad6a548d90
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.22ee256ff8d84f9aa8806ce78e297be8
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.77c3ee07f5564859adca2fb69122b41b
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.12d7164e175d4d95a06f62c363f12862
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.84da7d060acf4d219d141a2728b99794
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.cab8739f31634f0aaa73b163f0036565
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.366f9aee54d744f5963232d3cc26c57f
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.aacc4facc28344279848984e761fe3f4
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.398f4f5818ab49c7bb8dceeeffa8ab45
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.f83de5e4b52e4d82a6096407f4bb8d06
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.993f2714931f43c1afd7f60424251de5
dst: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt
backup: /home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old
Backup file exists, overwriting.
Traceback (most recent call last):
  File "/usr/lib/python3.6/shutil.py", line 550, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt' -> '/home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt.old.priv.311658593283635178770042367730589103107'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dolfin/jit/jit.py", line 167, in compile_class
    mpi_comm=mpi_comm)
  File "/usr/lib/python3/dist-packages/dolfin/jit/jit.py", line 76, in mpi_jit
    output = local_jit(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/dolfin/jit/jit.py", line 103, in dijitso_jit
    return dijitso.jit(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/dijitso/jit.py", line 178, in jit
    params)
  File "/usr/lib/python3/dist-packages/dijitso/build.py", line 194, in build_shared_library
    store_textfile(log_filename, log_contents)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 226, in store_textfile
    lockfree_move_file(tmp_filename, filename)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 248, in lockfree_move_file
    return _lockfree_move_file(src, dst, False)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 282, in _lockfree_move_file
    _lockfree_move_file(dst, backup, True)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 294, in _lockfree_move_file
    move_file(src, priv(ui))
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 234, in move_file
    shutil.move(srcfilename, dstfilename)
  File "/usr/lib/python3.6/shutil.py", line 564, in move
    copy_function(src, real_dst)
  File "/usr/lib/python3.6/shutil.py", line 263, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.6/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/home/bill/.cache/dijitso/log/dolfin_expression_21f84f64dc101fdf0d01518debdaa765.txt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Monopole2.py", line 209, in <module>
    rtheta = Expression(('cos(theta)*cos(phi)', 'cos(theta)*sin(phi)', '-sin(theta)'), degree = 3, phi = phi, theta = theta)
  File "/usr/lib/python3/dist-packages/dolfin/function/expression.py", line 400, in __init__
    self._cpp_object = jit.compile_expression(cpp_code, params)
  File "/usr/lib/python3/dist-packages/dolfin/function/jit.py", line 158, in compile_expression
    expression = compile_class(cpp_data, mpi_comm=mpi_comm)
  File "/usr/lib/python3/dist-packages/dolfin/jit/jit.py", line 170, in compile_class
    raise RuntimeError("Unable to compile C++ code with dijitso")
RuntimeError: Unable to compile C++ code with dijitso
^Cbill@Oliver:~/Cluster$ 

It mostly worked! During the final steps (the radiation pattern loop), the simulation died with an error that appears to be related to writing and/or retrieving files in the dijitso cache. What does this mean?

When I run the code on the head node only, using

mpirun --mca orte_base_help_aggregate 0 --mca btl tcp,self  -n 8 --host Oliver:8 python3 Monopole2.py

it works correctly, generating no errors.
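One plausible reading of the log above is that several ranks try to back up the same cache log file at nearly the same time: the first rename succeeds, and a later one finds the file already gone. The sketch below is not dijitso's code. It replays that one interleaving sequentially in a temporary directory, with a made-up file name and a hypothetical `backup_log` helper, to show where a `FileNotFoundError` like the one in the traceback can come from.

```python
import os
import tempfile


def backup_log(log_path):
    """Rename log_path to log_path + '.old', as a rank might do before writing a new log."""
    os.rename(log_path, log_path + ".old")


with tempfile.TemporaryDirectory() as cache_dir:
    # Hypothetical log file standing in for dolfin_expression_<hash>.txt.
    log_path = os.path.join(cache_dir, "dolfin_expression_example.txt")
    with open(log_path, "w") as handle:
        handle.write("compiler log\n")

    # Both ranks saw the log file exist and both decided to move it aside.
    outcomes = []
    for rank in (0, 1):
        try:
            backup_log(log_path)
            outcomes.append((rank, "moved"))
        except FileNotFoundError:
            outcomes.append((rank, "FileNotFoundError"))

print(outcomes)
```

In a real run the two renames come from different processes, so which rank loses depends on timing. That would fit with the error appearing only in some launches.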

I’ve run into similar problems before running FEniCS on clusters. My workaround was to run the code once in serial with a small mesh, to generate the cached files, then run the same code with the real mesh in parallel, which only needs to read from the cache (assuming all variational forms are specified in a way that does not require re-compilation for the larger parallel run).

In fact, once it has run correctly on a single node, it works on all of them.
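The warm-up workaround can be scripted so the serial run always precedes the parallel one. The sketch below only builds and prints the two command lines; the function name `warmup_then_parallel` is hypothetical, the mpirun flags are copied from the command used earlier in this post, and swapping in a small mesh for the serial pass is left out.

```python
import shlex

# Flags copied from the mpirun command used earlier in this post.
MPI_FLAGS = ["--mca", "orte_base_help_aggregate", "0", "--mca", "btl", "tcp,self"]


def warmup_then_parallel(script, hosts, slots_per_host=8):
    """Return [serial_command, parallel_command] for a cache warm-up followed by the real run."""
    serial = ["python3", script]
    host_spec = ",".join("{}:{}".format(host, slots_per_host) for host in hosts)
    total_ranks = len(hosts) * slots_per_host
    parallel = (["mpirun"] + MPI_FLAGS
                + ["-n", str(total_ranks), "--host", host_spec, "python3", script])
    return [serial, parallel]


commands = warmup_then_parallel("Monopole2.py", ["Oliver", "node0", "node1"])
for command in commands:
    # Printed only; pass each list to subprocess.run(command, check=True) to execute.
    print(" ".join(shlex.quote(part) for part in command))
```

Keeping the commands as lists avoids shell quoting problems if they are later handed to `subprocess.run`.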

This could get inconvenient when I vary the operating frequency (k0) to generate frequency response, as recompiling is sometimes needed.

Does anyone in the developer world have an idea why this behavior occurs? (I will add that I am deeply impressed with the functionality in the FEniCS code and appreciate how difficult it is to make everything work together smoothly!)

Thanks for the reply @kamensky. You are truly one of the ‘go to’ experts on all things FEniCS!

Note that if you wrap all numerical values as Constants, e.g.,

k0 = Constant(2.12)

then you can change their values without re-compiling.
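A toy model of why this helps, assuming the JIT cache is keyed by a hash of the generated source: a plain float gets pasted into that source, so every new value produces a new cache key, while a coefficient is passed in at run time and leaves the source unchanged. The C-like strings and function names below are invented for illustration and are not what FFC or dijitso actually generate.

```python
import hashlib


def signature(source):
    """Hash of the generated source text, standing in for a JIT cache key."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest()


def code_with_literal(k0):
    # The float value is pasted into the source text itself.
    return "double f(double u) {{ return {!r} * u; }}".format(k0)


def code_with_coefficient():
    # The value arrives through w[0] at run time; the source text is fixed.
    return "double f(double u, const double* w) { return w[0] * u; }"


sweep = (2.12, 2.50, 3.00)
literal_sigs = {signature(code_with_literal(k0)) for k0 in sweep}
coefficient_sigs = {signature(code_with_coefficient()) for _ in sweep}

# Number of distinct cache keys over the sweep for each approach.
print(len(literal_sigs), len(coefficient_sigs))
```

In the model script this would mean updating the wavenumber with `k0.assign(new_value)` inside the frequency loop. Plain-Python post-processing that mixes `k0` with complex numbers, such as the impedance formulas, would probably need `float(k0)`.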

Excellent! Thanks for that.