Over the last few months I have been trying to install FEniCS 2018.1.0 on my personal computer and on a cluster. After solving several problems during installation, I managed to install FEniCS on my computer and checked that it works properly with some of my codes. However, the same code does not work on the cluster, even though the installation there appears to have completed.
I noticed the problem when I tried to run my simulations on the cluster with more than one processor. During a simulation, the program writes temporary files under ".cache/". With a single processor, only that process reads and writes those files and everything works correctly. With more processors, each process tries to create its own temporary files but finds the ones already created by the others. It then backs up the existing files, removes them, and writes its own. This leads to one of two outcomes:
- In some simulations the processes happen to stay coordinated and the simulation appears to run. However, it is terribly slow, because it is copying and moving files all the time. I also do not trust the results, although I cannot check them because the runs take too long.
- In other simulations, while one process is copying those temporary files, another process tries to access one of them, does not find it, and the program prints the error below (what is shown here is an extract of the full output):
Moving new file over differing existing file:
src: /tmp/tmp983p99_d/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
dst: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
backup: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old
Moving new file over differing existing file:
src: /tmp/tmp3fmi3u_z/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
dst: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
backup: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /tmp/tmpd9tfdgrg/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
dst: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
backup: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /tmp/tmp_7qcgtkr/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
dst: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
backup: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /tmp/tmp3fl3r0x6/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
dst: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
backup: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old
Backup file exists, overwriting.
Moving new file over differing existing file:
src: /tmp/tmpj5jkkqik/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
dst: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz
backup: /home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old
Backup file exists, overwriting.
Traceback (most recent call last):
  File "/apps/cent7/anaconda/5.3.1-py37/lib/python3.7/shutil.py", line 557, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz' -> '/home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old.priv.265268745244341306884196712515058776583'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "Elasticidad2D-Fractura.py", line 165, in <module>
    V = FunctionSpace(mesh, U)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dolfin/function/functionspace.py", line 31, in __init__
    self._init_from_ufl(*args, **kwargs)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dolfin/function/functionspace.py", line 43, in _init_from_ufl
    mpi_comm=mesh.mpi_comm())
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dolfin/jit/jit.py", line 47, in mpi_jit
    return local_jit(*args, **kwargs)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dolfin/jit/jit.py", line 97, in ffc_jit
    return ffc.jit(ufl_form, parameters=p)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/ffc/jitcompiler.py", line 217, in jit
    module = jit_build(ufl_object, module_name, parameters)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/ffc/jitcompiler.py", line 133, in jit_build
    generate=jit_generate)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/jit.py", line 165, in jit
    header, source, dependencies = generate(jitable, name, signature, params["generator"])
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/ffc/jitcompiler.py", line 76, in jit_generate
    dep_module_name = jit(dep, parameters, indirect=True)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/ffc/jitcompiler.py", line 217, in jit
    module = jit_build(ufl_object, module_name, parameters)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/ffc/jitcompiler.py", line 133, in jit_build
    generate=jit_generate)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/jit.py", line 178, in jit
    params)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/build.py", line 181, in build_shared_library
    lockfree_move_file(temp_src_filename, src_filename)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/system.py", line 248, in lockfree_move_file
    return _lockfree_move_file(src, dst, False)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/system.py", line 275, in _lockfree_move_file
    _lockfree_move_file(dst, backup, True)
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/system.py", line 287, in _lockfree_move_file
    move_file(src, priv(ui))
  File "/home/acastele/.conda/envs/cent7/5.3.1-py37/FEniCS.env/lib/python3.7/site-packages/dijitso/system.py", line 234, in move_file
    shutil.move(srcfilename, dstfilename)
  File "/apps/cent7/anaconda/5.3.1-py37/lib/python3.7/shutil.py", line 571, in move
    copy_function(src, real_dst)
  File "/apps/cent7/anaconda/5.3.1-py37/lib/python3.7/shutil.py", line 257, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/apps/cent7/anaconda/5.3.1-py37/lib/python3.7/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz'
Traceback (most recent call last):
  File "/apps/cent7/anaconda/5.3.1-py37/lib/python3.7/shutil.py", line 557, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz' -> '/home/acastele/.cache/dijitso/src/ffc_element_ee3c68ce6482b04838050db8ba0e96b7572c5935.cpp.gz.old.priv.221521007679093248207297234337480157215'
The behavior depends on the number of processors: the more processors I use, the more likely the error becomes.
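One way to test whether the shared cache really is the cause is to give each process a private cache directory and see whether the "Moving new file over differing existing file" messages disappear. The sketch below is only that, a diagnostic sketch: it assumes that dijitso honours the DIJITSO_CACHE_DIR environment variable, and that the MPI launcher exports the process rank in one of the listed variables (which one is set depends on the launcher). The helper names detect_rank and use_private_cache are hypothetical and are not part of FEniCS.

```python
import os

# Environment variables in which common launchers export the process rank.
# Assumption: at least one of these is set on the cluster in question.
RANK_VARIABLES = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "SLURM_PROCID")


def detect_rank(environ):
    """Return the launcher-provided rank, or 0 if no known variable is set."""
    for name in RANK_VARIABLES:
        value = environ.get(name)
        if value is not None:
            return int(value)
    return 0


def use_private_cache(base_dir, environ=os.environ):
    """Create a rank-specific directory and point DIJITSO_CACHE_DIR at it.

    Assumption: dijitso reads DIJITSO_CACHE_DIR when it is first imported,
    so this has to run before "from dolfin import *".
    """
    rank = detect_rank(environ)
    cache_dir = os.path.join(base_dir, "dijitso-rank-{}".format(rank))
    os.makedirs(cache_dir, exist_ok=True)
    environ["DIJITSO_CACHE_DIR"] = cache_dir
    return cache_dir
```

It would be called at the very top of the script, for example use_private_cache(os.path.expanduser("~/.cache")), before dolfin is imported. The cost is that every process then compiles its own copy of each generated module, so this is better suited to confirming the diagnosis than to production runs.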
Below is one of the codes that works on my personal computer but fails when run in parallel on the cluster.
from dolfin import *
import numpy as np
# Define numerical simulation specific parameters
# Define geometrical parameters
Lx = 1.000e+00
Ly = 1.000e+00
# Define elasticity theory parameters
lame_lambda = 1.2115e+05
lame_mu = 8.0770e+04
# Define fracture model parameters
Gc = 2.700e+00
# Define movement parameters
umax = 2.500e-02
# Define temporal parameters
time_ini_t = 0.000e+00
time_fin_t = 1.000e+00
time_dt = 1.000e-04
num_steps = 10000
# Create mesh
# Define empty mesh
mesh = Mesh()
# Define mesh editor
editor = MeshEditor()
# Open mesh editor
editor.open(mesh, "quadrilateral", 2, 2)
# Set numerical simulation discretization options
Nx = 500
Ny = 500
p = 2
# Define number of vertices
editor.init_vertices( ( Nx + 1 ) * ( Ny + 1 ) )
# Define number of cells
editor.init_cells( Nx * Ny )
# Define list of vertices for mesh
for j in range( Ny + 1 ):
    for i in range( Nx + 1 ):
        vertex_index = ( ( ( Nx + 1 ) * ( j ) ) + ( i ) )
        vertex_x = ( ( i ) * ( Lx / Nx ) )
        vertex_y = ( ( j ) * ( Ly / Ny ) )
        editor.add_vertex( vertex_index, [ vertex_x, vertex_y ] )
# Define list of cells for mesh
for j in range( Ny ):
    for i in range( Nx ):
        cell_index = ( ( Nx * j ) + ( i ) )
        cell_vertex_1 = ( ( ( Nx + 1 ) * ( j ) ) + ( i ) )
        cell_vertex_2 = ( ( ( Nx + 1 ) * ( j ) ) + ( i + 1 ) )
        cell_vertex_3 = ( ( ( Nx + 1 ) * ( j + 1 ) ) + ( i ) )
        cell_vertex_4 = ( ( ( Nx + 1 ) * ( j + 1 ) ) + ( i + 1 ) )
        editor.add_cell( cell_index, [ cell_vertex_1, cell_vertex_2, cell_vertex_3, cell_vertex_4 ] )
# Close mesh editor
editor.close()
# Define phase field parameters
l = ( 2 * mesh.hmin() )
# Class for interfacing with the Newton solver
class Displacements_Equation(NonlinearProblem):
    def __init__(self, L, a, bc):
        NonlinearProblem.__init__(self)
        self.L = L
        self.a = a
        self.bc = bc
    def F(self, b, x):
        assemble(self.L, tensor=b)
        self.bc[0].apply(b, x)
        self.bc[1].apply(b, x)
    def J(self, A, x):
        assemble(self.a, tensor=A)
        self.bc[0].apply(A)
        self.bc[1].apply(A)
class PhaseField_Equation(NonlinearProblem):
    def __init__(self, L, a, bc):
        NonlinearProblem.__init__(self)
        self.L = L
        self.a = a
        self.bc = bc
    def F(self, b, x):
        assemble(self.L, tensor=b)
        self.bc.apply(b, x)
    def J(self, A, x):
        assemble(self.a, tensor=A)
        self.bc.apply(A)
# Define function spaces
U = VectorElement("Lagrange", mesh.ufl_cell(), p)
M = FiniteElement("Lagrange", mesh.ufl_cell(), p)
P = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, U)
N = FunctionSpace(mesh, M)
Q = FunctionSpace(mesh, P)
# Define trial and test functions
du = TrialFunction(V)
dm = TrialFunction(N)
v = TestFunction(V)
n = TestFunction(N)
# Define functions for solutions
u = Function(V)
m = Function(N)
p = Function(Q)
unew = Function(V)
mold = Function(N)
H = Function(Q)
Hold = Function(Q)
# Define boundary conditions
# Define values for boundary conditions
Value_Fixed = Constant((0.0, 0.0))
Value_Movement = Expression(("umax*t", "0.0"), degree = 1, umax = umax, t = 0.0)
Value_PhaseField = Constant(1.0)
# Define boundaries for boundary conditions
class Boundaries_Bottom(SubDomain):
    def inside(self, x, on_boundary):
        return abs( x[1] ) < 1.0e-10 and on_boundary
class Boundaries_Top(SubDomain):
    def inside(self, x, on_boundary):
        return abs( x[1] - Ly ) < 1.0e-10 and on_boundary
class Boundaries_PhaseField(SubDomain):
    def inside(self, x, on_boundary):
        return x[0] <= Lx / 2. and abs( x[1] - Ly / 2. ) < 2.5e-03
# Define boundary conditions
BC_Fixed_Bottom = DirichletBC(V, Value_Fixed, Boundaries_Bottom())
BC_Movement_Top = DirichletBC(V, Value_Movement, Boundaries_Top())
BC_Displacements = [BC_Fixed_Bottom, BC_Movement_Top]
BC_PhaseField = DirichletBC(N, Value_PhaseField, Boundaries_PhaseField())
# Define expressions used in variational forms
def Psi0(u):
    return ( ( lame_lambda / 2. ) * ( ( grad(u)[0, 0] + grad(u)[1, 1] ) ** ( 2. ) ) ) \
        + ( ( lame_mu ) * ( ( ( grad(u)[0, 0] ) ** ( 2. ) ) + ( ( grad(u)[1, 1] ) ** ( 2. ) ) + ( grad(u)[0, 1] * grad(u)[1, 0] ) ) ) \
        + ( ( lame_mu / 2. ) * ( ( ( grad(u)[0, 1] ) ** ( 2. ) ) + ( ( grad(u)[1, 0] ) ** ( 2. ) ) ) )
# Define variational problem for time step
Equation_Displacements = ( ( ( ( 1. - mold ) ** ( 2 ) ) * ( inner( ( ( ( lame_lambda ) * ( div(u) ) * ( Identity(2) ) ) + ( ( 2. ) * ( lame_mu ) * ( ( 1. / 2. ) * ( ( grad(u) ) + ( grad(u).T ) ) ) ) ), ( ( 1. / 2. ) * ( ( grad(v) ) + ( grad(v).T ) ) ) ) ) ) * ( dx ) )
Equation_PhaseField_1 = ( ( ( ( l ) ** ( 2 ) ) * ( inner( ( grad(m) ), ( grad(n) ) ) ) ) * ( dx ) )
Equation_PhaseField_2 = ( ( ( ( 1. ) + ( ( ( 2. * l ) / ( Gc ) ) * ( H ) ) ) * ( inner( ( m ), ( n ) ) ) ) * ( dx ) )
Equation_PhaseField_3 = ( ( ( ( ( 2. * l ) / ( Gc ) ) * ( H ) ) * ( n ) ) * ( dx ) )
Equation_PhaseField = ( Equation_PhaseField_1 + Equation_PhaseField_2 - Equation_PhaseField_3 )
# Compute directional derivative (Jacobian)
Jacobian_Displacements = derivative(Equation_Displacements, u, du)
Jacobian_PhaseField = derivative(Equation_PhaseField, m, dm)
# Create nonlinear problem
Problem_Displacements = Displacements_Equation(Equation_Displacements, Jacobian_Displacements, BC_Displacements)
Problem_PhaseField = PhaseField_Equation(Equation_PhaseField, Jacobian_PhaseField, BC_PhaseField)
# Create Newton solver
Displacements_solver = NewtonSolver()
Displacements_solver.parameters["absolute_tolerance"] = 1.000e-50
Displacements_solver.parameters["convergence_criterion"] = "residual"
Displacements_solver.parameters["error_on_nonconvergence"] = True
Displacements_solver.parameters["linear_solver"] = "cg"
Displacements_solver.parameters["maximum_iterations"] = 100
Displacements_solver.parameters["preconditioner"] = "hypre_euclid"
Displacements_solver.parameters["relative_tolerance"] = 1.000e-05
Displacements_solver.parameters["report"] = True
Displacements_solver.parameters["krylov_solver"]["absolute_tolerance"] = 1.000e-50
Displacements_solver.parameters["krylov_solver"]["error_on_nonconvergence"] = True
Displacements_solver.parameters["krylov_solver"]["maximum_iterations"] = 50000
Displacements_solver.parameters["krylov_solver"]["monitor_convergence"] = False
Displacements_solver.parameters["krylov_solver"]["relative_tolerance"] = 1.000e-05
Displacements_solver.parameters["krylov_solver"]["report"] = True
Displacements_solver.parameters["lu_solver"]["report"] = True
Displacements_solver.parameters["lu_solver"]["symmetric"] = True
Displacements_solver.parameters["lu_solver"]["verbose"] = True
info( Displacements_solver.parameters, True )
PhaseField_solver = NewtonSolver()
PhaseField_solver.parameters["absolute_tolerance"] = 1.000e-50
PhaseField_solver.parameters["convergence_criterion"] = "residual"
PhaseField_solver.parameters["error_on_nonconvergence"] = True
PhaseField_solver.parameters["linear_solver"] = "cg"
PhaseField_solver.parameters["maximum_iterations"] = 100
PhaseField_solver.parameters["preconditioner"] = "hypre_euclid"
PhaseField_solver.parameters["relative_tolerance"] = 1.000e-05
PhaseField_solver.parameters["report"] = True
PhaseField_solver.parameters["krylov_solver"]["absolute_tolerance"] = 1.000e-50
PhaseField_solver.parameters["krylov_solver"]["error_on_nonconvergence"] = True
PhaseField_solver.parameters["krylov_solver"]["maximum_iterations"] = 50000
PhaseField_solver.parameters["krylov_solver"]["monitor_convergence"] = False
PhaseField_solver.parameters["krylov_solver"]["relative_tolerance"] = 1.000e-05
PhaseField_solver.parameters["krylov_solver"]["report"] = True
PhaseField_solver.parameters["lu_solver"]["report"] = True
PhaseField_solver.parameters["lu_solver"]["symmetric"] = True
PhaseField_solver.parameters["lu_solver"]["verbose"] = True
info( PhaseField_solver.parameters, True )
# Set form compiler options
parameters["form_compiler"]["cpp_optimize"] = True
parameters["form_compiler"]["optimize"] = True
info( parameters, True )
# Define time-stepping
for i in range(num_steps):
    # Update current time
    time_t = ( ( time_ini_t ) + ( i * time_dt ) )
    # Update current time for boundary conditions
    Value_Movement.t = time_t
    # Solve variational problem for time step (step 1)
    print(" \n Solving Displacements Equation \n ")
    Displacements_solver.solve(Problem_Displacements, u.vector())
    unew.assign(u)
    # Update maximum strain energy
    Hn = project(Psi0(unew), Q)
    zz = np.maximum(Hn.vector().get_local(), Hold.vector().get_local())
    p.vector().set_local(zz)
    # Finalize the local write so the vector is consistent across processes
    p.vector().apply("insert")
    assign(H, p)
    # Solve variational problem for time step (step 2)
    print(" \n Solving PhaseField Equation \n ")
    PhaseField_solver.solve(Problem_PhaseField, m.vector())
    mold.assign(m)
    # Update historical maximum strain energy
    Hn = project(Psi0(unew), Q)
    zz = np.maximum(Hn.vector().get_local(), Hold.vector().get_local())
    p.vector().set_local(zz)
    # Finalize the local write so the vector is consistent across processes
    p.vector().apply("insert")
    assign(Hold, p)
    # Save solution to file in VTK format
    if ( ( i % 400 ) == ( 0 ) ):
        print( " \n\n Simulation Time: ", time_t, " \n\n " )
        u.rename( "Displacements", "Displacements" )
        m.rename( "PhaseField", "PhaseField" )
        vtkfile_Displacements = File( "/scratch/brown/acastele/simulations/Elasticity2D-FractureModel-Displacements-" + str(i) + ".pvd" )
        vtkfile_Displacements << ( u, time_t )
        vtkfile_PhaseField = File( "/scratch/brown/acastele/simulations/Elasticity2D-FractureModel-PhaseField-" + str(i) + ".pvd" )
        vtkfile_PhaseField << ( m, time_t )
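For readers unfamiliar with the two "Update ... strain energy" steps in the time loop: they keep, for each degree of freedom, the largest value of Psi0 reached so far, so that H never decreases from one step to the next. The following is a minimal NumPy-only sketch of that update, independent of FEniCS. The function name update_history and the numbers are made up for illustration and are not taken from the simulation.

```python
import numpy as np


def update_history(psi_new, history_old):
    """Return the element-wise maximum of the new energy and the stored history."""
    return np.maximum(psi_new, history_old)


# Hypothetical nodal energy values over three load steps (illustrative only).
load_steps = (
    np.array([1.0, 0.5, 0.0, 2.0]),
    np.array([0.8, 0.9, 0.1, 1.5]),
    np.array([0.2, 1.2, 0.0, 0.0]),
)

history = np.zeros(4)
for psi in load_steps:
    history = update_history(psi, history)

print(history)
```

Each entry of history ends up holding the largest value that entry took in any of the load steps, which is the role H plays in the phase-field equation above.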